GitHub2File

GitHub2File is a tool to download and process files from a GitHub repository, extracting and combining source code or other relevant files into a single output file.

Features

  • Download and process files from a GitHub repository.
  • Process local zip files and directories.
  • Filter files by programming language.
  • Include or exclude specific directories and files.
  • Convert IPython notebooks to Python scripts.
  • Remove comments and docstrings from Python files.
  • Optionally copy the output to the clipboard (macOS only).
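As an illustration of the comment and docstring removal listed above, here is a minimal sketch built on the standard `ast` module (Python 3.9 or later for `ast.unparse`). The function name `strip_comments_and_docstrings` is hypothetical, and this is not necessarily how GitHub2File implements the feature. Round-tripping through the AST discards comments as a side effect, because they are never part of the tree.

```python
import ast


def strip_comments_and_docstrings(source: str) -> str:
    """Return source without comments or docstrings (hypothetical helper)."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        # Only modules, classes, and functions can carry a docstring.
        if not isinstance(
            node, (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)
        ):
            continue
        body = node.body
        if (
            body
            and isinstance(body[0], ast.Expr)
            and isinstance(body[0].value, ast.Constant)
            and isinstance(body[0].value.value, str)
        ):
            body.pop(0)
            # A class or function body must not be left empty.
            if not body and not isinstance(node, ast.Module):
                body.append(ast.Pass())
    # Comments are not stored in the AST, so unparsing drops them.
    return ast.unparse(tree)


example = 'def f(x):\n    """Doc."""\n    # comment\n    return x + 1\n'
print(strip_comments_and_docstrings(example))
```

Note that `ast.unparse` also normalizes formatting (quotes, blank lines, parentheses), so the output is equivalent code rather than a byte-for-byte copy of the input.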

Installation

To install the package, run the following from the root of a local copy of the repository:

pip install .

Supported Languages and File Types

GitHub2File supports the following languages and file types:

  • Python (.py)
  • PDF (.pdf)
  • IPython Notebooks (.ipynb)
  • Markdown (.md, .markdown, .mdx)
  • JavaScript (.js)
  • Go (.go)
  • HTML (.html)
  • Mojo (.mojo)
  • Java (.java)
  • Lua (.lua)
  • C (.c, .h)
  • C++ (.cpp, .h, .hpp)
  • C# (.cs)
  • Ruby (.rb)
  • MATLAB (.m)
  • Shell (.sh)
  • TOML (.toml)
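The mapping from language names to file extensions can be sketched as a simple lookup table. The snippet below covers only part of the list above. The names `LANGUAGE_EXTENSIONS` and `matches_languages` are hypothetical, and the keys `go`, `c`, and `cpp` are assumed spellings of the `--lang` values (only `python`, `markdown`, `pdf`, and `ipynb` appear in this README).

```python
from pathlib import Path

# Partial, illustrative table; keys other than python, markdown, pdf and
# ipynb are assumed names, not confirmed --lang values.
LANGUAGE_EXTENSIONS = {
    "python": {".py"},
    "pdf": {".pdf"},
    "ipynb": {".ipynb"},
    "markdown": {".md", ".markdown", ".mdx"},
    "go": {".go"},
    "c": {".c", ".h"},
    "cpp": {".cpp", ".h", ".hpp"},
}


def matches_languages(path: str, langs: str) -> bool:
    """Check whether path has an extension belonging to any listed language.

    langs is a comma-separated string such as "python,markdown".
    """
    wanted = set()
    for lang in langs.split(","):
        wanted |= LANGUAGE_EXTENSIONS.get(lang.strip().lower(), set())
    return Path(path).suffix.lower() in wanted


print(matches_languages("src/app.py", "python,markdown"))
print(matches_languages("main.go", "python"))
```

Because `.h` is shared by C and C++, a header file matches whichever of the two languages is requested.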

Usage

You can use GitHub2File in three ways: by downloading a repository directly from GitHub, by processing a local zip file, or by processing a local directory. The corresponding invocations are:

Download and Process a GitHub Repository

python -m github2file --repo_url <repository_url> [options]
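For orientation, GitHub serves a zip archive of any branch or tag at a predictable URL. The sketch below builds that URL from a repository URL and a ref. The helper name `archive_url` is hypothetical, and whether GitHub2File uses this exact URL form internally is an assumption.

```python
def archive_url(repo_url: str, ref: str = "master") -> str:
    """Build the zip download URL for a branch or tag (hypothetical helper)."""
    base = repo_url.rstrip("/")
    # Accept clone-style URLs that end in ".git".
    if base.endswith(".git"):
        base = base[: -len(".git")]
    return f"{base}/archive/{ref}.zip"


print(archive_url("https://github.com/user/repo"))
```

Many repositories now use `main` rather than `master` as the default branch, so the ref usually needs to be passed explicitly for them.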

Process a Local Zip File

python -m github2file --zip_file <path_to_zip_file> [options]
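Conceptually, processing a zip file means reading the matching entries and concatenating them into one text. The following is a minimal sketch using only the standard library. The function name `combine_zip` and the `# File:` header format are assumptions made for illustration, not the tool's actual output format.

```python
import io
import zipfile


def combine_zip(zip_bytes: bytes, extensions=(".py",)) -> str:
    """Concatenate matching files from a zip archive (hypothetical helper)."""
    parts = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in sorted(archive.namelist()):
            # Skip directory entries and files with other extensions.
            if name.endswith("/") or not name.lower().endswith(tuple(extensions)):
                continue
            text = archive.read(name).decode("utf-8", errors="replace")
            parts.append(f"# File: {name}\n{text}")
    return "\n\n".join(parts)


# Build a small archive in memory to demonstrate the helper.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("repo/a.py", "print('a')\n")
    demo.writestr("repo/notes.txt", "not included")
print(combine_zip(buffer.getvalue()))
```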

Process a Local Directory

python -m github2file --folder <path_to_folder> [options]

Options

  • --repo_url: The URL of the GitHub repository to download.
  • --zip_file: Path to the local zip file.
  • --folder: Path to the local folder.
  • --lang: The programming language(s) and format(s) to extract from the repository (comma-separated). Default is python. Accepted values include python, markdown, pdf, and ipynb.
  • --keep-comments: Keep comments and docstrings in the source code (only applicable for Python).
  • --branch_or_tag: The branch or tag of the repository to download. Default is master.
  • --ipynb_nbconvert: Convert IPython Notebook files to Python script files using nbconvert. Default is True.
  • --pbcopy: Copy the output to the clipboard (macOS only). Default is False.
  • --debug: Enable debug logging.
  • --include: Comma-separated list of subfolders/patterns to focus on.
  • --exclude: Comma-separated list of file patterns to exclude.
  • --excluded_dirs: Comma-separated list of directories to exclude. Default is docs,examples,tests,test,scripts,utils,benchmarks.
  • --name_append: Append this string to the output file name.

Example Usage

Download and Process a GitHub Repository

python -m github2file --repo_url https://github.com/user/repo --lang python,markdown,pdf --pbcopy --excluded_dirs env

or, with the repository URL given as a positional argument:

python -m github2file --lang python,markdown --pbcopy --excluded_dirs env https://github.com/user/repo

Process a Local Zip File

python -m github2file --zip_file /path/to/repo.zip --lang python,pdf --include src,lib --exclude test --keep-comments
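To make the interplay of `--include`, `--exclude`, and `--excluded_dirs` concrete, here is one plausible reading of the filters as a small Python predicate. The function `should_process` is hypothetical and the exact matching rules are assumptions: `--excluded_dirs` and `--include` are compared against directory names in the path, and `--exclude` is matched as a substring pattern against the file name. The real tool may match differently.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def should_process(path, include=(), exclude=(), excluded_dirs=()):
    """Decide whether a file passes the filters (assumed semantics)."""
    parts = PurePosixPath(path).parts
    directories, filename = parts[:-1], parts[-1]
    # Drop anything that lives under an excluded directory.
    if any(d in excluded_dirs for d in directories):
        return False
    # Drop files whose name contains an excluded pattern.
    if any(fnmatch(filename, f"*{pattern}*") for pattern in exclude):
        return False
    # When include is given, require one of its folders in the path.
    if include and not any(folder in directories for folder in include):
        return False
    return True


print(should_process("repo/src/app.py", include=("src", "lib"), exclude=("test",)))
print(should_process("repo/src/test_app.py", include=("src", "lib"), exclude=("test",)))
```

Under these assumed rules, exclusion takes precedence over inclusion, so a file under `src` is still skipped if its name matches an excluded pattern.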

Process a Local Directory

python -m github2file --folder /path/to/repo --lang python,pdf --excluded_dirs env,docs

Advanced Usage

You can combine multiple options to fine-tune the processing:

python -m github2file --folder /path/to/repo --lang python,pdf --keep-comments --include src,lib --name_append processed --debug --pbcopy
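The `--pbcopy` option is macOS-only because `pbcopy` is a macOS command-line utility. A minimal sketch of how text can be piped to it from Python follows. The helper `copy_to_clipboard` is hypothetical, and falling back to a `False` return on other platforms is a design choice of this sketch, not documented behavior of the tool.

```python
import shutil
import subprocess


def copy_to_clipboard(text: str) -> bool:
    """Copy text to the clipboard via pbcopy; return False if unavailable."""
    # pbcopy ships with macOS; on other systems it is normally absent.
    if shutil.which("pbcopy") is None:
        return False
    subprocess.run(["pbcopy"], input=text.encode("utf-8"), check=True)
    return True


if not copy_to_clipboard("combined output"):
    print("pbcopy not found; clipboard copy skipped")
```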

Contributing

If you want to contribute to this project, please fork the repository and create a pull request.

Contributors

synapticsage, ehartford, twilwa, nkkko, ohadrubin, oslook, imagineer99, raghav3095, georgeantonopoulos
