中文 | Background | Features | Quick Start | Future Work | Supported Language | Citation

🤗 Introduction

RepoAgent is an open-source project driven by Large Language Models (LLMs) that aims to provide an intelligent way to document projects. It is designed to be a handy tool for developers who need to organize their code and cooperate with teammates.

👾 Background

In computer programming, comprehensive project documentation, including a detailed explanation of each Python file, is essential. It is the foundation for understanding, maintaining, and enhancing the codebase: it supplies the context and rationale behind the code, so that current and future developers can grasp the purpose, functionality, and structure of the software. It also keeps the project accessible and modifiable over time and significantly eases the learning curve for new team members.

Traditionally, creating and maintaining software documentation demanded significant human effort and expertise, a challenge for small teams without dedicated personnel. The introduction of Large Language Models (LLMs) like GPT has transformed this, enabling AI to handle much of the documentation process. This shift allows human developers to focus on verification and fine-tuning, greatly reducing the manual burden of documentation.

๐Ÿ† Our goal is to create an intelligent document assistant that helps people read and understand repositories and generate documents, ultimately helping people improve efficiency and save time.

🪭 Features

  • 🤖 Automatically detects changes in Git repositories, tracking additions, deletions, and modifications of files.
  • 📝 Independently analyzes the code structure through AST, generating documents for individual objects.
  • 🔍 Accurately identifies inter-object invocation relationships, enriching the global perspective of document content.
  • 📚 Seamlessly replaces Markdown content based on changes, maintaining consistency in documentation.
  • 🕙 Executes multi-threaded concurrent operations, enhancing the efficiency of document generation.
  • 👭 Offers a sustainable, automated documentation update method for team collaboration.
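
The AST-based analysis and the detection of invocation relationships listed above can be illustrated with Python's standard `ast` module. This is a minimal sketch under stated assumptions, not RepoAgent's actual implementation: the helper name `summarize_objects` and the shape of the returned dictionary are hypothetical.

```python
import ast


def summarize_objects(source: str) -> dict:
    """Map each top-level function or class name to the names it calls.

    Hypothetical helper for illustration; not part of RepoAgent's API.
    """
    tree = ast.parse(source)
    summary = {}
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            calls = set()
            # Walk the object's subtree and collect simple call targets.
            for child in ast.walk(node):
                if isinstance(child, ast.Call):
                    if isinstance(child.func, ast.Name):
                        calls.add(child.func.id)    # plain call, e.g. helper()
                    elif isinstance(child.func, ast.Attribute):
                        calls.add(child.func.attr)  # method call, e.g. obj.method()
            summary[node.name] = sorted(calls)
    return summary


example = '''
def load(path):
    return open(path).read()

def main():
    text = load("a.txt")
    print(text.strip())
'''

print(summarize_objects(example))
# {'load': ['open', 'read'], 'main': ['load', 'print', 'strip']}
```

A real tool would also need to resolve which object a name refers to across files; this sketch only records the called names within a single source string.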

📦 Installation

First, ensure that Python 3.9 or higher is installed on your machine.

$ python --version
Python 3.11.4
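
The same requirement can also be checked from inside Python. A minimal sketch; the helper name `meets_minimum` is hypothetical and is not provided by RepoAgent.

```python
import sys

MINIMUM = (3, 9)  # minimum Python version stated above


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM) -> bool:
    """Return True if the given version is at least `minimum` (major, minor)."""
    return tuple(version_info[:2]) >= minimum


current = sys.version.split()[0]
if meets_minimum():
    print(f"Python {current} satisfies the requirement")
else:
    print(f"Python {current} is too old; 3.9 or higher is required")
```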

Next, clone the project, create a virtual environment, and install the dependencies in that environment.

git clone https://github.com/LOGIC-10/RepoAgent.git
cd RepoAgent
conda create -n RepoAgent python=3.11.4
conda activate RepoAgent
pip install -r requirements.txt

📖 Quick Start

Configuring RepoAgent

First, configure the OpenAI API parameters in the config.yml file. For details on obtaining them, please refer to OpenAI API.

Then set the remaining parameters in the same file, such as the target repository path and the documentation language:

api_keys:
  gpt-3.5-turbo-16k:
    - api_key: sk-XXXX
      base_url: https://example.com/v1/
      api_type: azure
      api_version: XXX
      engine: GPT-35-Turbo-16k
      # you can use any kwargs supported by openai.ChatCompletion here
    - api_key: sk-xxxxx
      organization: org-xxxxxx
      model: gpt-3.5-turbo-16k
  ...

default_completion_kwargs:
  model: gpt-4
  temperature: 0.2
  request_timeout: 60

repo_path: /path/to/your/repo
project_hierarchy: .project_hierarchy.json # Path of the JSON file that stores the global structure information.
Markdown_Docs_folder: Markdown_Docs # Folder in the root directory of the target repository where the documentation is stored.
ignore_list: ["ignore_file1.py", "ignore_file2.py", "ignore_directory"] # Relative paths of Python files or folders to exclude from documentation generation.

language: en # Two-letter language code (ISO 639-1), e.g. `language: en` for English. See Supported Language for the full list.
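
To make the role of `ignore_list` concrete, the sketch below shows one way relative paths could be matched against it. The matching rule (an exact file match, or any file under an ignored directory) and the helper name `is_ignored` are assumptions for illustration; RepoAgent's own matching logic may differ.

```python
from pathlib import PurePosixPath


def is_ignored(relative_path: str, ignore_list: list) -> bool:
    """Return True if the path equals an ignored entry or lies inside one.

    Hypothetical helper; paths are relative to the target repository root.
    """
    path = PurePosixPath(relative_path)
    for entry in ignore_list:
        ignored = PurePosixPath(entry)
        if path == ignored or ignored in path.parents:
            return True
    return False


ignore_list = ["ignore_file1.py", "ignore_file2.py", "ignore_directory"]

for candidate in ["ignore_file1.py", "ignore_directory/pkg/mod.py", "src/main.py"]:
    print(candidate, is_ignored(candidate, ignore_list))
```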

Run RepoAgent

Enter the root directory of RepoAgent and type the following command in the terminal:

python repo_agent/runner.py

If it's your first time generating documentation for the target repository, RepoAgent will automatically create a JSON file maintaining the global structure information and a folder named Markdown_Docs in the root directory of the target repository for storing documents.

The paths of the global structure information JSON file and of the documentation folder can be configured in config.yml.
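
The first-run behavior described above can be sketched as follows. This is an illustration only: deciding "first run" by the absence of the structure file is an assumption, and the helper names `is_first_run` and `initialize` are hypothetical. The default file and folder names are taken from the example config.yml.

```python
import json
import tempfile
from pathlib import Path


def is_first_run(repo_path: str, project_hierarchy: str = ".project_hierarchy.json") -> bool:
    """Treat the run as the first one if the structure JSON file does not exist yet."""
    return not (Path(repo_path) / project_hierarchy).exists()


def initialize(repo_path: str,
               project_hierarchy: str = ".project_hierarchy.json",
               docs_folder: str = "Markdown_Docs") -> None:
    """Create an empty structure file and the documentation folder."""
    root = Path(repo_path)
    (root / docs_folder).mkdir(parents=True, exist_ok=True)
    (root / project_hierarchy).write_text(json.dumps({}), encoding="utf-8")


# Demonstrate on a throwaway directory instead of a real repository.
with tempfile.TemporaryDirectory() as tmp:
    print(is_first_run(tmp))  # no structure file yet
    initialize(tmp)
    print(is_first_run(tmp))  # structure file now exists
```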

Once the global documentation for the target repository has been generated, or if the project you cloned already contains it, you can keep the internal project documentation up to date automatically with your team by configuring a pre-commit hook in the target repository.

Configuring the Target Repository

RepoAgent currently supports generating documentation for Python projects, which requires some configuration in the target repository.

First, ensure that the target repository is a Git repository and has been initialized.

git init

Install pre-commit in the target repository to detect changes in the git repository.

pip install pre-commit

Create a file named .pre-commit-config.yaml in the root directory of the target repository. An example is as follows:

repos:
  - repo: local
    hooks:
    - id: repo-agent
      name: RepoAgent
      entry: python path/to/your/repo_agent/runner.py
      language: system
      # You can specify the file types that trigger the hook, but currently only python is supported.
      types: [python]

For specific configuration methods of hooks, please refer to pre-commit. After configuring the yaml file, execute the following command to install the hook.

pre-commit install

In this way, each git commit triggers the RepoAgent hook, which automatically detects changes in the target repository and generates the corresponding documents.

Next, make some modifications to the target repository, such as adding a new file or modifying an existing one. Then follow the normal Git workflow: git add, git commit -m "your commit message", git push.

The RepoAgent hook triggers automatically at git commit, detects the files you added in the previous step, and generates the corresponding documents.
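
The change detection that runs at commit time can be approximated with `git diff --cached --name-status`, which lists staged files with a status letter. The sketch below is an assumption about how such detection could work, not RepoAgent's actual code; `parse_name_status` and `staged_changes` are hypothetical names.

```python
import subprocess


def parse_name_status(output: str) -> dict:
    """Group Python file paths from `git diff --name-status` output by change type."""
    changes = {"added": [], "modified": [], "deleted": []}
    labels = {"A": "added", "M": "modified", "D": "deleted"}
    for line in output.splitlines():
        if not line.strip():
            continue
        # Each line has the form "<status>\t<path>".
        status, _, path = line.partition("\t")
        label = labels.get(status[:1])
        # Keep only Python files, matching `types: [python]` in the hook config.
        if label and path.endswith(".py"):
            changes[label].append(path)
    return changes


def staged_changes(repo_path: str) -> dict:
    """Ask Git for the staged changes of the repository at `repo_path`."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-status"],
        cwd=repo_path, capture_output=True, text=True, check=True,
    )
    return parse_name_status(result.stdout)


# Parse a sample of Git output without needing a real repository.
sample = "A\tsrc/new_module.py\nM\tsrc/utils.py\nD\told/legacy.py\nM\tREADME.md\n"
print(parse_name_status(sample))
```

Renamed or copied files (status letters R and C) are skipped in this sketch for brevity.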

After execution, RepoAgent automatically modifies the staged files in the target repository and completes the commit. When the hook finishes, a green "Passed" is displayed, as shown in the figure below:

[Figure: Execution Result]

The generated documents are stored in the specified folder in the root directory of the target repository. A rendered example of a generated document is shown below:

[Figure: Documentation]

We utilized the default model gpt-3.5-turbo to generate documentation for the XAgent project, which comprises approximately 270,000 lines of code. You can view the results of this generation in the Markdown_Docs directory of the XAgent project on GitHub. For enhanced documentation quality, we suggest considering more advanced models like gpt-4 or gpt-4-1106-preview.

Finally, you can flexibly adjust the output format, template, and other aspects of the documents by customizing the prompt. We look forward to your exploration of a more scientific approach to Automated Technical Writing and to your contributions to the community.

✅ Future Work

  • Identify and maintain the parent-child hierarchy between objects
  • Implement Black commit
  • Bi-directional reference: construct a bi-directional reference topology
  • Chat with repo: chat with the repository by providing code and documentation at the same time
  • Automatically generate README.md by combining it with the global documentation
  • Multi-language support: support more programming languages such as Java, C, or C++
  • Local model support, such as Llama, chatGLM, and Qianwen
  • Automatically generate better visualizations, such as GitBook

🇺🇳 Supported Language

Set the target language with a two-letter language code (ISO 639-1). The supported languages are listed below.

Languages List
Flag Code Language
🇬🇧 en English
🇪🇸 es Spanish
🇫🇷 fr French
🇩🇪 de German
🇨🇳 zh Chinese
🇯🇵 ja Japanese
🇷🇺 ru Russian
🇮🇹 it Italian
🇰🇷 ko Korean
🇳🇱 nl Dutch
🇵🇹 pt Portuguese
🇸🇦 ar Arabic
🇹🇷 tr Turkish
🇸🇪 sv Swedish
🇩🇰 da Danish
🇫🇮 fi Finnish
🇳🇴 no Norwegian
🇵🇱 pl Polish
🇨🇿 cs Czech
🇭🇺 hu Hungarian
🇬🇷 el Greek
🇮🇱 he Hebrew
🇹🇭 th Thai
🇮🇳 hi Hindi
🇧🇩 bn Bengali

e.g., language: en for English.
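
A small lookup table makes it easy to validate the `language` setting before running. This is a sketch built from the list above; the names `SUPPORTED_LANGUAGES` and `language_name` are hypothetical and are not taken from RepoAgent's code.

```python
# Codes and names copied from the Languages List above.
SUPPORTED_LANGUAGES = {
    "en": "English", "es": "Spanish", "fr": "French", "de": "German",
    "zh": "Chinese", "ja": "Japanese", "ru": "Russian", "it": "Italian",
    "ko": "Korean", "nl": "Dutch", "pt": "Portuguese", "ar": "Arabic",
    "tr": "Turkish", "sv": "Swedish", "da": "Danish", "fi": "Finnish",
    "no": "Norwegian", "pl": "Polish", "cs": "Czech", "hu": "Hungarian",
    "el": "Greek", "he": "Hebrew", "th": "Thai", "hi": "Hindi",
    "bn": "Bengali",
}


def language_name(code: str) -> str:
    """Return the language name for a supported ISO 639-1 code."""
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language code: {code!r}")
    return SUPPORTED_LANGUAGES[normalized]


print(language_name("en"))  # English
```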

📊 Citation

@misc{RepoAgent,
  author = {Qinyu Luo and Yining Ye and Shihao Liang and Arno},
  title = {RepoAgent: A LLM-based Intelligent tool for repository understanding and documentation writing},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/LOGIC-10/RepoAgent}},
}
