Giter VIP home page Giter VIP logo

fabiofurlano / autocoder Goto Github PK

View Code? Open in Web Editor NEW

This project forked from bin123apple/autocoder

0.0 0.0 0.0 26.44 MB

We introduced a new model designed for the Code generation task. Its test accuracy on the HumanEval base dataset surpasses that of GPT-4 Turbo (April 2024) and GPT-4o.

Home Page: https://arxiv.org/abs/2405.14906

License: Apache License 2.0

Shell 1.52% C++ 5.96% Python 83.55% Java 4.39% Fortran 0.14% Rust 3.28% Makefile 0.32% CMake 0.84%

autocoder's Introduction

AutoCoder

Introduction

We introduced a new model designed for the Code generation task. Its test accuracy on the HumanEval base dataset surpasses that of GPT-4 Turbo (April 2024). (90.9% vs 90.2%).

Additionally, compared to previous open-source models, AutoCoder offers a new feature: it can automatically install the required packages and attempt to run the code until it deems there are no issues, whenever the user wishes to execute the code.

  • Difference between the code interpreter of AutoCoder and the GPT-4 Turbo:

Below are the video demos for the code interpreter comparision between GPT-4 Turbo and AutoCoder:

GPT-4o can not access the external library.

AutoCoder-gpt4o.mp4

AutoCoder can automatically install the required packages. This feature expands the scope of code interpreter's application.

AutoCoder_demo.mp4
  • Difference between the code interpreter of AutoCoder and the current open-source code interpreter OpenCodeInterpreter:

The code interpreter of AutoCoder, like GPT-4 Turbo, is only called when the user has a need to verify the code, while OpenCodeInterpreter runs all generated python code.

Model

The Model is avaliable on Huggingface: AutoCoder (33B) AutoCoder-S (6.7B)

The base model is deepseeker-coder.

Quick Start

  1. Create the conda env
conda create -n AutoCoder python=3.11
conda activate AutoCoder
pip install -r requirements.txt
  1. Test on HumanEval 90.9% on base, 78.0% on base + extra.
cd Evaluation
python test_humaneval.py

You will receive a file named AutoCoder_HumanEval+.jsonl, which follows the EvalPlus format, after this step.

Then follow the testing framework of the EvalPlus GitHub. You will see the results.

NOTE:

  • Don't forget to use evalplus's evalplus.sanitize to post-process the code.
  • If you don't use the greedy method (for example set the do_sample=True) for the code generation. You will probably see the different results.
  1. Test on MBPP 82.5% on base, 70.6% on base + extra.
python test_humaneval.py

Post-process to delete the nature language for testing

python postprocess_mbpp.py

Your will get a AutoCoder_Mbpp+-sanitized.jsonl file after this step, it extracted all the code blocks. Then, directly test it by using EvalPlus GitHub (You don't need to use to use evalplus's evalplus.sanitize to post-process the code this time).

  1. Test on DS-1000.
python test_ds1000.py

Your will get a jsonl file after this step, it extracted all the code blocks. Then, directly test it by using DS-1000 GitHub.

  1. Web demo (Include code interpreter)

Install gradio related pakcages

cd /Web_demo
pip install -r requirements.txt

Run it:

python chatbot.py

NOTE:

  • Currently the model will only start the code interpreter if you ask it to verify its code. I am still finetuning it on a instructed dataset, which will give it the ability to enable the code interpreter upon a user request to run code. I will update the model when it is finished.

  • We suggest to set do_sample = True (default setting here) while using the code interpreter.

Contact

If you have any inquiries, please feel free to raise an issue or reach out to [email protected].

Citation

@misc{lei2024autocoder,
      title={AutoCoder: Enhancing Code Large Language Model with \textsc{AIEV-Instruct}}, 
      author={Bin Lei and Yuchen Li and Qiuwu Chen},
      year={2024},
      eprint={2405.14906},
      archivePrefix={arXiv},
      primaryClass={cs.SE}
}

Acknowledgments

Thanks to Tianyu Zheng, the first author of the OpenCodeInterpreter, for guidance on some technical details.

autocoder's People

Contributors

bin123apple avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.