madaan / pie-perf
Training language models to make programs faster
Home Page: https://pie4perf.com
https://github.com/madaan/pie-perf/blob/main/README.md?plain=1#L18 says that details on the problems can be found at the path mentioned there, but I couldn't work out where to find them. Any pointers would be greatly appreciated, @madaan.
For the following problems with test cases in public_test_cases, the input.*.txt and output.*.txt files do not start at index 0:
p01875
p02069
p02067
p00754
p02068
p02072
p02871
p01895
p02074
p02224
p01660
p02197
p01516
p01589
p00000
p01779
p01581
p01588
p01969
p00685
p00683
p01590
p02064
p03978
p02857
p02226
p02076
p02071
p01664
p02227
p02070
p02077
p02592
p01584
p01515
p01972
p01585
p01582
p00696
p01918
This causes an error in the run_eval.py script, which expects the indices to start at 0.
One solution is to simply rename these files.
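For anyone hitting the same issue, here is a minimal sketch of such a renaming pass. It assumes a layout of `public_test_cases/<problem_id>/input.N.txt` and `output.N.txt` (the directory layout and the helper name `reindex_tests` are my own; adjust the glob pattern if the dataset is organized differently):

```python
from pathlib import Path

def reindex_tests(problem_dir):
    """Rename input.N.txt / output.N.txt files so indices start at 0.

    Assumes files named like input.3.txt, output.3.txt inside problem_dir.
    Files are sorted by their numeric index and renamed to 0, 1, 2, ...
    Since each new index is <= the old one and targets are assigned in
    increasing order, no rename overwrites a file that is still pending.
    """
    problem_dir = Path(problem_dir)
    for prefix in ("input", "output"):
        files = sorted(
            problem_dir.glob(f"{prefix}.*.txt"),
            key=lambda p: int(p.name.split(".")[1]),
        )
        for new_idx, path in enumerate(files):
            target = problem_dir / f"{prefix}.{new_idx}.txt"
            if path != target:
                path.rename(target)
```

Running this once per affected problem directory (e.g. `reindex_tests("public_test_cases/p01875")`) should make the indices compatible with what run_eval.py expects.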
Thank you for sharing the dataset and code!
I couldn't find the pretrained model trained with this data. Could you provide the CodeLlama 13B w/ FineTune checkpoint used for the results in Table 1 of the paper?
Also, is it possible to provide the synthetic data generated using GPT?
Thank you.
@madaan
Thanks for the evaluation script for the CodeNet dataset!
I am trying to evaluate some predictions on the dataset with the command:
python3 src/codenet_eval/run_eval.py --eval_config eval_files/example_eval_config.yaml
In the output report file, all input_* and generated_answers_* columns are either null or 0. I tried to submit the file (generated by setting the temp_dir option of run_eval.py) to the AtCoder website, and it compiled and ran successfully.
As an attachment I have uploaded one line of the jsonl file and the yaml config file. Thanks if you can take a look at this.
I checked the whole paper but couldn't find any information about how you post-process the generated code.
Is there any post-processing step?
Thanks for any reply.
Hello. Thank you so much for the great research!
I had a question while reading the paper, so I'm leaving it here.
I read Section 5, "Analysis of Generated Code Edits", with great interest, and was wondering whether more detailed information is available for the subsection "Comparing CODEX and CODEGEN", since it only reports the number of code edits each model generated (i.e., optimized successfully). For example, I would like to see the list of problems that CODEX successfully optimized, along with the detailed %OPT/SpeedUp/%RTR for each problem.
Again, thank you so much for this great work.