Self-infilling Code Generation

This repository contains the official implementation of our ICML 2024 paper Self-Infilling Code Generation. Self-infilling offers an alternative approach to code generation: when the model becomes uncertain, decoding is interrupted and a definitive suffix is crafted first. The suffix is initiated with a suffix prompt that scaffolds the overall generation, and the language model then works backward to complete the code by filling in the middle. Self-infilling also enables a looped decoding procedure, which iteratively updates and synchronizes each piece of the generation in a cyclic manner.
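As a rough illustration of this decoding order, here is a toy sketch with hypothetical helper names (a stub stands in for the model; this is not our actual implementation, which lives in this repository):

```python
# Toy sketch of self-infilling decoding, NOT the official implementation.
# ToyModel and its methods are hypothetical stand-ins for real model calls.

class ToyModel:
    """Stand-in for a language model with infilling support."""

    def generate_until_uncertain(self, prompt):
        # Real decoding would stop once the model's confidence drops.
        return prompt + "x = load_data()\n"

    def generate_suffix(self, suffix_prompt):
        # The suffix is seeded with a suffix prompt to scaffold the ending.
        return suffix_prompt + "(x):\n    return x\n"

    def infill_middle(self, prefix, suffix):
        # The model works backward, conditioning on both prefix and suffix.
        return "y = process(x)\n"

def self_infill(model, prompt, suffix_prompt):
    prefix = model.generate_until_uncertain(prompt)  # 1. decode until uncertain
    suffix = model.generate_suffix(suffix_prompt)    # 2. craft a definitive suffix first
    middle = model.infill_middle(prefix, suffix)     # 3. fill in the middle
    return prefix + middle + suffix                  # final left-to-right program

code = self_infill(ToyModel(), "# demo\n", "def finish")
```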

Setup

Our codebase builds upon bigcode-evaluation-harness. Use the following steps to set up the codebase:

  1. Create and activate a virtual Python environment:

    python3 -m venv envs/si_gen
    source envs/si_gen/bin/activate
  2. Clone the repository and install dependencies:

    git clone https://github.com/LZhengisme/self-infilling.git
    cd self-infilling
    
    # or use a different PyTorch/CUDA version.
    pip install torch==2.1.0+cu121 --index-url https://download.pytorch.org/whl/cu121
    
    pip install -e .
  3. Configure accelerate for multi-GPU code generation:

    accelerate config
  4. For DS-1000 tasks, which require specific versions of Python packages, we recommend using a separate virtual environment for evaluation:

    python3 -m venv envs/si_eval
    source envs/si_eval/bin/activate
    
    cd self-infilling
    
    # torch==1.12.1 is required for **EVALUATING** DS-1000; pick a build with the appropriate GPU support, e.g.,
    pip install torch==1.12.1
    # additional options are required for the DS-1000 benchmark
    pip install -e ".[ds1000]"

Usage

Demo 🚀

Use example.py to try out our demo of self-infilling generation. We implement the core functionality of self-infilling as a Hugging Face LogitsProcessor and StoppingCriteria, making it straightforward to invoke self-infilling via the vanilla HF generation interface with a few extra lines of configuration:

from transformers import LogitsProcessorList, StoppingCriteriaList
# SelfInfillingLogitsProcessor and SelfInfillEndOfFunctionCriteria are defined in
# lm_eval/generation_pipelines/self_infill_utils.py in this repository

# load your model and tokenizer, etc.
...

gen_kwargs = {
    # common generation arguments
    "do_sample": True,
    "temperature": 0.2,
    "top_p": 0.95,
    "max_new_tokens": 128,

    # self-infilling configuration
    "stopping_criteria": StoppingCriteriaList([
        SelfInfillEndOfFunctionCriteria(
          0, stop_words, tokenizer
        )
    ]),
    "logits_processor": LogitsProcessorList([
        SelfInfillingLogitsProcessor(
            0, stop_words, tokenizer,
            tau=args.self_infill_tau,
            suffix_prompt=suffix_prompt
        )
    ])
}

# use HF's generation interface
si_generated_tokens = model.generate(
    input_ids=input_ids,
    attention_mask=attention_mask,
    **gen_kwargs,
)

...
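The tau value passed above (tau=args.self_infill_tau) controls when decoding is interrupted. A minimal sketch of such a confidence test, under our simplified reading (the actual logic lives in lm_eval/generation_pipelines/self_infill_utils.py):

```python
# Simplified sketch: interrupt regular decoding when the model's top
# next-token probability falls below a threshold tau, at which point
# self-infilling switches to generating the suffix first.
import math

def should_interrupt(next_token_logits, tau=0.25):
    # softmax over a plain list of logits (max-subtracted for stability)
    m = max(next_token_logits)
    exps = [math.exp(l - m) for l in next_token_logits]
    top_prob = max(exps) / sum(exps)
    return top_prob < tau
```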

We provide an example input prompt and suffix prompt in example_prompt.txt and example_suffix_prompt.txt, respectively. Running the following command generates a completion via self-infilling that connects to the given suffix prompt:

python3 example.py --model codellama/CodeLlama-7b-hf --self-infill-tau 0.25 --prompt-file example_prompt.txt --suffix-prompt-file example_suffix_prompt.txt
Here <PRE>, <SUF>, and <MID> indicate the start of the prefix, the suffix, and the middle, respectively; <EOT> denotes the end of the infilled middle section.

######################################################
# Sample input:
<PRE>Train a logistic regression model, predict the labels on the test set and compute the accuracy score:

<code>
X_train, y_train, X_test, y_test = train_test_split(X, y, test_size=0.1)

# Sample suffix prompt:
def compute_accuracy

######################################################
# Sample output:
<SUF>def compute_accuracy(y_test, y_pred):
    return np.mean(y_test == y_pred)

accuracy = compute_accuracy(y_test, y_pred)
print(accuracy)
</code><MID>
logreg = LogisticRegression()
logreg.fit(X_train, y_train)
y_pred = logreg.predict(X_test)

<EOT>
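Given output in this order (suffix produced before middle), the final program is recovered by stitching the pieces back into left-to-right order. A minimal sketch, assuming the sentinel layout described above:

```python
# Reassemble a FIM-ordered generation into the final left-to-right program.
# The model emits: <PRE> prefix <SUF> suffix <MID> middle <EOT>,
# but the finished code reads prefix, then middle, then suffix.

def stitch(prefix: str, suffix: str, middle: str) -> str:
    return prefix + middle + suffix

demo = stitch(prefix="a = 1\n", suffix="print(a + b)\n", middle="b = 2\n")
```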
  • Check out example.py for details about processing self-infilling's inputs and outputs.

  • Our full implementation of self-infilling, including the looping mechanism introduced in our paper, can be found at lm_eval/generation_pipelines/self_infill_utils.py.

  • It is worth noting that (self-)infilling may occasionally fail to complete the procedure (e.g., it might produce a poor suffix or struggle to generate a coherent middle given the prefix and suffix). This issue is not uncommon in infill-capable language models; we recommend following prior practice and re-sampling the generation until a satisfactory result is obtained (e.g., see the example usage of InCoder).
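The re-sampling practice mentioned above can be sketched as a simple retry loop (illustrative only; generate_once and looks_complete are hypothetical stand-ins for a model call and a completeness check):

```python
# Illustrative retry loop for infilling failures: keep re-sampling until a
# candidate passes a completeness check, up to a fixed budget.
# `generate_once` and `looks_complete` are hypothetical stand-ins.

def sample_until_complete(generate_once, looks_complete, max_tries=5):
    candidate = None
    for _ in range(max_tries):
        candidate = generate_once()
        if looks_complete(candidate):
            return candidate
    return candidate  # give up and return the last sample

# toy usage: the third draw is the first one ending in <EOT>
draws = iter(["<EOT missing>", "<EOT missing>", "def f():\n    return 1\n<EOT>"])
result = sample_until_complete(lambda: next(draws), lambda c: c.endswith("<EOT>"))
```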

Task Evaluation

We evaluated self-infilling generation across several code benchmarks, including HumanEval, MBPP, DS-1000, and GSM8K. For detailed results and analyses, please refer to our paper.

Results

Run our entry script launch.sh to conduct the code generation tasks evaluated in our paper:

# [optional] set your HF user access token here if needed
# e.g., load private/gated models
# see https://huggingface.co/docs/hub/security-tokens
export HF_TOKEN="hf_..."

# [optional] activate your virtual environment here
# for **evaluating** DS-1000 tasks
source envs/si_eval/bin/activate

bash launch.sh -r <RUN_MODE> \
    -d <DATASET> \
    -b <BATCH_SIZE> \
    -m <MODEL> \
    -s <SAVE_DIR> \
    -g <GENERATE_MODE> \
    -p <NUM_PROCESSES> \
    -e True \
    --use_self_infill <ENABLE_SELF_INFILL_OR_NOT> \
    ...

We provide detailed instructions for each option in launch.sh. To reproduce our experimental results on different tasks, execute run.sh and specify the task name as the argument. For example:

# Feel free to explore and modify parameters in `run.sh` as needed.
bash run.sh humaneval
bash run.sh mbpp
bash run.sh ds1000-all-completion
bash run.sh pal-gsm8k

Citation

If you find our work helpful, please cite:

@article{zheng2023self,
  title={Self-Infilling Code Generation},
  author={Zheng, Lin and Yuan, Jianbo and Zhang, Zhi and Yang, Hongxia and Kong, Lingpeng},
  journal={arXiv preprint arXiv:2311.17972},
  year={2023}
}
