michiyasunaga / drrepair

[ICML 2020] DrRepair: Learning to Repair Programs from Error Messages

Home Page: https://arxiv.org/abs/2005.10636

License: MIT License

Shell 2.91% Python 97.09%

code-generation graph-neural-networks deep-learning pre-training program-repair

drrepair's Issues

about dataset

Hi, dear author @michiyasunaga, I successfully downloaded the datasets and tried to understand the contents of the JSON files. If I understand correctly, is the code in X['lines'] the fixed code?

I also found that each program (a single JSON file) has many groups of bug information, each of the form ["err_line", "err_msg", "mod_line", "mod_code"]. If my understanding is correct, starting from the working program in X['lines'], we can independently generate multiple buggy programs, one per group of ["err_line", "err_msg", "mod_line", "mod_code"]. Is that the correct usage?
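The usage described in the question could be sketched as below. This only illustrates that reading and is not confirmed by the source. The toy record is made up, and the `errors` key holding the groups is hypothetical. The sketch also assumes that `lines` is a list of code strings, that `mod_line` is a single 0-based index, and that `mod_code` is the replacement text for that line. The real JSON layout may differ.

```python
# Toy record for illustration only; the real dataset layout may differ.
program = {
    "lines": ["int main() {", "int x = 0;", "return x;", "}"],
    "errors": [  # hypothetical key holding the groups
        {"err_line": 1, "err_msg": "expected ';'",
         "mod_line": 1, "mod_code": "int x = 0"},
        {"err_line": 2, "err_msg": "'y' undeclared",
         "mod_line": 2, "mod_code": "return y;"},
    ],
}


def make_buggy_variants(record):
    """Yield (buggy_lines, err_line, err_msg), one tuple per error group."""
    for group in record["errors"]:
        buggy = list(record["lines"])              # copy the working program
        buggy[group["mod_line"]] = group["mod_code"]  # apply one corruption
        yield buggy, group["err_line"], group["err_msg"]


variants = list(make_buggy_variants(program))
```

Each variant is built from a fresh copy of the working program, so the groups stay independent of one another.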

Unable to evaluate model (on macOS)

DrRepair/evaluation/deepfix/out/code-compiler--2l-graph [master]
% for entry in ${test_split_root}/*
do
  probid=`basename $entry`
  python3 -u ../../test_deepfix.py \
  --input-code-dir ${program_data_root}/${probid}/erroneous \
  --repairer-server  http://0.0.0.0:8002/pred
done

Traceback (most recent call last):
  File "../../test_deepfix.py", line 331, in <module>
    main()
  File "../../test_deepfix.py", line 325, in main
    stitch()
  File "../../test_deepfix.py", line 290, in stitch
    stitch_helper(prog_fname)
  File "../../test_deepfix.py", line 231, in stitch_helper
    _code_str_tokenized = ' '.join(tokenize_code(_code, mod_brace=False))
DrRepair/utils/code_process.py", line 44, in tokenize_code
    clang.cindex.Config.set_library_path('/usr/local/Cellar/llvm/12.0.0/lib/')
  File "/Library/Python/3.8/lib/python/site-packages/clang/cindex.py", line 4107, in set_library_path
    raise Exception("library path must be set before before using " \
Exception: library path must be set before before using any other functionalities in libclang.
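The message above is raised by the Python bindings when `Config.set_library_path` is called after libclang has already been loaded. One possible workaround is to set the path only while the bindings still report themselves as not loaded. In the sketch below, `set_library_path_once` is a hypothetical helper and `StubConfig` is a stand-in that mimics the rule so the example runs without libclang installed. Whether this guard is the right fix for `tokenize_code` is an assumption.

```python
def set_library_path_once(config, path):
    """Set the libclang path only if the library has not been loaded yet."""
    if getattr(config, "loaded", False):
        return False  # too late: the library is already in use
    config.set_library_path(path)
    return True


class StubConfig:
    """Stand-in that mimics the 'set before use' rule (illustration only)."""
    loaded = False
    library_path = None

    @classmethod
    def set_library_path(cls, path):
        if cls.loaded:
            raise Exception("library path must be set before use")
        cls.library_path = path


first = set_library_path_once(StubConfig, "/path/to/llvm/lib")
StubConfig.loaded = True  # simulate the library being loaded on first use
second = set_library_path_once(StubConfig, "/path/to/llvm/lib")
```

With the real bindings, the same helper would be called with `clang.cindex.Config` as the first argument, before any other libclang functionality is used.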

list index out of bounds error

The line tok=ex.src_vocab[idx - len(self.vocab)] in DrRepair/model/repairer/data/err_dataset.py (line 181) throws a list index out of bounds error. How should I go about resolving this issue? I'm using the preprocessed models.
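The expression suggests that output indices at or above `len(self.vocab)` point into a per-example source vocabulary. A bounds-checked lookup like the one below can help show which index is out of range. It is a sketch under that assumption, not the repository's code: `index_to_token` and the `UNK` token are hypothetical names.

```python
UNK = "<unk>"  # hypothetical placeholder token


def index_to_token(idx, vocab, src_vocab):
    """Map an output index to a token.

    Indices below len(vocab) refer to the shared vocabulary; larger indices
    refer to the per-example source vocabulary, offset by len(vocab).
    """
    if 0 <= idx < len(vocab):
        return vocab[idx]
    offset = idx - len(vocab)
    if 0 <= offset < len(src_vocab):
        return src_vocab[offset]
    return UNK  # the index points past both vocabularies


vocab = ["<pad>", "int", "return"]
src_vocab = ["x", "y"]
tokens = [index_to_token(i, vocab, src_vocab) for i in (1, 3, 4, 5)]
```

Returning a placeholder hides the underlying mismatch, so logging the offending `idx` and the two vocabulary sizes is probably more useful while debugging.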

What is the word "Vanilla" or "Substitute"?

Hello, can you tell me what "Vanilla" and "Substitute" mean? I can see "SubstituteErrData" and "VanillaErrData". Why were these names chosen, and what is the difference between them? Thanks.

Abstract syntax tree (AST)

I need to add information from the AST to the program feedback graph. What is your opinion?

clang error

./2.run-gen-err-dataset--auto-corrupt--spoc.sh 

mkdir: cannot create directory ‘err-data-compiler--auto-corrupt--additional-codeforce--spoc-style’: File exists
joblib.externals.loky.process_executor._RemoteTraceback: 
"""
Traceback (most recent call last):

.local/lib/python3.8/site-packages/clang/cindex.py", line 4178, in get_cindex_library
    raise LibclangError(msg)
clang.cindex.LibclangError: libclang-11.so: cannot open shared object file: No such file or directory. To provide $
 path to libclang use Config.set_library_path() or Config.set_library_file().
"""

This fixed it for me

pip install clang
pip install libclang

RuntimeError: "index_select_out_cuda_impl" not implemented for 'Float'

Dear Dr. michiyasunaga

Hello, I get an error when I train on the dataset:

`TRAIN: 3%|▎ | 4993/150000 [15:45<7:43:02, 5.22it/s]
TRAIN @ 5000: (n=838, loss_localize=0.246899, loss_edit=6.836520, acc_localize=73.63%, acc_edit1=21.72%, acc_edit2=0.00%, acc_repair=0.00%, grad_norm=27.990015)
Saving model to checkpoint 5000
Loaded 2010 dev examples

DEV: 0it [00:00, ?it/s]
Traceback (most recent call last):
  File "/workspace/DrRepair/model/main_spoc.py", line 75, in <module>
    main()
  File "/workspace/DrRepair/model/main_spoc.py", line 65, in main
    experiment.train()
  File "/workspace/DrRepair/model/repairer/experiments.py", line 146, in train
    stats = self.process_batch(dev_batch, train=False, fout=fout)
  File "/workspace/DrRepair/model/repairer/experiments.py", line 224, in process_batch
    logit_edit1, label_edit1 = self.model.forward_edit(batch, all_enc_stuff, train_mode=False, beam_size=10) #follow the edit_lineno given
  File "/workspace/DrRepair/model/repairer/model/err_localize_edit.py", line 777, in forward_edit
    dec_output, padded_gold_code_line = self.forward_helper_decode(batch, packed_dec_input, src_vocabs, src_map, train_mode) #(max_seq_len, batch_size, vocab_size)
  File "/workspace/DrRepair/model/repairer/model/err_localize_edit.py", line 851, in forward_helper_decode
    allHyp, allScores = self.beam_decode(hidden, enc_output, mask, extra_feed, src_vocabs, src_map)
  File "/workspace/DrRepair/model/repairer/model/err_localize_edit.py", line 971, in beam_decode
    beam.advance(log_probs, attn)
  File "/workspace/DrRepair/model/repairer/model/beam_search_onmt.py", line 303, in advance
    [self.alive_seq.index_select(0, self.select_indices),
RuntimeError: "index_select_out_cuda_impl" not implemented for 'Float'

TRAIN: 3%|▎ | 5000/150000 [15:54<7:41:08, 5.24it/s]`

I've tried many approaches but always get this error. Maybe it's a problem with the torch/CUDA version?
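`index_select` needs an integer index tensor, so one thing worth checking is whether the indices passed to it have become floating point. In recent PyTorch releases, dividing an integer tensor with `div` performs true division and returns floats, while `rounding_mode="floor"` keeps the integer dtype. The sketch below shows the difference on made-up values. The variable names only echo the traceback, and whether this is what actually happens inside `beam_search_onmt.py` is an assumption.

```python
import torch

vocab_size = 4
topk_ids = torch.tensor([5, 2, 7])   # flat indices into a (beam, vocab) grid

# True division: the result is a floating-point tensor.
true_div = topk_ids.div(vocab_size)

# Floor division: the result keeps the integer dtype of the input.
floor_div = torch.div(topk_ids, vocab_size, rounding_mode="floor")

# index_select accepts the integer indices.
alive_seq = torch.arange(6).reshape(3, 2)   # three hypotheses, two tokens each
selected = alive_seq.index_select(0, floor_div)
```

If float indices turn out to be the cause, casting them with `.long()` right before the `index_select` call would be another option.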

Thanks in advance~

0% trained data trained

I keep following the steps in the README, but whatever I do, the training phase always shows TRAIN: 0% |.

MacOS - TypeError: type torch.cuda.FloatTensor not available. Torch not compiled with CUDA enabled.

Since MacBooks do not have NVIDIA GPUs, CUDA won't work on them.

Is there a workaround for this?

This occurs while running:

name="code-compiler--2l-graph"
mkdir -p out_deepfix/${name}
python3 -u main_deepfix.py -o ${name} train \
    configs/base.yml  configs/data-deepfix/err-data-orig.yml \
    configs/model-code-compiler/2l-graph--dec-attn-all.yml
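A common pattern for running PyTorch code on machines without CUDA is to pick the device at runtime and create tensors on it, instead of using `torch.cuda.FloatTensor` directly. The sketch below shows that pattern only. `pick_device` is a hypothetical helper, and the source does not say whether every code path in DrRepair works without CUDA.

```python
import torch


def pick_device():
    """Return a CUDA device when available, otherwise fall back to the CPU."""
    return torch.device("cuda" if torch.cuda.is_available() else "cpu")


device = pick_device()

# Device-agnostic replacement for torch.cuda.FloatTensor(2, 3).zero_()
x = torch.zeros(2, 3, device=device)
```

Training on the CPU is likely to be much slower than the GPU timings shown elsewhere in these issues.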

About the license

Hello, it seems there is no license information in this repository.
Can I use your code and dataset?

Is there a pretrained model we could use?

Undefined name threshold_ON in utils/repair_utils.py

Hello, I found: DrRepair/utils/repair_utils.py:182:12: F821 undefined name 'threshold_ON'
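F821 is the linter code for a name that is used but never defined. Python only raises the resulting `NameError` when the offending line actually runs, so such code can appear to work until that branch is reached. The toy function below illustrates this. It is not taken from `repair_utils.py`, and `limit` is a deliberately undefined, hypothetical name.

```python
def check(score, strict):
    # 'limit' is deliberately undefined, mirroring an F821 finding.
    if strict:
        return score > limit  # raises NameError only when this branch runs
    return score > 0


ok = check(1, strict=False)   # works: the bad line is never reached

try:
    check(1, strict=True)
    failed = False
except NameError:
    failed = True
```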

about test

Dear Dr. Yasunaga
Hello, dear author @michiyasunaga, I am reproducing your paper (Graph-based, Self-Supervised Program Repair from Diagnostic Feedback).
I followed the steps in your GitHub repository. When I evaluated the model (code-compiler--2l-graph), I found that the total number of passed and failed programs is 1,260. This differs from the paper (6,971). Is this path (/data/err-data-compiler--auto-corrupt--orig-deepfix/bin4) the real path of the test set?
How can I evaluate on the raw DeepFix test set?

I would be very grateful indeed for any help you could give me.

Best wishes
