miyyer / scpn Goto Github PK

syntactically controlled paraphrase networks

Shell 0.09% Python 99.91%

scpn's Issues

Approximately how much time is required to train the parse_generator and scpn models?

Hi Guys,

It has been 10 days since I started training the scpn model, and in that time it has completed only 4 of 15 epochs with the DEFAULT settings.

Machine configuration:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.54                 Driver Version: 396.54                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 000075E1:00:00.0 Off |                    0 |
| N/A   67C    P0    89W / 149W |   9896MiB / 11441MiB |     84%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     47372      C   xyzz/python2.7                              1440MiB |
|    0     82095      C   scpn/python                                 8443MiB |
+-----------------------------------------------------------------------------+

Using the above system, can you tell me approximately how long it will take to train the scpn model (DEFAULT settings)?
Also, scpn is used together with a second model, "parse_generator". Can you tell me how long that model (DEFAULT settings) will take to train?

Thanks.
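
For a rough sense of scale, the figures in this report can be extrapolated linearly. This is only a back-of-the-envelope sketch: it assumes every epoch takes about as long as the first four did, and the function name is made up for illustration.

```python
def estimate_training_days(days_elapsed, epochs_done, epochs_total):
    """Linearly extrapolate total and remaining time from the progress so far."""
    days_per_epoch = days_elapsed / epochs_done
    total_days = days_per_epoch * epochs_total
    return days_per_epoch, total_days, total_days - days_elapsed


# Figures from the report above: 10 days for 4 of 15 epochs.
per_epoch, total, remaining = estimate_training_days(10, 4, 15)
print(per_epoch, total, remaining)  # 2.5 37.5 27.5
```

On these assumptions the full run would take about 37.5 days on the shared K80, with about 27.5 days still to go.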

License of your SCPN code

Hi, would you please let me know what the license of SCPN is?

Using the Supplied Training Data for Training

Has anyone had any success using the supplied training data for training (with Python 2.7 and PyTorch 0.3.1)?

It appears to be untrainable on my machine: there is an infinite loop in the enumeration of the minibatches.

z = indexify_transformations(in_p, out_p, label_voc, args)
if z == None:
    continue

The above code always produces None for z, so every minibatch is skipped and the loop never makes progress.
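
One way to narrow this down is to count how many minibatches are rejected before training starts, and fail loudly if none survive. The sketch below is hypothetical: `toy_indexify` is a stand-in that rejects unknown labels, not the repository's `indexify_transformations(in_p, out_p, label_voc, args)`, and the actual reason for the `None` results in train_scpn.py is not known from this report.

```python
def count_usable(batches, indexify):
    """Return (usable, skipped); `indexify` returns None for a rejected batch."""
    usable = skipped = 0
    for batch in batches:
        if indexify(batch) is None:
            skipped += 1
            continue
        usable += 1
    return usable, skipped


def check_data(batches, indexify):
    """Raise instead of silently skipping when every batch is rejected."""
    usable, skipped = count_usable(batches, indexify)
    if usable == 0:
        raise ValueError("all %d minibatches were rejected; "
                         "check preprocessing and vocabularies" % skipped)
    return usable, skipped


# Hypothetical stand-in: a batch is rejected if it contains an unknown label.
toy_vocab = {"NP": 0, "VP": 1}

def toy_indexify(batch):
    try:
        return [toy_vocab[label] for label in batch]
    except KeyError:
        return None


print(check_data([["NP", "VP"], ["XX"]], toy_indexify))  # (1, 1)
```

If a check like this raises on the supplied data, the problem is more likely in the data or vocabulary files than in the training loop itself.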

@miyyer @jwieting

How should I preprocess the data?

If I just want to train the SCPN model, I only need to preprocess the para-nmt dataset. But what if I want to use SCPN to generate syntactically adversarial examples for a downstream task? Should I preprocess (for example, tokenize and apply BPE to) the para-nmt dataset together with the downstream task's dataset? How did you preprocess the SST and SICK data? @miyyer @jwieting Thank you very much!

issue in demo.sh

Thanks for sharing the code.

While executing the loop below, I am facing two issues.

# loop over sentences and transform them
    for d_idx, ex in enumerate(inrdr):
        stime = time.time()

1. Exception: cuda runtime error: device-side assert triggered at /pytorch/aten/src/THC/THCTensorCopy.cpp:20
raised at line 3 of the following snippet:

# add EOS
        seg_sent.append(pp_vocab['EOS'])
        torch_sent = Variable(torch.from_numpy(np.array(seg_sent, dtype='int32')).long().cuda())
        torch_sent_len = torch.from_numpy(np.array([len(seg_sent)], dtype='int32')).long().cuda()

2. Exception: tensor(101,device='cuda:0')
raised at line 2 of the following snippet:

for b_idx in beam_dict:
                prob,_,_,seq = beam_dict[b_idx][0]
                gen_parse = ' '.join([rev_label_voc[z] for z in seqs[b_idx]])
                gen_sent = ' '.join([rev_pp_vocab[w] for w in seq[:-1]])

When I tried to debug, I found the following:
seqs contains [[tensor(101,device='cuda:0'),tensor(1,device='cuda:0')....etc]]
seq contains [tensor(5,device='cuda:0'),tensor(448,device='cuda:0')....etc]
rev_label_voc contains {0:'NN',1:'VB'...etc}
rev_pp_vocab contains {'is':229,'am':43...etc}
As the variables and their values show, it is failing because of a KeyError.

Any suggestions as to why this is happening?
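
A possible explanation, offered as a guess from the values printed above: the dictionary keys are plain Python ints, while the lookup keys are 0-dim tensors, and as far as I know a tensor hashes by object identity, so it never matches an int key. In addition, rev_pp_vocab as printed maps words to indices rather than indices to words. The sketch below uses made-up helper names and toy dictionaries; `int()` works on plain ints and, to my knowledge, on single-element tensors as well.

```python
def lookup_tokens(index_to_token, ids):
    """Map ids to tokens, coercing each id to a plain int before the lookup."""
    return [index_to_token[int(i)] for i in ids]


def invert(token_to_index):
    """Turn a word -> index dictionary into an index -> word dictionary."""
    return {idx: tok for tok, idx in token_to_index.items()}


# Toy values copied from the debug output above.
rev_label_voc = {0: 'NN', 1: 'VB'}
word_to_index = {'is': 229, 'am': 43}
index_to_word = invert(word_to_index)

print(lookup_tokens(rev_label_voc, [1, 0]))     # ['VB', 'NN']
print(lookup_tokens(index_to_word, [229, 43]))  # ['is', 'am']
```

In the loop from demo.sh this would amount to writing `rev_label_voc[int(z)]` and `rev_pp_vocab[int(w)]`, assuming both dictionaries really are keyed by index.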

@miyyer @jwieting

An issue with the implementation of the forward function in train_scpn.py

Line 241 in train_scpn.py:
copy_probs = copy_probs.view(-1);
I think it should be:
copy_probs = copy_probs.transpose(0, 1).contiguous().view(-1)
because decoder_states and decoder_copy_dists are both transposed before being flattened:
line 234: decoder_states = decoder_states.transpose(0, 1).contiguous().view(-1, self.d_hid * 2)
line 238: decoder_copy_dists = decoder_copy_dists.transpose(0, 1).contiguous().view(-1, self.len_voc)
@miyyer @jwieting
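
To illustrate why the flattening order matters, here is a small sketch with nested lists in place of tensors. It assumes copy_probs has shape (batch, time), which is not confirmed by the source; each toy value is labelled with its batch and timestep so the resulting order is visible.

```python
def flatten(rows):
    """Row-major flatten, analogous to x.view(-1)."""
    return [x for row in rows for x in row]


def transpose(rows):
    """Swap the two axes, analogous to x.transpose(0, 1)."""
    return [list(col) for col in zip(*rows)]


# Assumed shape (batch=2, time=3); labels are b<batch>t<timestep>.
copy_probs = [["b0t0", "b0t1", "b0t2"],
              ["b1t0", "b1t1", "b1t2"]]

batch_major = flatten(copy_probs)
time_major = flatten(transpose(copy_probs))

print(batch_major)  # ['b0t0', 'b0t1', 'b0t2', 'b1t0', 'b1t1', 'b1t2']
print(time_major)   # ['b0t0', 'b1t0', 'b0t1', 'b1t1', 'b0t2', 'b1t2']
```

If the other two tensors are flattened in time-major order while copy_probs is flattened in batch-major order, row i of one no longer refers to the same (batch, timestep) pair as element i of the other, except when the batch size or sequence length is 1.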

Question about training time

I am trying to train the scpn model, but the training data is very large. I use one GPU with a batch size of 64, and each batch takes 1.6 seconds to train, but there are 439586 batches. I tried to train on two GPUs but failed. Could you tell me how you sped up the training process? Thank you so much. @miyyer @jwieting
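
For reference, the numbers above work out as follows. This sketch assumes that the 439586 batches make up one pass over the data and that the 1.6 seconds per batch stays constant; the function name is illustrative only.

```python
def epoch_time(num_batches, seconds_per_batch):
    """Return the duration of one pass over the data in (hours, days)."""
    seconds = num_batches * seconds_per_batch
    return seconds / 3600.0, seconds / 86400.0


hours, days = epoch_time(439586, 1.6)
print(round(hours, 1), round(days, 2))  # 195.4 8.14
```

So a single pass at this speed takes a little over eight days on one GPU.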

Would this work well on question paraphrasing?

Hi, I was wondering if you think this would work well with question paraphrasing? Any insights would be great! Thanks.

Eric

Some questions about trans_embs

Hello, thank you so much for sharing the code; I have learned a lot from it. I have some questions about trans_embs in this code.

In train_scpn.py, SCPN's parameter "len_parse_voc" is 103, which suggests that the parse vocabulary does not include the token 'EOP'. During SCPN training, however, indexify_transformations() is called to get valid transformation instances. This function calls deleaf(), which appends 'EOP' to the end of the parse. Since 'EOP' is not in the parse vocabulary, converting the parse tags to indices will fail.
The parses generated by ParseNet might contain the token 'EOP', but the shape of trans_embs in SCPN is (103*56), which means the embedding table does not include 'EOP'. This will cause errors when running generate_paraphrases.py.
SCPN and ParseNet use different trans_embs. What if they shared the same trans_embs?

RuntimeError: cuda runtime error (59) / Ranking output sentences

Hi @miyyer,

I ran demo.sh with PyTorch 0.4.0 under both Python 2.7 and 3.6 (GPU usage is about 1.4 GB), and both give me this error:

/pytorch/aten/src/THC/THCTensorIndex.cu:360: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2, IndexIsMajor = true]: block: [113,0,0], thread: [94,0,0] Assertion ``srcIndex < srcSelectDimSize`` failed.

/pytorch/aten/src/THC/THCTensorIndex.cu:360: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2, IndexIsMajor = true]: block: [113,0,0], thread: [95,0,0] Assertion ``srcIndex < srcSelectDimSize`` failed.

THCudaCheck FAIL file=/pytorch/aten/src/THC/generic/THCTensorCopy.c line=70 error=59 : device-side assert triggered
beam search OOM 2 1.59588813782
Traceback (most recent call last):
  File "generate_paraphrases.py", line 187, in <module>
    encode_data(out_file=args.out_file)
  File "generate_paraphrases.py", line 70, in encode_data
    torch_sent = Variable(torch.from_numpy(np.array(seg_sent, dtype='int32')).long().cuda())
RuntimeError: cuda runtime error (59) : device-side assert triggered at /pytorch/aten/src/THC/generic/THCTensorCopy.c:20

I have tried several times, but it always returns this error, in particular at THCTensorCopy.c:20. I hope you can help me with this.
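
The assertion `srcIndex < srcSelectDimSize` is a bounds check on an index lookup: some index is at least as large as the number of rows in the tensor being indexed, which is typically an embedding table. Because CUDA reports such failures asynchronously, the traceback can point at a later, unrelated line. A minimal sketch of a check that could be run on the CPU before the lookup, with a made-up helper name and illustrative numbers:

```python
def out_of_range(indices, table_size):
    """Return (position, index) pairs that are invalid for a table with table_size rows."""
    return [(pos, idx) for pos, idx in enumerate(indices)
            if not 0 <= idx < table_size]


# Illustrative numbers only: a 103-row table and a sequence containing index 103.
bad = out_of_range([5, 101, 103], table_size=103)
print(bad)  # [(2, 103)]
```

Running the same script on the CPU, or with the environment variable CUDA_LAUNCH_BLOCKING=1, usually yields an error that points at the actual failing lookup.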

Running time

Hello,
I used your model to augment my dataset for sentiment analysis. It performs well on small datasets. However, it is slow: about 4 seconds per item. On a dataset of 500000 items, it would take about three weeks to complete.

Do you have any idea how I can speed up the execution?

Can I use the same model to train on a Spanish-language dataset?

Hi,

Thanks for sharing the model. Great work.

I want to generate variations of Spanish text, as the model does for English. Can I do this with the same model?

Also, do you know of any Spanish text corpus that could be used to train text variation?

Thanks.

training from scratch

When run, train_scpn.py just freezes at 1123 MB of GPU usage, with about 15 GB of memory used by the dataset. It prints nothing and does nothing. Has anyone seen this issue?

Python 2.7, latest PyTorch

Question about copy mechanism implementation

Thank you for sharing your code!
I have one question about the copy mechanism implementation

As far as I could see, you calculate the final word distribution as:
(1 - p_copy) * log word_dist_from_decoder + p_copy * log word_dist_by_copy

I think the logarithm should be removed.
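
To make the difference concrete, here is a small sketch with toy distributions. It takes the formula as quoted above at face value and does not verify what the repository actually computes. A weighted sum of log-probabilities corresponds to a weighted geometric mean of the two distributions, which in general no longer sums to 1, whereas a weighted sum of the probabilities themselves does.

```python
import math


def mix_linear(p_copy, gen, copy):
    """(1 - p_copy) * gen + p_copy * copy, applied to probabilities."""
    return [(1 - p_copy) * g + p_copy * c for g, c in zip(gen, copy)]


def mix_logs(p_copy, gen, copy):
    """exp of (1 - p_copy) * log gen + p_copy * log copy."""
    return [math.exp((1 - p_copy) * math.log(g) + p_copy * math.log(c))
            for g, c in zip(gen, copy)]


# Toy distributions over a three-word vocabulary.
gen = [0.7, 0.2, 0.1]
copy = [0.1, 0.1, 0.8]

linear = mix_linear(0.5, gen, copy)
geometric = mix_logs(0.5, gen, copy)

print(sum(linear))     # 1, up to floating-point rounding
print(sum(geometric))  # noticeably below 1
```

If the loss expects log-probabilities, the linear mixture can still be used by taking its logarithm after mixing.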

The command you ran to parse the ParaNMT dataset

Hi, thanks for the paper and the interesting topic. I would like to apply your pre-trained model to my own data using the command given:

java -Xmx12g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLP -threads 1 -annotators tokenize,ssplit,pos,parse -ssplit.eolonly -filelist filenames.txt -outputFormat text -parse.model edu/stanford/nlp/models/srparser/englishSR.ser.gz -outputDirectory /outputdir/

I do not have a CS background. Could anyone explain this command? Which software do I need to run, and how should I adapt the arguments to my own data? Thanks...
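
For orientation, the command runs the Stanford CoreNLP Java toolkit from the directory that contains its .jar files. As far as the flags go: -Xmx12g lets Java use up to 12 GB of memory, -cp "*" puts every jar in the current directory on the classpath, -annotators selects tokenization, sentence splitting, part-of-speech tagging and parsing, -ssplit.eolonly treats each input line as one sentence, -filelist names a text file listing the input files, -parse.model selects the shift-reduce parser model (which, to my knowledge, is a separate download from the main package), and -outputDirectory is where the parses are written. The sketch below only writes the file list and assembles the same command; the function name and the input file name in the usage note are hypothetical.

```python
def build_corenlp_command(input_files, output_dir,
                          list_path="filenames.txt", memory="12g"):
    """Write the list of input files and return the command as an argument list."""
    with open(list_path, "w") as f:
        f.write("\n".join(input_files) + "\n")
    return ["java", "-Xmx" + memory, "-cp", "*",
            "edu.stanford.nlp.pipeline.StanfordCoreNLP",
            "-threads", "1",
            "-annotators", "tokenize,ssplit,pos,parse",
            "-ssplit.eolonly",
            "-filelist", list_path,
            "-outputFormat", "text",
            "-parse.model", "edu/stanford/nlp/models/srparser/englishSR.ser.gz",
            "-outputDirectory", output_dir]
```

A call such as `build_corenlp_command(["my_sentences.txt"], "/outputdir/")` returns a list that can be passed to `subprocess.call` from inside the CoreNLP directory, with one sentence per line in my_sentences.txt.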
