miyyer / scpn
syntactically controlled paraphrase networks
Hi Guys,
It's been 10 days since I started training the SCPN model, and in that time training has completed only 4 of 15 epochs with the DEFAULT settings.
Machine configuration:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.54 Driver Version: 396.54 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla K80 Off | 000075E1:00:00.0 Off | 0 |
| N/A 67C P0 89W / 149W | 9896MiB / 11441MiB | 84% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 47372 C xyzz/python2.7 1440MiB |
| 0 82095 C scpn/python 8443MiB |
+-----------------------------------------------------------------------------+
Using the above system, can you tell me approximately how long it will take to train the SCPN model (DEFAULT settings)?
Also, since SCPN depends on another model, "parse_generator", can you tell me how long that model (DEFAULT settings) will take to train?
Thanks.
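As a rough back-of-the-envelope check on the numbers in this question, the progress so far can be extrapolated linearly. This is only a sketch: it assumes every epoch takes the same wall-clock time, which may not hold (the nvidia-smi output above shows a second process sharing the GPU), and the function name is made up for illustration.

```python
def estimate_training_days(days_elapsed, epochs_done, epochs_total):
    """Linear extrapolation; assumes each epoch takes the same wall-clock time."""
    days_per_epoch = days_elapsed / float(epochs_done)
    total_days = days_per_epoch * epochs_total
    remaining_days = total_days - days_elapsed
    return days_per_epoch, total_days, remaining_days


# Figures reported in the question: 4 of 15 epochs finished after 10 days.
days_per_epoch, total_days, remaining_days = estimate_training_days(10, 4, 15)
print(days_per_epoch, total_days, remaining_days)
```

Under that assumption the full 15-epoch run would take roughly five to six weeks on this machine. It says nothing about parse_generator, for which the question gives no timing data.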
Hi, would you please let me know what the license of SCPN is?
Has anyone had any success training with the supplied training data (with Python 2.7 and Pytorch 3.1)?
It appears to be non-trainable on my machine: there is an infinite loop within the enumeration of the minibatches.
    z = indexify_transformations(in_p, out_p, label_voc, args)
    if z == None:
        continue
The above code always produces a z of None, which results in an infinite loop.
If I just want to train the SCPN model, I only need to preprocess the para-nmt dataset. But what if I want to use SCPN to generate syntactically adversarial examples for a downstream task? Should I preprocess (for example, tokenize and apply BPE to) the para-nmt dataset together with the downstream task's dataset? How did you preprocess the SST and SICK data? @miyyer @jwieting Thank you very much!
Thanks for sharing the code.
While executing the loop below, I'm facing two issues.
    # loop over sentences and transform them
    for d_idx, ex in enumerate(inrdr):
        stime = time.time()
1. Exception: cuda runtime error : device-side assert triggered at /pytorch/aten/src/THC/THCTensorCopy.cpp:20
at line 3 of:
    # add EOS
    seg_sent.append(pp_vocab['EOS'])
    torch_sent = Variable(torch.from_numpy(np.array(seg_sent, dtype='int32')).long().cuda())
    torch_sent_len = torch.from_numpy(np.array([len(seg_sent)], dtype='int32')).long().cuda()
2. Exception: tensor(101,device='cuda:0')
at line 2 of:
    for b_idx in beam_dict:
        prob,_,_,seq = beam_dict[b_idx][0]
        gen_parse = ' '.join([rev_label_voc[z] for z in seqs[b_idx]])
        gen_sent = ' '.join([rev_pp_vocab[w] for w in seq[:-1]])
When I tried to debug, I found the following:
seqs contains [[tensor(101,device='cuda:0'),tensor(1,device='cuda:0')....etc]]
seq contains [tensor(5,device='cuda:0'),tensor(448,device='cuda:0')....etc]
rev_label_voc contains {0:'NN',1:'VB'...etc}
rev_pp_vocab contains {'is':229,'am':43...etc}
As you can see from the variables and their values, it fails because of a KeyError.
Any suggestions as to why this is happening?
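One likely reading of the second exception: the ids in seqs and seq are 0-dim tensors, and a tensor used as a dictionary key is hashed by object identity rather than by value, so it never matches an integer key such as 101. Converting each id with int() before the lookup avoids that. The sketch below uses a small stand-in class instead of real tensors so that it runs without PyTorch; FakeScalar, ids_to_tokens and invert_vocab are hypothetical names, not part of the repository. It also inverts a word-to-id dictionary, because the debug output above shows rev_pp_vocab keyed by word rather than by id.

```python
class FakeScalar(object):
    """Stand-in for a 0-dim tensor: hashed by identity, convertible with int()."""

    def __init__(self, value):
        self.value = value

    def __int__(self):
        return self.value


def ids_to_tokens(ids, id_to_token):
    """Convert every id to a plain int before using it as a dictionary key."""
    return [id_to_token[int(i)] for i in ids]


def invert_vocab(token_to_id):
    """Turn a {token: id} mapping into the {id: token} mapping needed for decoding."""
    return {idx: tok for tok, idx in token_to_id.items()}


rev_label_voc = {0: 'NN', 1: 'VB'}
seq_ids = [FakeScalar(1), FakeScalar(0)]

# A direct lookup fails because the key is the object, not the number it holds.
direct_lookup_works = seq_ids[0] in rev_label_voc

labels = ids_to_tokens(seq_ids, rev_label_voc)

word_to_id = {'is': 229, 'am': 43}
words = ids_to_tokens([FakeScalar(229), FakeScalar(43)], invert_vocab(word_to_id))
print(direct_lookup_works, labels, words)
```

With real tensors the same int(z) conversion applies. Whether id 101 is actually present in rev_label_voc is a separate question that this sketch cannot answer.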
Line 241 in train_scpn.py:
copy_probs = copy_probs.view(-1);
I think it should be:
copy_probs = copy_probs.transpose(0, 1).contiguous().view(-1)
because decoder_states and decoder_copy_dists are transposed:
line 234: decoder_states = decoder_states.transpose(0, 1).contiguous().view(-1, self.d_hid * 2)
line 238: decoder_copy_dists = decoder_copy_dists.transpose(0, 1).contiguous().view(-1, self.len_voc)
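To see why the order of transpose and flatten matters here, consider a tiny 2 x 3 matrix flattened both ways. This is a pure-Python illustration with nested lists rather than tensors, and the labelling of rows as batch items and columns as time steps is an assumption made only for the example.

```python
def flatten(matrix):
    """Row-major flattening, the analogue of view(-1) on a contiguous tensor."""
    return [x for row in matrix for x in row]


def transpose(matrix):
    """Swap the two axes, the analogue of transpose(0, 1)."""
    return [list(col) for col in zip(*matrix)]


# Hypothetical layout: 2 batch items (rows) x 3 time steps (columns).
probs = [['b0t0', 'b0t1', 'b0t2'],
         ['b1t0', 'b1t1', 'b1t2']]

flat_direct = flatten(probs)
flat_after_transpose = flatten(transpose(probs))
print(flat_direct)
print(flat_after_transpose)
```

The two flattened orders differ, so if one tensor is flattened directly and another is transposed first, position i in one no longer refers to the same (batch item, time step) pair as position i in the other.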
@miyyer @jwieting
I am trying to train the SCPN model, but the training data is too large. I use one GPU with a batch size of 64, and each batch takes 1.6 seconds to train, but there are 439586 batches. I tried to train on two GPUs but failed. Could you tell me how you sped up the training process? Thank you so much. @miyyer @jwieting
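For scale, the figures quoted in this question work out as follows. The 15-epoch count is taken from the DEFAULT settings mentioned in an earlier question on this page, and it is an assumption that the same setting applies here.

```python
SECONDS_PER_DAY = 86400.0

batches_per_epoch = 439586   # reported in the question
seconds_per_batch = 1.6      # reported in the question
epochs = 15                  # assumption: default epoch count quoted earlier on this page

days_per_epoch = batches_per_epoch * seconds_per_batch / SECONDS_PER_DAY
days_total = days_per_epoch * epochs
print(round(days_per_epoch, 2), round(days_total, 1))
```

That is a little over eight days per epoch on a single GPU at the reported speed, so any speedup has to come from faster batches, fewer batches (for example a subset of the data), or fewer epochs.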
Hi, I was wondering if you think this would work well with question paraphrasing? Any insights would be great! Thanks.
Eric
Hello, thank you so much for sharing the code with us. I have learned a lot. Thank you so much! But I have some questions about the trans_embs in this code.
Hi @miyyer,
I ran demo.sh with PyTorch 0.4.0 in both Python 2.7 and 3.6. GPU usage is about 1.4 GB, and both give me the error:
/pytorch/aten/src/THC/THCTensorIndex.cu:360: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2, IndexIsMajor = true]: block: [113,0,0], thread: [94,0,0] Assertion ``srcIndex < srcSelectDimSize`` failed.
/pytorch/aten/src/THC/THCTensorIndex.cu:360: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2, IndexIsMajor = true]: block: [113,0,0], thread: [95,0,0] Assertion ``srcIndex < srcSelectDimSize`` failed.
THCudaCheck FAIL file=/pytorch/aten/src/THC/generic/THCTensorCopy.c line=70 error=59 : device-side assert triggered
beam search OOM 2 1.59588813782
Traceback (most recent call last):
  File "generate_paraphrases.py", line 187, in <module>
    encode_data(out_file=args.out_file)
  File "generate_paraphrases.py", line 70, in encode_data
    torch_sent = Variable(torch.from_numpy(np.array(seg_sent, dtype='int32')).long().cuda())
RuntimeError: cuda runtime error (59) : device-side assert triggered at /pytorch/aten/src/THC/generic/THCTensorCopy.c:20
I tried a few times but it always returns this error, especially at THCTensorCopy.c:20. I hope you can help me with this.
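The assertion srcIndex < srcSelectDimSize in the log above generally means that some index passed to an index-select or embedding lookup is not smaller than the size of the dimension being indexed. Because CUDA kernels run asynchronously, the Python line named in the traceback may not be the line that caused it. Setting the environment variable CUDA_LAUNCH_BLOCKING=1, or running once on the CPU, usually gives a more precise location. A minimal sketch of a pre-check on the token ids, assuming they are available as a plain list of integers before being moved to the GPU (find_out_of_range is a hypothetical helper, not part of the repository):

```python
def find_out_of_range(ids, vocab_size):
    """Return (position, id) pairs for ids that cannot index a table of vocab_size rows."""
    return [(pos, idx) for pos, idx in enumerate(ids)
            if idx < 0 or idx >= vocab_size]


# Toy example: a 5-entry vocabulary accepts ids 0..4 only.
example_ids = [1, 4, 5, -1]
bad = find_out_of_range(example_ids, vocab_size=5)
print(bad)
```

An empty result for every sentence would point away from the input ids and toward indices produced later, for example inside beam search.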
Hello
I used your model to extend my dataset for sentiment analysis. It performs well on small datasets. However, it is time-consuming: about 4 seconds per item. With a dataset of 500000 items, it would take about three weeks to complete.
Do you have any idea how I can speed up execution?
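One generic option, independent of the model itself, is to split the input into shards and run several generation processes in parallel, one per available GPU. The sketch below only does the splitting and the time estimate. It assumes the items are independent and that per-item time stays at the reported 4 seconds on every worker. Both helper names are made up for illustration.

```python
def shard(items, num_shards):
    """Split items into num_shards contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), num_shards)
    shards, start = [], 0
    for i in range(num_shards):
        size = base + (1 if i < extra else 0)
        shards.append(items[start:start + size])
        start += size
    return shards


def estimated_days(num_items, seconds_per_item, num_workers=1):
    """Wall-clock estimate assuming perfectly parallel workers of equal speed."""
    return num_items * seconds_per_item / float(num_workers) / 86400.0


example_shards = shard(list(range(10)), 3)
single_worker_days = estimated_days(500000, 4)
four_worker_days = estimated_days(500000, 4, num_workers=4)
print(example_shards, round(single_worker_days, 1), round(four_worker_days, 1))
```

The single-worker figure matches the roughly three weeks reported in the question. Each shard would be written to its own input file and passed to a separate run of the generation script.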
Hi,
Thanks for sharing the model. Great work.
I want to generate variations of Spanish text, similar to what the model does for English. Can I do this using the same model?
Also, do you know of any Spanish text corpus that could be used to train text variation?
Thanks.
When run, train_scpn.py just freezes at 1123 MB of GPU usage, with about 15 GB of memory used by the dataset. It prints nothing and does nothing. Has anyone seen this issue?
Python 2.7, latest PyTorch
Thank you for sharing your code!
I have one question about the copy mechanism implementation.
As far as I could see, you calculate the final word distribution as:
(1 - p_copy) * log word_dist_from_decoder + p_copy * log word_dist_by_copy
I think the logarithm should be removed.
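The difference between the two forms can be checked numerically. Mixing log-probabilities with weights (1 - p_copy) and p_copy is the logarithm of a weighted geometric mean of the two distributions, whereas mixing the probabilities themselves gives a weighted arithmetic mean. Only the latter is guaranteed to sum to 1. The sketch below uses made-up three-word distributions and is not taken from the repository code.

```python
import math


def mix_probabilities(p_copy, gen_dist, copy_dist):
    """Arithmetic mixture: (1 - p) * P_gen + p * P_copy."""
    return [(1 - p_copy) * g + p_copy * c for g, c in zip(gen_dist, copy_dist)]


def mix_log_probabilities(p_copy, gen_dist, copy_dist):
    """exp((1 - p) * log P_gen + p * log P_copy), i.e. a weighted geometric mean."""
    return [math.exp((1 - p_copy) * math.log(g) + p_copy * math.log(c))
            for g, c in zip(gen_dist, copy_dist)]


# Hypothetical distributions over a three-word vocabulary.
gen_dist = [0.7, 0.2, 0.1]
copy_dist = [0.1, 0.1, 0.8]
p_copy = 0.5

arithmetic = mix_probabilities(p_copy, gen_dist, copy_dist)
geometric = mix_log_probabilities(p_copy, gen_dist, copy_dist)
print(sum(arithmetic), sum(geometric))
```

The arithmetic mixture sums to 1, while the geometric one sums to less than 1 whenever the two distributions differ, so the two expressions are not interchangeable as a normalized word distribution.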
Hi, thanks for the paper and the interesting topic. I would like to apply your pre-trained model to my own data using the command given:
java -Xmx12g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLP -threads 1 -annotators tokenize,ssplit,pos,parse -ssplit.eolonly -filelist filenames.txt -outputFormat text -parse.model edu/stanford/nlp/models/srparser/englishSR.ser.gz -outputDirectory /outputdir/
I do not have a CS background. Could anyone explain this command? Which software do I need to run it, and how should I fill in the values for my own data? Thanks...
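Some context on the command above: it runs Stanford CoreNLP, a Java toolkit, so it needs Java installed and is normally launched from the directory that contains the CoreNLP .jar files. -Xmx12g lets Java use up to 12 GB of memory, -cp "*" puts every jar in the current directory on the classpath, -annotators selects tokenization, sentence splitting, part-of-speech tagging and parsing, -ssplit.eolonly treats each input line as one sentence, -filelist names a text file that lists the input files one per line, -parse.model selects the English shift-reduce parser model (which has to be available on the classpath), and -outputDirectory is where the parsed output is written. The sketch below assembles the same command from Python without running it. The helper names and the example paths are hypothetical.

```python
import os
import tempfile


def write_filelist(input_paths, filelist_path):
    """Write one input file path per line, the format expected by -filelist."""
    with open(filelist_path, "w") as handle:
        for path in input_paths:
            handle.write(path + "\n")


def build_corenlp_command(filelist_path, output_dir, heap="12g"):
    """Assemble the command from the post as an argument list (nothing is executed)."""
    return ["java", "-Xmx" + heap, "-cp", "*",
            "edu.stanford.nlp.pipeline.StanfordCoreNLP",
            "-threads", "1",
            "-annotators", "tokenize,ssplit,pos,parse",
            "-ssplit.eolonly",
            "-filelist", filelist_path,
            "-outputFormat", "text",
            "-parse.model", "edu/stanford/nlp/models/srparser/englishSR.ser.gz",
            "-outputDirectory", output_dir]


# Hypothetical paths: one sentence per line in each input file.
work_dir = tempfile.mkdtemp()
filelist_path = os.path.join(work_dir, "filenames.txt")
write_filelist(["my_sentences.txt"], filelist_path)

command = build_corenlp_command(filelist_path, os.path.join(work_dir, "parsed"))
print(" ".join(command))
```

The printed line can be pasted into a terminal opened in the CoreNLP directory. Only the file list and the output directory normally need to change for new data.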