uclanlp / synpg

Code for our EACL-2021 paper "Generating Syntactically Controlled Paraphrases without Using Annotated Parallel Pairs".

License: MIT License

Languages: Shell 2.63%, Python 97.37%

Topics: paraphrase-generation, disentangled-representations

synpg's Issues

TypeError: unsupported operand type(s) for +: 'int' and 'str'

Hi, when I run train_synpg.sh, I get the following error:

Traceback (most recent call last):
  File "train_synpg.py", line 285, in <module>
    train(epoch, model, train_data, valid_data, train_loader, valid_loader, optimizer, criterion, dictionary, bpe, args)
  File "train_synpg.py", line 90, in train
    sent_ = bpe.segment(sent_).split()
  File "/home/dingp/synpg-master/subwordnmt/apply_bpe.py", line 59, in segment
    new_word = [out for segment in self._isolate_glossaries(word)
  File "/home/dingp/synpg-master/subwordnmt/apply_bpe.py", line 60, in <listcomp>
    for out in encode(segment,
  File "/home/dingp/synpg-master/subwordnmt/apply_bpe.py", line 144, in encode
    word = tuple(orig[:-1]) + ( orig[-1] + '</w>',)
TypeError: unsupported operand type(s) for +: 'int' and 'str'

It seems something is wrong with the BPE step. Could you help me? Thanks!
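
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the expression on line 144 fails in exactly this way when the word passed to the BPE encoder is a bytes object instead of a str, because indexing bytes in Python 3 yields an int. That could happen if the sentences come back from the h5 files as bytes (newer h5py versions return stored strings as bytes). The sketch below reproduces the error with a stand-in for the failing expression and shows that decoding first avoids it. The helper names add_word_end_marker and to_text are hypothetical and are not part of the repository.

```python
def add_word_end_marker(word):
    """Stand-in for the failing expression from apply_bpe.py, line 144."""
    return tuple(word[:-1]) + (word[-1] + '</w>',)


def to_text(value, encoding='utf-8'):
    """Hypothetical helper: decode bytes to str, leave str untouched."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


raw = b'hello'  # assumed shape of a value read back from an h5 file

# Indexing bytes gives an int, so int + str raises the TypeError from the issue.
error_message = None
try:
    add_word_end_marker(raw)
except TypeError as err:
    error_message = str(err)

# Decoding before segmentation lets the same expression succeed.
fixed = add_word_end_marker(to_text(raw))
```

If this is indeed the cause, decoding sent_ before the call to bpe.segment in train_synpg.py (or installing the h5py version the repository was developed with) would be the places to look.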

Data preprocessing process

Hi,
We would like to test this model on a new dataset. Could you open-source your preprocessing code? More specifically, how do you generate the h5 files containing sentence and syntax information from files processed by StanfordCoreNLP?
Many thanks!

Data preprocessing

Hi,

Could you please provide the command you used to generate the parses used as input, as well as any other preprocessing steps required? I have some data that I would like to test your system on. How should I generate an input file in the correct format?

Thanks!

Errors for train_synpg.sh and train_parse_generator.sh

Following the training instructions and running train_synpg.sh with the downloaded data:

==== start training ====
Traceback (most recent call last):
  File "train_synpg.py", line 285, in <module>
    train(epoch, model, train_data, valid_data, train_loader, valid_loader, optimizer, criterion, dictionary, bpe, args)
  File "train_synpg.py", line 90, in train
    sent_ = bpe.segment(sent_).split()
  File "/home/wmk/synpg/subwordnmt/apply_bpe.py", line 59, in segment
    new_word = [out for segment in self._isolate_glossaries(word)
  File "/home/wmk/synpg/subwordnmt/apply_bpe.py", line 67, in <listcomp>
    self.glossaries)]
  File "/home/wmk/synpg/subwordnmt/apply_bpe.py", line 144, in encode
    word = tuple(orig[:-1]) + ( orig[-1] + '</w>',)
TypeError: unsupported operand type(s) for +: 'int' and 'str'

Following the training instructions and running train_parse_generator.sh:

==== loading data ====
number of train examples: 45377426
number of valid examples: 12800
==== start training ====
Traceback (most recent call last):
  File "train_parse_generator.py", line 282, in <module>
    train(epoch, model, train_data, valid_data, train_loader, valid_loader, optimizer, criterion, dictionary, args)
  File "train_parse_generator.py", line 81, in train
    synt_ = ParentedTree.fromstring(synt_)
  File "/opt/conda/envs/synpg/lib/python3.7/site-packages/nltk/tree.py", line 669, in fromstring
    for match in token_re.finditer(s):
TypeError: cannot use a string pattern on a bytes-like object
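
This second error is what Python's re module raises when a str pattern is applied to a bytes object, so a plausible (unconfirmed) reading is that synt_ reaches ParentedTree.fromstring as bytes. The sketch below reproduces the message without NLTK: token_re here is a simplified stand-in for NLTK's internal tokenizer pattern, and the sample parse string is made up for illustration.

```python
import re

# Simplified stand-in for the bracket tokenizer used inside nltk.tree;
# the real pattern in NLTK is more elaborate.
token_re = re.compile(r'\(|\)|[^\s()]+')


def tokenize(parse):
    """Split a bracketed parse string into brackets and labels."""
    return [match.group() for match in token_re.finditer(parse)]


raw = b'(ROOT (S (NP) (VP)))'  # assumed: parse read back from h5 as bytes

# A str pattern applied to bytes raises the TypeError from the issue.
error_message = None
try:
    tokenize(raw)
except TypeError as err:
    error_message = str(err)

# Decoding first gives the tokenizer the str it expects.
tokens = tokenize(raw.decode('utf-8'))
```

Under that assumption, passing synt_.decode('utf-8') to ParentedTree.fromstring when synt_ is bytes would be the corresponding change in train_parse_generator.py.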

Selecting top-1 output

The generation script gives four candidate outputs for each input. What is the "correct" way to select the preferred top-1 output? Is it simply the first generated output?

A bug in the generate.py

Hi, nice work! I found a bug in generate.py, at line 90:

    eos_pos = eos_pos[0]+1 if len(eos_pos) > 0 else len(idx)

The variable 'idx' is not defined.
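
For illustration, here is a minimal sketch of what that line appears to be doing: cut a generated sequence just after the first end-of-sentence token, or keep the whole sequence if there is none. The fallback length is taken from the sequence being truncated, which is presumably what 'idx' was meant to refer to. The function and parameter names (truncate_at_eos, token_ids, eos_id) are hypothetical and do not come from generate.py.

```python
def truncate_at_eos(token_ids, eos_id):
    """Keep tokens up to and including the first EOS, or all tokens if none."""
    eos_pos = [i for i, token in enumerate(token_ids) if token == eos_id]
    # Fall back to the length of the sequence itself when no EOS is present.
    end = eos_pos[0] + 1 if len(eos_pos) > 0 else len(token_ids)
    return token_ids[:end]


with_eos = truncate_at_eos([5, 6, 2, 7, 2], eos_id=2)
without_eos = truncate_at_eos([5, 6, 7], eos_id=2)
```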

License

Could you please add a license to the code? Thanks!

I'm in China, "data" always prompts download error

Hello, I am in China. When I click the "data" link to download the data, the download always fails once it reaches about 500 MB. Could you please share the original webpage that hosts the data so that I can download it from there, or provide a Baidu Netdisk link?
