synpg's Issues

TypeError: unsupported operand type(s) for +: 'int' and 'str'

Hi, when I run train_synpg.sh, I get an error:

Traceback (most recent call last):
  File "train_synpg.py", line 285, in <module>
    train(epoch, model, train_data, valid_data, train_loader, valid_loader, optimizer, criterion, dictionary, bpe, args)
  File "train_synpg.py", line 90, in train
    sent_ = bpe.segment(sent_).split()
  File "/home/dingp/synpg-master/subwordnmt/apply_bpe.py", line 59, in segment
    new_word = [out for segment in self._isolate_glossaries(word)
  File "/home/dingp/synpg-master/subwordnmt/apply_bpe.py", line 60, in <listcomp>
    for out in encode(segment,
  File "/home/dingp/synpg-master/subwordnmt/apply_bpe.py", line 144, in encode
    word = tuple(orig[:-1]) + ( orig[-1] + '</w>',)
TypeError: unsupported operand type(s) for +: 'int' and 'str'

It seems something is wrong with the BPE step. Could you help me? Thanks!
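
A possible cause, offered as a guess rather than a confirmed diagnosis: in Python 3, indexing a bytes object returns an int, so `orig[-1] + '</w>'` fails with exactly this message if the word reaching `encode` is bytes instead of str. That would happen if the sentences are read from the h5 files as bytes (an assumption; the loading code is not shown here). The sketch below reproduces the error and shows one way to decode first. `to_text` is a hypothetical helper, not part of the repository.

```python
def to_text(token):
    """Return token as str, decoding UTF-8 bytes if needed (hypothetical helper)."""
    return token.decode("utf-8") if isinstance(token, bytes) else token


# Indexing bytes yields an int in Python 3, which reproduces the reported error.
word = b"hello"
try:
    word[-1] + "</w>"
except TypeError as exc:
    message = str(exc)

# After decoding, the same expression concatenates two strings as intended.
fixed = to_text(word)
last_piece = fixed[-1] + "</w>"
```

If this is the cause, decoding `sent_` before the `bpe.segment(sent_)` call in train_synpg.py may be enough, but that is untested against the actual data.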

Data preprocessing process

Hi,
We would like to test this model on a new dataset. Could you open-source your preprocessing code? Specifically, how do you generate the h5 files containing sentence and syntax information from files processed by StanfordCoreNLP?
Many thanks!

Data preprocessing

Hi,

Could you please provide the command you used to generate the parses used as input, as well as any other required preprocessing steps? I have some data that I would like to test your system on. How should I generate an input file in the correct format?

Thanks!

Errors for train_synpg.sh and train_parse_generator.sh

Following the training instructions, running train_synpg.sh with the downloaded data gives:

==== start training ====
Traceback (most recent call last):
  File "train_synpg.py", line 285, in <module>
    train(epoch, model, train_data, valid_data, train_loader, valid_loader, optimizer, criterion, dictionary, bpe, args)
  File "train_synpg.py", line 90, in train
    sent_ = bpe.segment(sent_).split()
  File "/home/wmk/synpg/subwordnmt/apply_bpe.py", line 59, in segment
    new_word = [out for segment in self._isolate_glossaries(word)
  File "/home/wmk/synpg/subwordnmt/apply_bpe.py", line 67, in <listcomp>
    self.glossaries)]
  File "/home/wmk/synpg/subwordnmt/apply_bpe.py", line 144, in encode
    word = tuple(orig[:-1]) + ( orig[-1] + '</w>',)
TypeError: unsupported operand type(s) for +: 'int' and 'str'

Following the training instructions, running train_parse_generator.sh gives:

==== loading data ====
number of train examples: 45377426
number of valid examples: 12800
==== start training ====
Traceback (most recent call last):
  File "train_parse_generator.py", line 282, in <module>
    train(epoch, model, train_data, valid_data, train_loader, valid_loader, optimizer, criterion, dictionary, args)
  File "train_parse_generator.py", line 81, in train
    synt_ = ParentedTree.fromstring(synt_)
  File "/opt/conda/envs/synpg/lib/python3.7/site-packages/nltk/tree.py", line 669, in fromstring
    for match in token_re.finditer(s):
TypeError: cannot use a string pattern on a bytes-like object
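
This second error usually means a bytes value was handed to a regular expression compiled from a str pattern, which suggests `synt_` is bytes when it reaches `ParentedTree.fromstring`. The sketch below reproduces the message with the standard `re` module only. The pattern is a simplified stand-in, not NLTK's actual tokenizer pattern, and the sample parse is made up for illustration.

```python
import re

# A str pattern standing in for the tokenizer regex (simplified assumption).
token_re = re.compile(r"\(|\)|[^\s()]+")

# Hypothetical parse string as it might come back from storage: bytes, not str.
raw = b"(ROOT (S (NP) (VP)))"
try:
    list(token_re.finditer(raw))
except TypeError as exc:
    message = str(exc)

# Decoding to str first lets the same pattern run without error.
parse = raw.decode("utf-8") if isinstance(raw, bytes) else raw
tokens = [m.group() for m in token_re.finditer(parse)]
```

Under that assumption, decoding `synt_` to str before passing it to `ParentedTree.fromstring` in train_parse_generator.py would be the analogous change.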

Selecting top-1 output

The generation script gives 4 candidate outputs for each input. What is the "correct" way to select the preferred top-1 output? Is it simply the first generated output?

A bug in the generate.py

Hi, nice job! I found a bug in generate.py:

Line 90,

eos_pos = eos_pos[0]+1 if len(eos_pos) > 0 else len(idx)

The variable 'idx' is not defined.
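
One plausible reading, stated as an assumption since the surrounding code is not quoted: the fallback was meant to be the length of the sequence being scanned for the end-of-sentence token. A minimal standalone sketch of that logic follows. `truncate_at_eos`, `token_ids`, and `eos_id` are hypothetical names, not identifiers from generate.py.

```python
def truncate_at_eos(token_ids, eos_id):
    """Keep tokens up to and including the first EOS, or all of them if none."""
    eos_pos = [i for i, tok in enumerate(token_ids) if tok == eos_id]
    # Fall back to the full length of the sequence itself when no EOS is found.
    end = eos_pos[0] + 1 if len(eos_pos) > 0 else len(token_ids)
    return token_ids[:end]


with_eos = truncate_at_eos([5, 6, 2, 7], eos_id=2)
without_eos = truncate_at_eos([5, 6], eos_id=2)
```

In generate.py this would correspond to replacing `len(idx)` with the length of whichever sequence `eos_pos` was computed from.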

License

Could you please add a license to the code? Thanks!

I'm in China, "data" always prompts download error

Hello, I am in China. When I click the "data" link to download the data, the download always fails once it reaches about 500 MB. Could you please share the original page that hosts the data so that I can download it from there, or provide a Baidu Netdisk link?
