
prime's Issues

IWSLT'14 DE-EN Numbers

Hi,

I followed all the commands in https://github.com/lancopku/Prime/blob/master/examples/parallel_intersected_multi-scale_attention(Prime)/README.md#iwslt14-de-en and trained for 20000 steps. The best checkpoint scored 35.07 BLEU, and the average of the last 10 checkpoints scored 35.78. PPL was 4.7+. The repo says the BLEU score for the best checkpoint is around 35.7. Is there a mistake in my implementation, or do I have to tune lenpen and beam size to get the reported numbers? It would be helpful if you could clarify. Thanks!
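For readers unfamiliar with the "avg of the last 10 ckpts" number above, checkpoint averaging takes the element-wise mean of each parameter across several saved checkpoints. The following is only a minimal sketch of that idea: the function name `average_checkpoints` and the toy data are hypothetical, and parameters are plain lists of floats rather than the tensors a real checkpoint would hold.

```python
def average_checkpoints(checkpoints):
    """Element-wise mean of parameter dicts that share keys and shapes."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    averaged = {}
    for key in checkpoints[0]:
        # Group the i-th value of this parameter across all checkpoints.
        columns = zip(*(ckpt[key] for ckpt in checkpoints))
        averaged[key] = [sum(values) / len(checkpoints) for values in columns]
    return averaged


# Hypothetical toy checkpoints: each parameter is a flat list of floats.
last_two = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [1.0]},
]
avg = average_checkpoints(last_two)
```

The averaged weights are then evaluated like any single checkpoint, which is why the two BLEU numbers in the question are not directly comparable.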

TypeError: argument of type 'NoneType' is not iterable

Traceback (most recent call last):
  File "train.py", line 311, in <module>
    cli_main()
  File "train.py", line 306, in cli_main
    main(args)
  File "train.py", line 49, in main
    model = task.build_model(args)
  File "/home/xgzhu/MUSE/fairseq/tasks/fairseq_task.py", line 169, in build_model
    return models.build_model(args, self)
  File "/home/xgzhu/MUSE/fairseq/models/__init__.py", line 50, in build_model
    return ARCH_MODEL_REGISTRY[args.arch].build_model(args, task)
  File "/home/xgzhu/MUSE/fairseq/models/transformer.py", line 188, in build_model
    encoder = TransformerCombineEncoder(args, src_dict, encoder_embed_tokens)
  File "/home/xgzhu/MUSE/fairseq/models/combine_transformer.py", line 57, in __init__
    for i in range(args.encoder_layers)
  File "/home/xgzhu/MUSE/fairseq/models/combine_transformer.py", line 57, in <listcomp>
    for i in range(args.encoder_layers)
  File "/home/xgzhu/MUSE/fairseq/models/combine_transformer.py", line 157, in __init__
    dropout=args.attention_dropout, cur_attn_type='es'
  File "/home/xgzhu/MUSE/fairseq/modules/multihead_attention.py", line 93, in __init__
    num_heads=dynamic_num_heads, weight_dropout=0.1, )
  File "/home/xgzhu/MUSE/fairseq/modules/dynamic_convolution.py", line 73, in __init__
    self.weight_linear = Linear(self.query_size, num_heads * kernel_size * 1, bias=bias)
  File "/home/xgzhu/MUSE/fairseq/modules/linear.py", line 7, in Linear
    init_method = args.init_method if 'init_method' in args else 'xavier'
TypeError: argument of type 'NoneType' is not iterable
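The last frame suggests that `args` is `None` when `Linear` runs, since the membership test `'init_method' in args` fails on `None`. The sketch below reproduces that failure in isolation and shows one possible guard. It is an illustration, not the repository's fix: the helper names `unguarded` and `pick_init_method` and the value `"kaiming"` are hypothetical, and why `args` is unset in the reporter's run is not known from the traceback.

```python
from argparse import Namespace


def unguarded(args):
    # Same expression as the failing line in the traceback above.
    return args.init_method if 'init_method' in args else 'xavier'


def pick_init_method(args, default="xavier"):
    # `'init_method' in None` raises TypeError, so handle None explicitly.
    if args is None:
        return default
    # Fall back to the default when the attribute is absent.
    return getattr(args, "init_method", default)


try:
    unguarded(None)
    failed_like_the_report = False
except TypeError:
    failed_like_the_report = True

fallback = pick_init_method(None)
explicit = pick_init_method(Namespace(init_method="kaiming"))
```

A guard like this only hides the symptom. If `args` is meant to be populated before the model is built, the underlying cause is that this step was skipped.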

anyone running into 'nan'

I am adapting TransformerCombineEncoder for a sequence-to-sequence job, but I get 'nan' after some steps. Does anyone have experience with this?

muse code?

Nice work here, and I really love the results of this paper. I just wonder: is the MUSE code already in this repo?

Reproducing IWSLT14-de-en results

Hi there,
Thanks so much for the great work!
I'm currently trying to reproduce the IWSLT14-de-en (Prime model) results on a single P100 GPU. I followed the exact script at https://github.com/lancopku/Prime/blob/master/examples/parallel_intersected_multi-scale_attention(Prime)/README.md.
However, I'm unable to reproduce the results. Perplexity was 100+ after training finished, and the BLEU score is below 30.

Do you have any suggestions? What is the expected perplexity / curve?

Spelling in the paper appendix

In one of the example sentences in the appendix, the letters ä, ö, and ß are missing. Please correct the sentence

und deswegen haben wir uns entschlossen in berlin eine halle zu bauen,in der wir sozusagen die elektrischen verhltnisse der insel im mastabeins zu drei ganz genau abbilden knnen.

to:

und deswegen haben wir uns entschlossen in berlin eine halle zu bauen, in der wir sozusagen die elektrischen verhältnisse der insel im maßstab eins zu drei ganz genau abbilden können.

In LaTeX, these letters can be encoded using {\ss}, {\"o}, and {\"a}.
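If the sentence is pasted from a UTF-8 source, the escapes can be applied mechanically. This is a minimal sketch under that assumption: the mapping `LATEX_ESCAPES` and the function `to_latex` are hypothetical names, and only the lowercase letters ä, ö, ü, and ß are covered.

```python
# Standard LaTeX accent commands for the lowercase German special letters.
LATEX_ESCAPES = {
    "ä": r'{\"a}',
    "ö": r'{\"o}',
    "ü": r'{\"u}',
    "ß": r"{\ss}",
}


def to_latex(text):
    # Replace each special letter and leave every other character untouched.
    return "".join(LATEX_ESCAPES.get(ch, ch) for ch in text)


escaped = to_latex("verhältnisse maßstab können")
```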

Cheers!
