
improving_lexical_choice_in_nmt's Introduction


Tested with Python 2.7.3 and TensorFlow 1.1.
Talk: https://www3.nd.edu/~tnguye28/naacl18.pdf

General

This is the code for the paper Improving Lexical Choice in Neural Machine Translation (accepted at NAACL HLT 2018). The branches are:

  • master: baseline NMT
  • tied_embedding: baseline NMT with tied embedding
  • fixnorm: fixnorm model in paper
  • fixnorm_lex: fixnorm+lex model in paper
  • arthur: apply the method of Arthur et al. on top of tied_embedding NMT

To train a model:

  • write a configuration function in configurations.py (see the sketch after this list)
  • run: python -m nmt --proto your_config_func
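
The exact option names are defined in configurations.py, so the sketch below is only a hypothetical illustration of the pattern: a function returning a dict of hyperparameters, referenced by name through --proto (every key here is an illustrative guess, not necessarily one the code actually reads):

    # configurations.py -- hypothetical example config function.
    # All keys below are illustrative guesses; use whatever the rest
    # of the codebase actually expects.
    def my_en2de_config():
        config = {}
        config['model_name'] = 'my_en2de'  # outputs under nmt/saved_models/my_en2de
        config['src_lang']   = 'en'
        config['trg_lang']   = 'de'
        config['embed_dim']  = 512
        config['hid_dim']    = 512
        config['batch_size'] = 64
        return config

Training would then be started with: python -m nmt --proto my_en2de_config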

Depending on your config function, the code creates a directory under nmt/saved_models/your_model_name and saves everything there: dev validation outputs, dev perplexities, train perplexities, the best model checkpoint, and the latest checkpoint (I've tested saving 1 best checkpoint, not sure about > 1). You should use the best checkpoint to translate any other input.

To translate with UNK replacement:

  • run: python -m nmt --proto your_config_func --mode translate --unk-repl --model-file path_to_your_saved_checkpoint.ckpt --input-file path_to_input_file

Remember that a checkpoint actually consists of several files (a .data file, a .meta file, an .index file); just point --model-file at the .ckpt prefix and ignore those extensions.
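
For example, in TensorFlow 1.x the restore call takes exactly that prefix (a minimal sketch, assuming the model graph has already been built; the path here is made up):

    import tensorflow as tf

    # On disk a checkpoint is several files, e.g.
    #   model.ckpt.data-00000-of-00001, model.ckpt.index, model.ckpt.meta
    # but tf.train.Saver.restore() takes only the common .ckpt prefix.
    saver = tf.train.Saver()
    with tf.Session() as sess:
        saver.restore(sess, 'nmt/saved_models/your_model_name/model.ckpt')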


improving_lexical_choice_in_nmt's Issues

Activation functions other than tanh?

Just wondering, have you tried activation functions other than tanh? Do you think it is worth exploring other activation functions? Is there any motivation for choosing tanh?
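
For anyone who wants to experiment, the swap itself is a one-line change in TensorFlow 1.x (a minimal sketch; the shapes and variable names are placeholders, not the repo's identifiers):

    import tensorflow as tf

    f = tf.placeholder(tf.float32, [None, 512])  # lexical context f_t^l
    W = tf.get_variable('W_lex', [512, 512])

    h_tanh = tf.tanh(tf.matmul(f, W)) + f        # tanh, as in the paper
    h_relu = tf.nn.relu(tf.matmul(f, W)) + f     # a hypothetical alternative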

What if src_embedding_dim and tgt_embedding_dim are different?

@tnq177 Just wondering: what happens if src_embedding_dim and tgt_embedding_dim are different? In the fixnorm+lex implementation, at line 153 of model.py:

lexicons = lex_hider.transform(lex_inputs) + lex_inputs

And if I understand correctly, this should correspond to the following equation in your paper:

    h_t^l = tanh(W * f_t^l) + f_t^l

Here the output of lex_hider (i.e., tanh(W * f_t^l)) should have dimension tgt_embedding_dim, but lex_inputs (i.e., f_t^l) is still of dimension src_embedding_dim. How can the two be added when their dimensions differ?
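
A minimal sketch of the shape problem in TensorFlow 1.x (dimensions are arbitrary; the extra projection P is a hypothetical fix, not something taken from the repo):

    import tensorflow as tf

    src_dim, tgt_dim = 500, 512                      # unequal embedding sizes
    f = tf.placeholder(tf.float32, [None, src_dim])  # f_t^l (source side)
    W = tf.get_variable('W', [src_dim, tgt_dim])

    # tf.tanh(tf.matmul(f, W)) has shape [None, tgt_dim] while f has
    # [None, src_dim], so tanh(W * f_t^l) + f_t^l only type-checks when
    # src_dim == tgt_dim. A hypothetical fix: project f_t^l first.
    P = tf.get_variable('P', [src_dim, tgt_dim])
    h = tf.tanh(tf.matmul(f, W)) + tf.matmul(f, P)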
