NJUNMT-pytorch

License: MIT

NJUNMT-pytorch is an open-source toolkit for neural machine translation. It is research-oriented and includes several common baseline models:

  • DL4MT-tutorial: An RNN-based NMT model widely used as a baseline. To our knowledge, this is the only PyTorch implementation that exactly matches the original model. (nmtpytorch is another PyTorch implementation, but with minor structural differences.)

  • Attention Is All You Need: A strong NMT model introduced by Google that relies only on the attention mechanism. Our implementation differs from the official tensor2tensor.

Requirements

  • python 3.5+
  • pytorch 0.4.0
  • tqdm
  • tensorboardX

Usage

Data Preprocessing

See the help output of ./data/build_dictionary.py.

The vocabulary is stored in JSON format.

We highly recommend not limiting the number of words at this stage. Instead, control the vocabulary size through the config files during training.
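As a rough illustration of this recommendation, the sketch below builds a full frequency-sorted vocabulary, stores it as JSON, and applies a size limit only when the vocabulary is loaded. The JSON layout (each word mapped to `[id, frequency]`) and the names `build_vocab`, `load_vocab`, and `max_n_words` are assumptions made for this example. They may not match what ./data/build_dictionary.py actually writes or what the config files expect.

```python
import json
from collections import Counter


def build_vocab(sentences):
    """Count every token and assign ids by descending frequency (no size limit)."""
    counts = Counter(tok for sent in sentences for tok in sent.split())
    # Sort by frequency, then alphabetically, so that the ids are deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {word: [idx, freq] for idx, (word, freq) in enumerate(ordered)}


def load_vocab(vocab_json, max_n_words=-1):
    """Load a vocabulary and keep only the max_n_words most frequent entries.

    A negative max_n_words (hypothetical convention) keeps everything.
    """
    vocab = json.loads(vocab_json)
    if max_n_words < 0:
        return vocab
    return {word: entry for word, entry in vocab.items() if entry[0] < max_n_words}


corpus = ["the cat sat", "the dog sat", "the bird flew"]
full_vocab_json = json.dumps(build_vocab(corpus))

# The stored file keeps every word; the limit is chosen per experiment.
small_vocab = load_vocab(full_vocab_json, max_n_words=2)
print(sorted(small_vocab))
```

Because the stored file is never truncated, the same dictionary can be reused for experiments with different vocabulary sizes.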

Configuration

See the examples in the ./configs folder. You can reproduce our Chinese-to-English baseline by using those configuration files directly.

dl4mt_config.yaml is the configuration file for the DL4MT model, using loss scheduling by default.

transformer_base_config.yaml is the configuration file for the Transformer model, using noam scheduling by default.

For more details on how to configure the learning rate scheduler, see the examples in ./configs/lr_schedule_examples.
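For readers unfamiliar with noam scheduling, here is a minimal sketch of the schedule as described in "Attention Is All You Need": the learning rate grows linearly during warm-up and then decays with the inverse square root of the step. The parameter names `d_model`, `warmup_steps`, and `scale` are assumptions for this example; the keys and defaults used in the toolkit's config files may differ.

```python
def noam_lr(step, d_model=512, warmup_steps=4000, scale=1.0):
    """Learning rate at a given 1-based update step under the Noam schedule."""
    step = max(step, 1)  # guard against step 0, where step ** -0.5 is undefined
    return scale * d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)


# The rate rises until warmup_steps and falls afterwards.
for step in (1, 1000, 4000, 8000, 100000):
    print(step, noam_lr(step))
```

In PyTorch, a function like this could be wrapped in `torch.optim.lr_scheduler.LambdaLR`, although whether the toolkit does so is not stated here.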

Training

See training script ./scripts/train.sh

Translation

See translation script ./scripts/translation.sh

Benchmark

See BENCHMARK.md

Q&A

  1. What is shard_size?

shard_size is a trick borrowed from OpenNMT-py that allows a large model to run under limited memory. For example, you can run the WMT17 EN2DE task on an 8GB GTX 1080 card with batch size 64 by setting shard_size=10.

WARNING: shard is currently not supported in pytorch 0.4.0!
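The general idea behind sharding can be sketched in plain Python: instead of projecting every decoder state to vocabulary-sized logits at once, the sequence is split into shards and the loss is accumulated shard by shard. This is only a conceptual sketch under assumptions. The helper names (`project`, `sharded_loss`) are hypothetical, the interpretation of `shard_size` as "time steps per shard" is an assumption, and the code does not reproduce the toolkit's or OpenNMT-py's actual implementation.

```python
import math


def cross_entropy(logits, target):
    """Negative log-likelihood of `target` under softmax(logits)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def project(hidden, weight):
    """Map one hidden vector to vocabulary-sized logits (the memory-hungry step)."""
    return [sum(h * w for h, w in zip(hidden, row)) for row in weight]


def sharded_loss(hiddens, targets, weight, shard_size):
    """Sum the loss over time steps, projecting only `shard_size` steps at a time."""
    total = 0.0
    for start in range(0, len(hiddens), shard_size):
        # Only this slice of logits exists at once; it can be freed before the next shard.
        logits = [project(h, weight) for h in hiddens[start:start + shard_size]]
        shard_targets = targets[start:start + shard_size]
        total += sum(cross_entropy(row, t) for row, t in zip(logits, shard_targets))
    return total


hiddens = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5], [-0.2, 0.4], [0.6, 0.1]]
targets = [0, 2, 1, 2, 0]
weight = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # toy vocabulary of 3 words

full = sharded_loss(hiddens, targets, weight, shard_size=len(hiddens))
sharded = sharded_loss(hiddens, targets, weight, shard_size=2)
print(full, sharded)  # the two totals agree up to floating-point rounding
```

The total loss is unchanged; only the peak size of the logits buffer shrinks. In a real PyTorch implementation the backward pass would also need to run per shard so that the full logits are never kept alive; that part is omitted here.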

  2. What is use_bucket?

When bucketing is enabled, parallel sentences are partially sorted by the length of the target sentence.

Setting this option to true yields a considerable improvement in training speed, at the cost of a regression in translation performance.
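A common motivation for length bucketing is that batches of similar-length sentences need less padding. The sketch below illustrates partial sorting: the data is cut into buckets of a few batches each, and only the sentences inside a bucket are sorted by target length. The names `make_batches`, `bucket_factor`, and `padding_tokens` are hypothetical, and the toolkit's own bucketing logic may work differently.

```python
import random


def make_batches(pairs, batch_size, use_bucket=False, bucket_factor=4):
    """Split (source, target) pairs into batches, optionally sorting within buckets."""
    pairs = list(pairs)
    if not use_bucket:
        return [pairs[i:i + batch_size] for i in range(0, len(pairs), batch_size)]
    bucket_size = batch_size * bucket_factor
    batches = []
    for start in range(0, len(pairs), bucket_size):
        # Sort only inside this bucket, so the order across buckets is preserved.
        bucket = sorted(pairs[start:start + bucket_size], key=lambda p: len(p[1]))
        batches.extend(bucket[i:i + batch_size] for i in range(0, len(bucket), batch_size))
    return batches


def padding_tokens(batches):
    """Count the pad positions needed to bring every target up to its batch maximum."""
    total = 0
    for batch in batches:
        longest = max(len(tgt) for _, tgt in batch)
        total += sum(longest - len(tgt) for _, tgt in batch)
    return total


rng = random.Random(0)
# Toy corpus: 64 pairs whose target lengths vary between 1 and 50 tokens.
pairs = [(["src"], ["tok"] * rng.randint(1, 50)) for _ in range(64)]

plain = make_batches(pairs, batch_size=8)
bucketed = make_batches(pairs, batch_size=8, use_bucket=True)
print(padding_tokens(plain), padding_tokens(bucketed))
```

Sorting only within buckets keeps some randomness in batch composition, while batches still contain sentences of similar length.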

Acknowledgement

  • This code borrows heavily from OpenNMT/OpenNMT-py and has been simplified for research use.

Contributors

  • whr94621
  • zhengzx-nlp
