
ECNMT: Emergent Communication Pretraining for Few-Shot Machine Translation

This repository is the official PyTorch implementation of the following paper:

Yaoyiran Li, Edoardo Maria Ponti, Ivan Vulić, and Anna Korhonen. 2020. Emergent Communication Pretraining for Few-Shot Machine Translation. In Proceedings of the 28th International Conference on Computational Linguistics (COLING 2020). LINK

This method is a form of unsupervised knowledge transfer in the absence of linguistic data, where a model is first pre-trained on artificial languages emerging from referential games and then fine-tuned on few-shot downstream tasks like neural machine translation.
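To make the pretraining setup concrete, below is a minimal, self-contained referential-game sketch in PyTorch. It is an illustration under stated assumptions, not the repository's implementation: the module names, dimensions, and the Gumbel-softmax message channel are all choices made for this example. A speaker encodes a target image feature into a discrete message, and a listener must pick the target out of a set of distractors; in the paper, weights pretrained this way are used to initialize the NMT model before few-shot fine-tuning.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, MSG_LEN, IMG_DIM, HID = 64, 5, 2048, 256   # all sizes are assumptions

class Speaker(nn.Module):
    """Encodes a target image feature into a discrete message."""
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(IMG_DIM, HID)
        self.rnn = nn.GRUCell(VOCAB, HID)
        self.out = nn.Linear(HID, VOCAB)

    def forward(self, img):                        # img: (B, IMG_DIM)
        h = torch.tanh(self.proj(img))             # init hidden state from the image
        tok = torch.zeros(img.size(0), VOCAB, device=img.device)
        msg = []
        for _ in range(MSG_LEN):
            h = self.rnn(tok, h)
            # Straight-through Gumbel-softmax keeps the message discrete
            # while letting gradients flow back to the speaker.
            tok = F.gumbel_softmax(self.out(h), tau=1.0, hard=True)
            msg.append(tok)
        return torch.stack(msg, dim=1)             # (B, MSG_LEN, VOCAB)

class Listener(nn.Module):
    """Scores candidate images against the received message."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(VOCAB, HID, batch_first=True)
        self.img_proj = nn.Linear(IMG_DIM, HID)

    def forward(self, msg, candidates):            # candidates: (B, K, IMG_DIM)
        _, h = self.rnn(msg)                       # h: (1, B, HID)
        cand = self.img_proj(candidates)           # (B, K, HID)
        return torch.bmm(cand, h[-1].unsqueeze(2)).squeeze(2)  # (B, K) scores

speaker, listener = Speaker(), Listener()
opt = torch.optim.Adam(list(speaker.parameters()) + list(listener.parameters()), lr=1e-4)

B, K = 32, 16                                      # batch size, candidate-set size
imgs = torch.randn(B, K, IMG_DIM)                  # stand-in for real COCO features
target = torch.randint(0, K, (B,))
msg = speaker(imgs[torch.arange(B), target])       # speak about the target image
loss = F.cross_entropy(listener(msg, imgs), target)
opt.zero_grad()
loss.backward()
opt.step()
```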

(Figure: overview of emergent communication pretraining and machine translation fine-tuning.)

Dependencies

  • PyTorch 1.3.1
  • Python 3.6
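For reference, one possible way to reproduce this environment (conda is an assumption here; the repository only pins the two versions above):

```sh
conda create -n ecnmt python=3.6
conda activate ecnmt
pip install torch==1.3.1
```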

Data

COCO image features are available in the sub-folder half_feats here. Preprocessed EN-DE (DE-EN) data for translation are available in the sub-folder task1 here. Both are obtained from Translagent.

Data for translation in the other language pairs (EN-CS, EN-RO, EN-FR) can be found via the links below.

| Dictionaries  | Train Sentence Pairs | Reference Translations |
|---------------|----------------------|------------------------|
| EN-CS & CS-EN | EN-CS & CS-EN        | EN-CS & CS-EN          |
| EN-RO & RO-EN | EN-RO & RO-EN        | EN-RO & RO-EN          |
| EN-FR & FR-EN | EN-FR & FR-EN        | EN-FR & FR-EN          |

Pretrained Models for Emergent Communication

| Source / Target | Target / Source |
|-----------------|-----------------|
| EN              | DE              |
| EN              | CS              |
| EN              | RO              |
| EN              | FR              |
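A downloaded checkpoint can be sanity-checked before fine-tuning with standard PyTorch serialization; the file name below is a placeholder, not a path fixed by this repository:

```python
import torch

# Placeholder file name; substitute the path of the checkpoint you downloaded.
ckpt = torch.load("ec_pretrained_en_de.pt", map_location="cpu")
print(sorted(ckpt.keys()) if isinstance(ckpt, dict) else type(ckpt))
```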

Experiments

Step 1: run EC pretraining (or skip to Step 2 and use a pretrained model).

cd ./ECPRETRAIN
sh run_training.sh

Step 2: run NMT fine-tuning (before running, edit the script to point to your training data, the pretrained model, and the directory where checkpoints should be saved; see the illustrative sketch below).

cd ./NMT
sh run_training.sh
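The variable names inside NMT/run_training.sh are specific to that script; the snippet below only illustrates the kind of paths that need to be set and does not reproduce the script's actual contents:

```sh
# Illustrative placeholders only; the real variable names in
# NMT/run_training.sh may differ.
DATA_ROOT=/path/to/task1                    # preprocessed translation data
PRETRAINED_MODEL=/path/to/ec_checkpoint.pt  # output of Step 1 or a downloaded model
SAVE_DIR=/path/to/saved_models              # where fine-tuned checkpoints are written
```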

Optional: run the NMT baseline (trained from scratch, without EC pretraining).

cd ./BASELINENMT
sh run_training.sh

Citation

@inproceedings{YL:2020,
  author    = {Yaoyiran Li and Edoardo Maria Ponti and Ivan Vulić and Anna Korhonen},
  title     = {Emergent Communication Pretraining for Few-Shot Machine Translation},
  year      = {2020},
  booktitle = {Proceedings of the 28th International Conference on Computational Linguistics},
}

Acknowledgements

Part of the code is based on Translagent.

The datasets for our experiments include MS COCO for emergent communication pretraining, and Multi30k Task 1 and Europarl for NMT fine-tuning. Text preprocessing is based on Moses and Subword-NMT.

Please cite these resources accordingly.

