csa-nct's Introduction

CSA-NCT

Code for EMNLP21 main conference paper: Towards Making the Most of Dialogue Characteristics for Neural Chat Translation

Training (Taking En->De as an example)

Our code is based on the publicly available toolkit THUMT-Tensorflow (we use Python 3.6). The following steps train our model and then test its performance in terms of BLEU, TER, and sentence similarity.

Data Preprocessing

Please refer to the "data_preprocess_code" file. The data we used in this paper are BConTrasT and BMELD.

Two-stage Training

  • The first stage
1) bash train_ende_base_stage1.sh # Suppose the generated checkpoint file is located in path1
  • The second stage (i.e., fine-tuning on the chat translation data)
2) bash train_ende_base_stage2.sh # Set training_step=1 here. Suppose the generated checkpoint file is located in path2.
3) python thumt_stage1_code/thumt/scripts/combine_add.py --model path2 --part path1 --output path3  # Copy the weights of the first stage into the second-stage model.
4) bash train_ende_base_stage2.sh # Set --output=path3 and training_step=first_stage_step + 5,000 here. Suppose the generated checkpoint file is located in path4.
  • Test with multi-bleu.perl
5) bash test_ende_stage2.sh # Set the checkpoint file path to path4 in this script. Suppose the predicted file is located in path5 at checkpoint step xxxxx.
  • Test with SacreBLEU and TER. Required: TER v0.7.25; SacreBLEU version 1.4.13 (BLEU+case.mixed+numrefs.1+smooth.exp+tok.13a+version.1.4.13)
6) python SacreBLEU_TER_Coherence_Evaluation_code/cal_bleu_ter4ende.py # Set the golden file and the predicted file correctly in this file and in sacrebleu_ende.py, respectively.
  • Coherence evaluation by sentence similarity. Required: gensim; MosesTokenizer
7) python SacreBLEU_TER_Coherence_Evaluation_code/train_word2vec.py # First download the corpus in [2], then train the word2vec model.
8) python SacreBLEU_TER_Coherence_Evaluation_code/eval_coherence.py # Put the file containing the three preceding utterances and the predicted file in the corresponding locations, then run the script.
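
To make the coherence step more concrete, here is a minimal sketch of a sentence-similarity score between the preceding utterances and a predicted translation. It is not the code in eval_coherence.py. It assumes that a sentence is represented by the mean of its word vectors and that coherence is the cosine similarity between the context vector and the prediction vector. The function names, the toy embedding table, and the whitespace tokenization (standing in for MosesTokenizer and a trained word2vec model) are all hypothetical.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def mean_vector(tokens: Sequence[str], embeddings: Dict[str, Vector], dim: int) -> Vector:
    """Average the vectors of in-vocabulary tokens; zero vector if none are known."""
    total = [0.0] * dim
    count = 0
    for tok in tokens:
        vec = embeddings.get(tok)
        if vec is None:
            continue  # skip out-of-vocabulary tokens
        for i, value in enumerate(vec):
            total[i] += value
        count += 1
    if count == 0:
        return total
    return [value / count for value in total]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; defined as 0.0 when either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def coherence_score(context_utterances: Sequence[str], prediction: str,
                    embeddings: Dict[str, Vector], dim: int) -> float:
    """Similarity between the pooled preceding utterances and one prediction."""
    # Assumption: lowercased whitespace tokenization instead of MosesTokenizer.
    context_tokens = [tok for utt in context_utterances for tok in utt.lower().split()]
    prediction_tokens = prediction.lower().split()
    return cosine(mean_vector(context_tokens, embeddings, dim),
                  mean_vector(prediction_tokens, embeddings, dim))


# Hypothetical 2-dimensional embeddings, for illustration only.
TOY_EMBEDDINGS: Dict[str, Vector] = {
    "hotel": [1.0, 0.0],
    "room": [1.0, 0.0],
    "weather": [0.0, 1.0],
}

if __name__ == "__main__":
    context = ["The hotel", "A room", "hotel room"]
    print(coherence_score(context, "room", TOY_EMBEDDINGS, dim=2))
    print(coherence_score(context, "weather", TOY_EMBEDDINGS, dim=2))
```

In a real run, the embedding table would come from the word2vec model trained in step 7, and the score would typically be averaged over all test utterances.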

Citation

If you find this project helpful, please cite our paper :)

@inproceedings{liang-etal-2021-towards,
    title = "Towards Making the Most of Dialogue Characteristics for Neural Chat Translation",
    author = "Liang, Yunlong  and
      Zhou, Chulun  and
      Meng, Fandong  and
      Xu, Jinan  and
      Chen, Yufeng  and
      Su, Jinsong  and
      Zhou, Jie",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2021",
    address = "Online and Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.emnlp-main.6",
    pages = "67--79",
}

csa-nct's People

Contributors

xl2248

csa-nct's Issues

Timeline for code release?

Thank you for the nice work, I enjoyed the corresponding EMNLP paper! I was wondering what the timeline would be for releasing the code? We'd be potentially interested in reproducing the results.