AMG

Attend, Memorize and Generate: Towards Faithful Table-to-Text Generation in Few Shots

This repo provides the source code and data of our paper "Attend, Memorize and Generate: Towards Faithful Table-to-Text Generation in Few Shots" (Findings of EMNLP 2021): https://aclanthology.org/2021.findings-emnlp.347

@inproceedings{zhao-etal-2021-attend-memorize,
    title = "Attend, Memorize and Generate: Towards Faithful Table-to-Text Generation in Few Shots",
    author = "Zhao, Wenting  and
      Liu, Ye  and
      Wan, Yao  and
      Yu, Philip",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
    month = nov,
    year = "2021",
    address = "Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.findings-emnlp.347",
    doi = "10.18653/v1/2021.findings-emnlp.347",
    pages = "4106--4117"
}

Feel free to reach out if you have any questions!

Description

  1. In the AMG paper, we first perform task-adaptive training of the AMG model with a masked-language-model objective to learn better model weights, and then fine-tune the resulting checkpoint on the wiki-humans/books/songs data.

  2. This repository provides the checkpoint after task-adaptive training, as well as the data and code for the second, fine-tuning step.
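The task-adaptive step above uses a standard masked-language-model objective. As a rough illustration only (not the authors' implementation; the 15% rate and 80/10/10 split are the usual BERT-style defaults), a minimal token-masking routine in plain Python:

```python
import random

MASK = "[MASK]"
# Small stand-in vocabulary for random replacements (illustrative only).
VOCAB = ["points", "rebounds", "assists", "born", "career"]

def mask_tokens(tokens, mask_prob=0.15, seed=1):
    """BERT-style masking: each selected position is replaced by [MASK] 80%
    of the time, by a random vocabulary token 10%, or kept unchanged 10%.
    Returns the corrupted sequence and (position, original token) targets."""
    rng = random.Random(seed)
    corrupted, targets = list(tokens), []
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            targets.append((i, tok))
            roll = rng.random()
            if roll < 0.8:
                corrupted[i] = MASK
            elif roll < 0.9:
                corrupted[i] = rng.choice(VOCAB)
            # else: keep the original token (the model must still predict it)
    return corrupted, targets

corrupted, targets = mask_tokens("george mikan scored 28 points per game".split())
```

The model is then trained to recover the original tokens at the target positions, which adapts its weights to the table-to-text domain before fine-tuning.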

Installation

Install PyTorch:

  conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.1 -c pytorch

All the conda environment packages (including versions) are listed in requirements_version.txt.

Data

The data comes from Few-shot NLG and covers three domains: Humans, Songs, and Books.

The original data, processed data, and checkpoints (CKPT) can be downloaded from the AMG-DATA-MODEL-CKPT Google Drive folder.

Data Organization:

AMG-DATA-MODEL-CKPT
├── 1-wiki-human-folder
│   ├── 1_original_data_500 (original data from the wiki dataset in Few-shot NLG)
│   ├── 3_few_shot_plotmachine (data formatted for fine-tuning)
│   ├── 4_pre_entity_embed_folder (entity embeddings for fine-tuning and inference)
│   └── CKPT (checkpoint after task-adaptive training)
├── 2-wiki-song-folder
│   ├── 1_original_data_500
│   ├── 3_few_shot_plotmachine
│   ├── 4_pre_entity_embed_folder
│   └── CKPT
└── 3-wiki-books-folder
    ├── 1_original_data_500
    ├── 3_few_shot_plotmachine
    ├── 4_pre_entity_embed_folder
    └── CKPT

Taking the wiki-human folder as an example: 1_original_data_500 contains the original data from the wiki dataset in Few-shot NLG, 3_few_shot_plotmachine holds the data formatted for fine-tuning, and 4_pre_entity_embed_folder provides the entity embeddings used for fine-tuning and inference.

Unzip AMG-DATA-MODEL-CKPT and place the three folders (1-wiki-human-folder, 2-wiki-song-folder, and 3-wiki-books-folder) under /data0.
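After unzipping, a quick sanity check can confirm the layout matches what the scripts expect. This is a sketch: the directory names come from the tree above, and `DATA_ROOT` is a hypothetical override for machines where /data0 is not writable.

```shell
# Verify the expected data layout (paths from the README tree above).
DATA_ROOT=${DATA_ROOT:-/data0}
for domain in 1-wiki-human-folder 2-wiki-song-folder 3-wiki-books-folder; do
  for sub in 1_original_data_500 3_few_shot_plotmachine 4_pre_entity_embed_folder CKPT; do
    [ -d "$DATA_ROOT/$domain/$sub" ] || echo "missing: $DATA_ROOT/$domain/$sub"
  done
done
```

If the loop prints nothing, all twelve expected subfolders are in place.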

Training

python ./biunilm/run_all_domains.py

Inference

python ./biunilm/decode_all_domains.py


amg's Issues

Missing fine-tune CKPT

Hello Wenting,

I ran into some questions while trying to make this code work; could you please offer some help?

It seems that the AMG-DATA-MODEL-CKPT you've uploaded contains no checkpoint for model inference.
According to your script decode_all_domains.py, a fine-tuned checkpoint is required to build the decoder,
as follows:

"""
model_check_point_dir = '/data0/temp_folder_for_Table-SpanMem_fine-tune_0515/'+
domain+'/' + 'example_'+ str(num) +'_saved_model
'"""

To create the Preprocess4Seq2seqMemDecoder, I used the checkpoint from
1-wiki-human-folder/CKPT/stage-2-unilm-memspan-mlm-humans-0319/model.7.bin

Is this the right way to create the decoder?

By doing this, I got a weird result. The decoded sequence for the first HUMAN example is:
'started ( 10 ) 3 ) 2 ( 0 ) 0 ( 0 ) 0 ( 0 ) 0 ( 0 ) 0 ( 0 ) 0 1 ; stats is [ENTITY_CLS] points rebounds assists ( 5 ) [ENTITY_SEP] ; statsbel is [ENTITY_CLS] points rebounds started ( 5 ) ( 0 ) 0 ( 0 ) 0 ('

Could you help me find out what went wrong, or share the right way to run model inference?

Thanks a lot.
