GHR GitHub

Capturing Conversational Interaction for Question Answering via Global History Reasoning

NAACL Findings 2022

GHR Overview

We present GHR (Global History Reasoning), a framework for conversational question answering (CQA). This repository lets you fine-tune ELECTRA with GHR as described in our paper.

Requirements

$ conda create -n GHR python=3.8.10
$ conda activate GHR
$ conda install tqdm
$ conda install pytorch==1.5.0 torchvision==0.6.0 cudatoolkit=10.1 -c pytorch
$ pip install transformers==3.5.0
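
Because the versions above are pinned, it can help to confirm that the active environment actually matches them before training. The following is a minimal sketch, not part of the GHR code base: the helper names `find_mismatches` and `installed_versions` are hypothetical, and it assumes the PyTorch package is registered under the distribution name `torch`.

```python
from importlib import metadata

# Pins copied from the install commands above.
# Assumption: pytorch is registered under the distribution name "torch".
EXPECTED = {"torch": "1.5.0", "torchvision": "0.6.0", "transformers": "3.5.0"}


def find_mismatches(expected, installed):
    """Return {package: (wanted, found)} for every pin that is not satisfied."""
    mismatches = {}
    for name, wanted in expected.items():
        found = installed.get(name)
        # Builds may append a local tag such as "+cu101"; ignore it when comparing.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


def installed_versions(names):
    """Look up installed versions; packages that are missing are left out."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass
    return versions


if __name__ == "__main__":
    problems = find_mismatches(EXPECTED, installed_versions(EXPECTED))
    for name, (wanted, found) in problems.items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
    if not problems:
        print("All pinned packages match.")
```

Run it inside the activated GHR environment; any line it prints names a package whose version differs from the pins above.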

Datasets

We use the QuAC dataset (Choi et al., 2018) for training and evaluation. Test results are obtained by submitting to the QuAC leaderboard.
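
Before training, a quick sanity check of the downloaded files can catch path or format problems early. The sketch below is an illustration only and makes two assumptions that this README does not state: that the JSON files follow the SQuAD-style layout (`data` → `paragraphs` → `qas` → `answers`), and that unanswerable questions carry the answer text `CANNOTANSWER`. The function name `summarize` is hypothetical.

```python
import json
import sys


def summarize(dataset):
    """Count dialogues, questions, and unanswerable questions in a QuAC-style dict."""
    dialogs = questions = unanswerable = 0
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            # Assumption: each paragraph holds one dialogue about its context.
            dialogs += 1
            for qa in paragraph["qas"]:
                questions += 1
                # Assumption: unanswerable questions use the text "CANNOTANSWER".
                if qa["answers"] and qa["answers"][0]["text"] == "CANNOTANSWER":
                    unanswerable += 1
    return {"dialogs": dialogs, "questions": questions, "unanswerable": unanswerable}


if __name__ == "__main__":
    # Usage: python inspect_quac.py ./datasets/train.json
    with open(sys.argv[1], encoding="utf-8") as handle:
        print(summarize(json.load(handle)))
```

If the counts are zero or the script raises a `KeyError`, the file most likely does not have the layout assumed here.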

Train

The following example fine-tunes ELECTRA on the QuAC dataset using GHR. We ran all experiments on a single 16 GB Tesla V100 GPU.

INPUT_DIR=./datasets/
OUTPUT_DIR=./tmp/model

CUDA_VISIBLE_DEVICES=0 python3 run_quac.py \
	--model_type electra \
	--model_name_or_path electra-large \
	--do_train \
	--do_eval \
	--data_dir ${INPUT_DIR} \
	--train_file train.json \
	--predict_file dev.json \
	--output_dir ${OUTPUT_DIR} \
	--per_gpu_train_batch_size 12 \
	--num_train_epochs 2 \
	--learning_rate 2e-5 \
	--weight_decay 0.01 \
	--threads 20 \
	--do_lower_case \
	--fp16 --fp16_opt_level "O2" \
	--evaluate_during_training \
	--max_answer_length 50 --cache_prefix electra-large

By default, we use mixed precision via Apex (the --fp16 flag) to accelerate training and prediction.

Evaluation

The following example evaluates our trained model on the QuAC development set.

INPUT_DIR=./datasets/
MODEL_DIR=./tmp/model/
OUTPUT_DIR=./tmp/

CUDA_VISIBLE_DEVICES=0 python3 run_quac.py \
	--model_type electra \
	--model_name_or_path ${MODEL_DIR} \
	--do_eval \
	--data_dir ${INPUT_DIR} \
	--train_file train.json \
	--predict_file dev.json \
	--output_dir ${OUTPUT_DIR} \
	--per_gpu_train_batch_size 12 \
	--num_train_epochs 2 \
	--learning_rate 2e-5 \
	--weight_decay 0.01 \
	--threads 20 \
	--do_lower_case \
	--fp16 --fp16_opt_level "O2" \
	--evaluate_during_training \
	--max_answer_length 50 --cache_prefix electra-large

Result

Evaluating models trained with predefined hyperparameters yields the following results:

DEV Results: {'F1': 74.9}  TEST Results: {'F1': 73.7}
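
For readers unfamiliar with the F1 metric reported above, the sketch below shows a simplified word-overlap F1 between one predicted answer and one reference answer, in the style commonly used for extractive QA. It is an illustration, not the scorer that produced these numbers: the official QuAC evaluation also handles multiple references per question, which this sketch omits. The function names `normalize` and `token_f1` are hypothetical.

```python
import re
import string
from collections import Counter

PUNCTUATION = set(string.punctuation)


def normalize(text):
    """Lowercase, strip punctuation and English articles, and split into tokens."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def token_f1(prediction, reference):
    """Harmonic mean of token precision and recall for a single answer pair."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most as often as it
    # appears in both answers.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(token_f1("the cat sat", "cat sat down"))
```

A dataset-level score under this simplification would be the mean of `token_f1` over all questions, scaled to a percentage.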

Citation

@inproceedings{qian2022capturing,
  title={Capturing Conversational Interaction for Question Answering via Global History Reasoning},
  author={Qian, Jin and Zou, Bowei and Dong, Mengxing and Li, Xiao and Aw, Aiti and Hong, Yu},
  booktitle={Findings of the Association for Computational Linguistics: NAACL 2022},
  pages={2071--2078},
  year={2022}
}
