mrc_tf's Introduction

Machine Reading Comprehension

Machine reading comprehension (MRC), a task that asks a machine to read a given context and then answer questions based on its understanding, is considered one of the key problems in artificial intelligence and has attracted significant interest from both academia and industry. Over the past few years, great progress has been made in this field, thanks to various end-to-end trained neural models and the release of high-quality datasets with large numbers of examples.

Figure 1: MRC example from SQuAD 2.0 dev set

Setting

  • Python 3.6.7
  • TensorFlow 1.13.1
  • NumPy 1.13.3
  • SentencePiece 0.1.82

Dataset

  • SQuAD is a reading comprehension dataset consisting of questions posed by crowd-workers on a set of Wikipedia articles. The answer to every question is a segment of text, or span, from the corresponding reading passage, or the question may be unanswerable.
  • CoQA is a large-scale dataset for building Conversational Question Answering systems. The goal of the CoQA challenge is to measure the ability of machines to understand a text passage and answer a series of interconnected questions that appear in a conversation. CoQA is pronounced "coca".
  • QuAC is a dataset for modeling, understanding, and participating in information-seeking dialog. QuAC introduces challenges not found in existing machine comprehension datasets: its questions are often more open-ended, unanswerable, or only meaningful within the dialog context.
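
To make the span-based format concrete, here is a minimal sketch that walks a record in the publicly documented SQuAD v2.0 JSON layout. The sample record and the helper name `iter_examples` are hypothetical and written only for illustration; this is not the preprocessing code used by `run_squad.py`.

```python
import json

# A tiny hand-written record in the SQuAD v2.0 JSON layout (illustrative, not real data).
SAMPLE = json.loads("""
{
  "version": "v2.0",
  "data": [{
    "title": "Example",
    "paragraphs": [{
      "context": "The quick brown fox jumps over the lazy dog.",
      "qas": [
        {"id": "q1", "question": "What does the fox jump over?",
         "is_impossible": false,
         "answers": [{"text": "the lazy dog", "answer_start": 31}]},
        {"id": "q2", "question": "Who owns the dog?",
         "is_impossible": true,
         "answers": []}
      ]
    }]
  }]
}
""")


def iter_examples(dataset):
    """Yield (id, question, context, answer_text, answer_start) tuples.

    Unanswerable questions yield answer_text=None and answer_start=-1.
    Only the first reference answer is used, for brevity.
    """
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                if qa.get("is_impossible", False) or not qa["answers"]:
                    yield qa["id"], qa["question"], context, None, -1
                else:
                    answer = qa["answers"][0]
                    yield (qa["id"], qa["question"], context,
                           answer["text"], answer["answer_start"])


examples = list(iter_examples(SAMPLE))
for ex_id, question, context, text, start in examples:
    if text is None:
        print(ex_id, "unanswerable")
    else:
        # The answer is a character span of the context.
        print(ex_id, repr(context[start:start + len(text)]))
```

The answerable case is a character offset plus a text span, while the unanswerable case carries an empty answer list. Handling that second case is the part SQuAD v2.0 adds over v1.1.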

Usage

  • Run SQuAD experiment
CUDA_VISIBLE_DEVICES=0,1,2,3 python run_squad.py \
    --spiece_model_file=model/cased_L-24_H-1024_A-16/spiece.model \
    --model_config_path=model/cased_L-24_H-1024_A-16/xlnet_config.json \
    --init_checkpoint=model/cased_L-24_H-1024_A-16/xlnet_model.ckpt \
    --task_name=v2.0 \
    --random_seed=100 \
    --predict_tag=xxxxx \
    --data_dir=data/squad/v2.0 \
    --output_dir=output/squad/v2.0/data \
    --model_dir=output/squad/v2.0/checkpoint \
    --export_dir=output/squad/v2.0/export \
    --max_seq_length=512 \
    --train_batch_size=12 \
    --predict_batch_size=12 \
    --num_hosts=1 \
    --num_core_per_host=4 \
    --learning_rate=3e-5 \
    --train_steps=8000 \
    --warmup_steps=1000 \
    --save_steps=1000 \
    --do_train=true \
    --do_predict=true \
    --do_export=true \
    --overwrite_data=false
  • Run CoQA experiment
CUDA_VISIBLE_DEVICES=0,1,2,3 python run_coqa.py \
    --spiece_model_file=model/cased_L-24_H-1024_A-16/spiece.model \
    --model_config_path=model/cased_L-24_H-1024_A-16/xlnet_config.json \
    --init_checkpoint=model/cased_L-24_H-1024_A-16/xlnet_model.ckpt \
    --task_name=v1.0 \
    --random_seed=100 \
    --predict_tag=xxxxx \
    --data_dir=data/coqa/v1.0 \
    --output_dir=output/coqa/v1.0/data \
    --model_dir=output/coqa/v1.0/checkpoint \
    --export_dir=output/coqa/v1.0/export \
    --max_seq_length=512 \
    --train_batch_size=12 \
    --predict_batch_size=12 \
    --num_hosts=1 \
    --num_core_per_host=4 \
    --learning_rate=3e-5 \
    --train_steps=8000 \
    --warmup_steps=1000 \
    --save_steps=1000 \
    --do_train=true \
    --do_predict=true \
    --do_export=true \
    --overwrite_data=false
  • Run QuAC experiment
CUDA_VISIBLE_DEVICES=0,1,2,3 python run_quac.py \
    --spiece_model_file=model/cased_L-24_H-1024_A-16/spiece.model \
    --model_config_path=model/cased_L-24_H-1024_A-16/xlnet_config.json \
    --init_checkpoint=model/cased_L-24_H-1024_A-16/xlnet_model.ckpt \
    --task_name=v0.2 \
    --random_seed=100 \
    --predict_tag=xxxxx \
    --data_dir=data/quac/v0.2 \
    --output_dir=output/quac/v0.2/data \
    --model_dir=output/quac/v0.2/checkpoint \
    --export_dir=output/quac/v0.2/export \
    --max_seq_length=512 \
    --train_batch_size=12 \
    --predict_batch_size=12 \
    --num_hosts=1 \
    --num_core_per_host=4 \
    --learning_rate=3e-5 \
    --train_steps=8000 \
    --warmup_steps=1000 \
    --save_steps=1000 \
    --do_train=true \
    --do_predict=true \
    --do_export=true \
    --overwrite_data=false
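
The commands above pass `--train_batch_size=12` with `--num_core_per_host=4`, while the experiment tables report a batch size of 48. One reading, which is an assumption rather than something the source states, is that the flag is a per-core batch size. The sketch below works through that arithmetic; the function names are hypothetical and are not part of the scripts.

```python
def effective_batch_size(per_core_batch, num_hosts, num_core_per_host):
    """Global batch size, ASSUMING train_batch_size is applied per core."""
    return per_core_batch * num_hosts * num_core_per_host


def examples_seen(train_steps, global_batch):
    """Number of training examples consumed over the whole run."""
    return train_steps * global_batch


def warmup_fraction(warmup_steps, train_steps):
    """Share of training spent in learning-rate warmup."""
    return warmup_steps / train_steps


# Values taken from the SQuAD command line.
global_batch = effective_batch_size(per_core_batch=12, num_hosts=1, num_core_per_host=4)
total_examples = examples_seen(train_steps=8000, global_batch=global_batch)
warmup = warmup_fraction(warmup_steps=1000, train_steps=8000)

print(global_batch)    # 48
print(total_examples)  # 384000
print(warmup)          # 0.125
```

If the flag is instead a global batch size in this code base, the effective batch would be 12 and the tables would describe a different configuration, so check how the script divides the batch before changing the GPU count.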

Experiment

SQuAD v1.1

Figure 2: Illustrations of fine-tuning XLNet on SQuAD v1.1 task

Model       | Train Data | # Train Steps | Batch Size | Max Length | Learning Rate | EM    | F1
XLNet-base  | SQuAD 2.0  | 8,000         | 48         | 512        | 3e-5          | 85.90 | 92.17
XLNet-large | SQuAD 2.0  | 8,000         | 48         | 512        | 3e-5          | 88.61 | 94.28

Table 1: The dev set performance of XLNet model finetuned on SQuAD v1.1 task
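
For readers unfamiliar with the EM and F1 columns, the sketch below follows the widely used SQuAD-style definitions: exact match after text normalization, and token-level F1 between prediction and reference. It is a simplified illustration, not the official evaluation script, and the helper names are chosen here for clarity.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, strip punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction, reference):
    """Token-level F1 between normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    if not pred_tokens or not ref_tokens:
        # Both empty (e.g. a correct "no answer") counts as a match.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def best_over_references(metric, prediction, references):
    """Score against several references and keep the best value."""
    return max(metric(prediction, ref) for ref in references)


em = exact_match("The Lazy Dog!", "lazy dog")
f1 = f1_score("lazy dog", "the lazy brown dog")
best = best_over_references(f1_score, "lazy dog", ["cat", "the lazy dog"])
```

Dataset-level EM and F1 are the averages of these per-question scores, usually reported as percentages.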

SQuAD v2.0

Figure 3: Illustrations of fine-tuning XLNet on SQuAD v2.0 task

Model       | Train Data | # Train Steps | Batch Size | Max Length | Learning Rate | EM    | F1
XLNet-base  | SQuAD 2.0  | 8,000         | 48         | 512        | 3e-5          | 80.23 | 82.90
XLNet-large | SQuAD 2.0  | 8,000         | 48         | 512        | 3e-5          | 85.72 | 88.36

Table 2: The dev set performance of XLNet model finetuned on SQuAD v2.0 task

CoQA v1.0

Figure 4: Illustrations of fine-tuning XLNet on CoQA v1.0 task

Model       | Train Data | # Train Steps | Batch Size | Max Length | Max Query Len | Learning Rate | EM   | F1
XLNet-base  | CoQA 1.0   | 6,000         | 48         | 512        | 128           | 3e-5          | 76.4 | 84.4
XLNet-large | CoQA 1.0   | 6,000         | 48         | 512        | 128           | 3e-5          | 81.8 | 89.4

Table 3: The dev set performance of XLNet model finetuned on CoQA v1.0 task

QuAC v0.2

Figure 5: Illustrations of fine-tuning XLNet on QuAC v0.2 task

Model       | Train Data | # Train Steps | Batch Size | Max Length | Max Query Len | Learning Rate | Overall F1 | HEQ-Q | HEQ-D
XLNet-base  | QuAC 0.2   | 8,000         | 48         | 512        | 128           | 2e-5          | 66.4       | 62.6  | 6.8
XLNet-large | QuAC 0.2   | 8,000         | 48         | 512        | 128           | 2e-5          | 71.5       | 68.0  | 11.1

Table 4: The dev set performance of XLNet model finetuned on QuAC v0.2 task
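
The two human-equivalence columns for QuAC are less familiar than F1. As commonly described for QuAC, HEQ-Q is the percentage of questions on which the model's F1 matches or exceeds human F1, and HEQ-D is the percentage of dialogs in which that holds for every question. The sketch below assumes that definition and a hypothetical input layout of per-question `(model_f1, human_f1)` pairs; the official evaluation script may differ in details.

```python
def heq_scores(dialogs):
    """Return (HEQ-Q, HEQ-D) as percentages.

    dialogs: list of dialogs, each a list of (model_f1, human_f1) pairs,
    one pair per question. This input layout is an illustrative assumption.
    """
    total_questions = sum(len(dialog) for dialog in dialogs)
    if total_questions == 0 or not dialogs:
        return 0.0, 0.0
    # Questions where the model is at least as good as the human reference.
    question_hits = sum(
        1 for dialog in dialogs for model_f1, human_f1 in dialog
        if model_f1 >= human_f1
    )
    # Dialogs where that holds for every question in the dialog.
    dialog_hits = sum(
        1 for dialog in dialogs
        if dialog and all(model_f1 >= human_f1 for model_f1, human_f1 in dialog)
    )
    heq_q = 100.0 * question_hits / total_questions
    heq_d = 100.0 * dialog_hits / len(dialogs)
    return heq_q, heq_d


# Two toy dialogs with two questions each (made-up numbers).
toy_dialogs = [
    [(0.9, 0.8), (0.7, 0.7)],  # model matches or beats human on both questions
    [(0.5, 0.6), (1.0, 0.9)],  # model falls short on the first question
]
heq_q, heq_d = heq_scores(toy_dialogs)
print(heq_q, heq_d)  # 75.0 50.0
```

Because HEQ-D requires every question in a dialog to pass, it is much lower than HEQ-Q, which is consistent with the gap between the two columns in the table.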
