Giter VIP home page Giter VIP logo

kear's Introduction

Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention

This PyTorch package implements the KEAR model that surpasses human on the CommonsenseQA benchmark, as described in:

Yichong Xu, Chenguang Zhu, Shuohang Wang, Siqi Sun, Hao Cheng, Xiaodong Liu, Jianfeng Gao, Pengcheng He, Michael Zeng and Xuedong Huang
Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention
The 31st International Joint Conference on Artificial Intelligence (IJCAI), 2022.

The package also includes codes for our earilier DEKCOR model as in:

Yichong Xu∗, Chenguang Zhu∗, Ruochen Xu, Yang Liu, Michael Zeng and Xuedong Huang
Fusing Context Into Knowledge Graph for Commonsense Question Answering
Findings of the 59th Annual Meeting of the Association for Computational Linguistics (ACL), 2021

Please cite the above papers if you use this code.

Results

This package achieves the state-of-art performance of 86.1% (single model), 89.4% (ensemble) on the CommonsenseQA leaderboard, surpassing the human performance of 88.9%.

Quickstart

  1. pull docker:
    > docker pull yichongx/csqa:human_parity

  2. run docker
    > nvidia-docker run -it --mount src='/',target=/workspace/,type=bind yichongx/csqa:human_parity /bin/bash
    > cd /workspace/path/to/repo
    Please refer to the following link if you first use docker: https://docs.docker.com/

Features

Our code supports flexible training of various models on multiple choice QA.

  • Distributed training with Pytorch native DDP or Deepspeed: see bash/task_train.sh
  • Pause and resume training at any step; use option --continue_train
  • Use any transformer encoders including ELECTRA, DeBERTa, ALBERT

Preprocessing data

Pre-processed data is located at data/.

We release codes for knowledge graph and dictionary external attention in preprocess/

  1. Download data
    > cd preprocess
    > bash download_data.sh
  2. Add ConceptNet triples and Wiktionary definitions to data
    > python add_knowledge.py
  3. We also add the most frequent relations in each question as a side information.
    > python add_freq_rel.py

Training and Prediction

  1. train a model
    > bash bash/task_train.sh
  2. make prediction
    > bash bash/task_predict.sh See task.py for available options.

Running codes for DEKCOR

The current code is mostly compatible to run DEKCOR. To run the original DEKCOR code, please checkout tag DEKCOR to use the previous version.

by Yichong Xu
[email protected]

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.