
qanet_keras's Introduction

QANet in keras

QANet: https://arxiv.org/abs/1804.09541

This Keras model is based on the TensorFlow implementation of QANet (https://github.com/NLPLearn/QANet). The self-attention and position embedding were originally taken from (https://kexue.fm/archives/4765, https://github.com/bojone/attention) and have since been revised to follow (https://github.com/NLPLearn/QANet).

We find that the conv-based multi-head attention in (https://github.com/NLPLearn/QANet/blob/master/layers.py) performs 3%~4% better than the matrix-multiplication-based one in (https://github.com/bojone/attention/blob/master/attention_keras.py).

Pipeline

  1. Download the SQuAD data from (https://rajpurkar.github.io/SQuAD-explorer/).

  2. Run preprocess.ipynb and handcraft.ipynb to get the .npy files of the preprocessed data and handcrafted features.

  3. Run train_QANet.py to start training.

  4. Fast demo: use the built-in model.fit() in QANet_fit_demo.py with random numpy data.
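For the fast demo in step 4, random data of the right shape is enough to check that the model compiles and trains. The sketch below shows one way such data could be generated with numpy. The function name `make_random_batch`, the four-input layout, and all sizes are assumptions for illustration; the actual inputs expected by QANet_fit_demo.py may differ.

```python
import numpy as np

def make_random_batch(n_samples=8, ctx_len=400, ques_len=50, char_len=16,
                      word_vocab=1000, char_vocab=100, seed=0):
    """Build random SQuAD-shaped inputs and targets (all sizes are hypothetical)."""
    rng = np.random.default_rng(seed)
    # Id 0 is assumed to be reserved for padding, so ids are drawn from [1, vocab).
    context_words = rng.integers(1, word_vocab, size=(n_samples, ctx_len))
    question_words = rng.integers(1, word_vocab, size=(n_samples, ques_len))
    context_chars = rng.integers(1, char_vocab, size=(n_samples, ctx_len, char_len))
    question_chars = rng.integers(1, char_vocab, size=(n_samples, ques_len, char_len))
    # Random answer spans with start <= end, encoded as one-hot vectors over the context.
    start = rng.integers(0, ctx_len, size=n_samples)
    end = np.array([rng.integers(s, ctx_len) for s in start])
    start_onehot = np.eye(ctx_len, dtype="float32")[start]
    end_onehot = np.eye(ctx_len, dtype="float32")[end]
    inputs = [context_words, question_words, context_chars, question_chars]
    targets = [start_onehot, end_onehot]
    return inputs, targets

inputs, targets = make_random_batch(n_samples=4, ctx_len=20, ques_len=5, char_len=3)
# With a compiled Keras model that takes these inputs (an assumption), one could call:
# model.fit(inputs, targets, batch_size=2, epochs=1)
```

Small sizes such as `ctx_len=20` keep the smoke test quick; real SQuAD contexts are much longer.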

Updates

  • Add EMA (about 3% improvement)
  • Add multi-GPU support (speed-up)
  • Support adding handcrafted features
  • Revise the MultiHeadAttention and PositionEmbedding layers in Keras
  • Support parallel multi-GPU training and inference
  • Add layer dropout and fix the dropout bug (about 2% improvement)
  • Update the experimental results and related hyper-parameters (coming soon)
  • Revise the output layer in QAoutputBlock.py (about 1% improvement)
  • Add a slice operation to QANet (get the max context length of each batch dynamically to speed up the model)
  • Add data augmentation
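The slice operation in the list above can be illustrated outside the model. This is a minimal numpy sketch, not the repository's implementation. It assumes sequences are right-padded with id 0 and that id 0 never appears inside a real sequence; the function name `slice_batch_to_max_len` is hypothetical.

```python
import numpy as np

def slice_batch_to_max_len(word_ids, char_ids, pad_id=0):
    """Trim the padded columns that lie beyond the longest real sequence in the batch."""
    # Count non-padding tokens per row (assumes padding only at the end of each row).
    lengths = (word_ids != pad_id).sum(axis=1)
    max_len = max(1, int(lengths.max()))  # keep at least one column
    return word_ids[:, :max_len], char_ids[:, :max_len, :], max_len

batch_words = np.array([[5, 3, 0, 0, 0, 0],
                        [7, 2, 9, 4, 0, 0]])
batch_chars = np.zeros((2, 6, 3), dtype=int)
words, chars, max_len = slice_batch_to_max_len(batch_words, batch_chars)
print(max_len, words.shape, chars.shape)  # 4 (2, 4) (2, 4, 3)
```

Because attention cost grows with sequence length, trimming each batch to its own longest context avoids computing over columns that contain only padding.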

I find that EMA is hard to implement on GPU in Keras, and it greatly slows down training. The slice op is also hard to add in Keras, which slows training further (it takes about twice as long as the optimized TensorFlow version). Moreover, the Keras version still trails the TensorFlow version (https://github.com/NLPLearn/QANet) by 2%.
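For reference, the EMA update itself is simple. The sketch below keeps shadow copies of the weights on the host with numpy; it is an illustration under stated assumptions, not the code used in this repository. The class name `WeightEMA` and the default decay of 0.9999 are assumptions.

```python
import numpy as np

class WeightEMA:
    """Keep an exponential moving average of a list of weight arrays."""

    def __init__(self, weights, decay=0.9999):
        self.decay = decay
        # Shadow copies start at the initial weight values.
        self.shadow = [np.array(w, dtype="float64", copy=True) for w in weights]

    def update(self, weights):
        # shadow = decay * shadow + (1 - decay) * current, applied in place.
        for s, w in zip(self.shadow, weights):
            s *= self.decay
            s += (1.0 - self.decay) * np.asarray(w)

ema = WeightEMA([np.array([0.0, 0.0])], decay=0.5)
ema.update([np.array([2.0, 4.0])])
print(ema.shadow[0])  # [1. 2.]
```

With a Keras model, one could call `ema.update(model.get_weights())` after each training batch and load the averaged weights with `model.set_weights(ema.shadow)` before evaluation. Copying every weight between device and host on each step is expensive, which may be one reason a host-side EMA like this slows training.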

Results

Result on the SQuAD dev set: [figure: result]
