chainer-qrnn

About

A re-implementation of Quasi-Recurrent Neural Networks (QRNN) in Chainer.

The original paper is:

James Bradbury, Stephen Merity, Caiming Xiong, and Richard Socher. 2016. Quasi-Recurrent Neural Networks

The original implementation of QRNN (also written in Chainer) is publicly available in this blog post. However, the author provides only a so-called "core" implementation, which is just a fragment of code.

This repository instead aims to offer a full implementation of QRNN.

Implementation Details

What is included

  • QRNN with fo-pooling architecture (bin/QRNNLM/net/model.py)
  • Scripts for language modeling experiment (bin/QRNNLM/train_qrnn.py)
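For readers unfamiliar with fo-pooling, the recurrence from the QRNN paper can be sketched in plain Python as below. This is a minimal illustration, not the code in bin/QRNNLM/net/model.py: the function name `fo_pool` and the list-of-lists layout are assumptions made for this example, the convolution that produces the pre-activations is omitted, and the initial cell state is assumed to be zero.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fo_pool(z_pre, f_pre, o_pre):
    """Apply fo-pooling over one sequence (hypothetical helper).

    Each argument is a list of T timesteps; each timestep is a list of
    pre-activation values, one per hidden unit.
    Returns the list of hidden states h_1..h_T.
    """
    n_units = len(z_pre[0])
    c = [0.0] * n_units  # initial cell state c_0, assumed to be zero
    hidden = []
    for z_t, f_t, o_t in zip(z_pre, f_pre, o_pre):
        z = [math.tanh(v) for v in z_t]  # candidate vector
        f = [sigmoid(v) for v in f_t]    # forget gate
        o = [sigmoid(v) for v in o_t]    # output gate
        # c_t = f_t * c_{t-1} + (1 - f_t) * z_t
        c = [f_i * c_i + (1.0 - f_i) * z_i for f_i, c_i, z_i in zip(f, c, z)]
        # h_t = o_t * c_t
        hidden.append([o_i * c_i for o_i, c_i in zip(o, c)])
    return hidden
```

Only this elementwise loop is sequential; the gates themselves depend on the input alone, which is why a QRNN can compute them for all timesteps in parallel.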

What is not included

  • QRNN Encoder-Decoder model

Dependencies

  • Python 3.5
  • Chainer 1.21.0

How to run

Data Preparation

Download the preprocessed version of the Penn Treebank from here.

Create a data/ptb directory at the same level as bin and copy the downloaded files (train.txt, valid.txt, test.txt) into it.
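The directory setup can also be scripted. The sketch below assumes the three files have already been downloaded into some folder; the function name `prepare_ptb` and both of its parameters are hypothetical and not part of this repository.

```python
import shutil
from pathlib import Path


def prepare_ptb(download_dir, repo_root):
    """Copy the three PTB files into <repo_root>/data/ptb and return that path."""
    target = Path(repo_root) / "data" / "ptb"
    target.mkdir(parents=True, exist_ok=True)
    for name in ("train.txt", "valid.txt", "test.txt"):
        source = Path(download_dir) / name
        if not source.is_file():
            raise FileNotFoundError("missing {}".format(source))
        # str() keeps the call compatible with Python 3.5
        shutil.copy(str(source), str(target / name))
    return target
```

Here `repo_root` is the directory that contains bin, so the files end up in data/ptb next to it.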

Training

Train the model with the following command, run from bin/QRNNLM (where train_qrnn.py lives).

python train_qrnn.py --gpu <gpu_id> --epoch 100 --dim 640 --batchsize 20 --bproplen 105 --unit 640 --decay 0.0002

Testing

To compute the perplexity on the test set, use eval_qrnn.py:

python eval_qrnn.py --model-path <path_to_trained_model> --config-path <path_to_settings.json>
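As a reminder of what the reported number means, perplexity is the exponential of the average per-token negative log-likelihood. The sketch below shows that standard definition only; it makes no claim about how eval_qrnn.py computes it internally, and the function name `perplexity` is an assumption of this example.

```python
import math


def perplexity(token_nlls):
    """Perplexity from per-token negative log-likelihoods (natural log)."""
    if not token_nlls:
        raise ValueError("need at least one token")
    return math.exp(sum(token_nlls) / len(token_nlls))
```

For instance, a model that assigns probability 1/4 to every token has a per-token loss of log 4 and therefore a perplexity of 4.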

Experiment

Task

Language modeling on MikolovPTB

Hyper-Parameters

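| Hyper-Parameter    | LSTM   | QRNN   |
|--------------------|--------|--------|
| Number of Layers   | 2      | 2      |
| Hidden Layer Units | 640    | 640    |
| Dropout            | 0.5    | 0.5    |
| Zoneout            | No     | No     |
| Weight Decay       | 0.0002 | 0.0002 |
| Gradient Clipping  | 5      | 10     |
| Epoch              | 100    | 100    |
| Batch Size         | 20     | 20     |
| BPTT Length        | 35     | 105    |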
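To make the batch size and BPTT length settings concrete, the sketch below shows one common way to cut a token stream into windows for truncated backpropagation through time. It is an assumption that the training script follows this scheme; the source does not confirm it, and the function name `bptt_windows` is hypothetical.

```python
def bptt_windows(tokens, batchsize, bproplen):
    """Yield (inputs, targets) windows for truncated BPTT.

    The stream is cut into `batchsize` equal-length parallel rows; each
    window covers up to `bproplen` consecutive positions of every row.
    Targets are the inputs shifted by one token.
    """
    row_len = (len(tokens) - 1) // batchsize
    rows_x = [tokens[i * row_len:(i + 1) * row_len] for i in range(batchsize)]
    rows_y = [tokens[i * row_len + 1:(i + 1) * row_len + 1] for i in range(batchsize)]
    for start in range(0, row_len, bproplen):
        xs = [row[start:start + bproplen] for row in rows_x]
        ys = [row[start:start + bproplen] for row in rows_y]
        yield xs, ys
```

Under this scheme, the QRNN setting above (batch size 20, BPTT length 105) would backpropagate through up to 105 tokens in each of 20 parallel rows per update, compared with 35 tokens for the LSTM.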

Result

| Model | Dev   | Test  |
|-------|-------|-------|
| LSTM  | 84.99 | 81.87 |
| QRNN  | 85.12 | 81.75 |

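In my experiment, the two models performed almost identically: LSTM scored slightly better on the dev set (84.99 vs. 85.12), while QRNN scored slightly better on the test set (81.75 vs. 81.87).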

Contributors

butsugiri
