
End-To-End Memory Networks in TensorFlow

TensorFlow implementation of End-To-End Memory Networks for language modeling (see Section 5 of the paper). The original Torch code from Facebook can be found here.

Prerequisites

This code requires TensorFlow. A sample Penn Treebank (PTB) corpus, a popular benchmark for measuring the quality of these models, is included in the data directory. You can also use your own text dataset, which should be formatted like this.
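The PTB files are plain whitespace-tokenized text. As an illustration (not the repo's actual reader, whose details may differ), a minimal loader that turns such a file's tokens into the integer ids the model consumes could look like:

```python
from collections import Counter

def read_words(path):
    """Return the file as a flat list of whitespace-separated tokens."""
    with open(path) as f:
        return f.read().split()

def build_vocab(words, nwords=10000):
    """Map the `nwords` most frequent tokens to integer ids, 0 = most frequent."""
    counts = Counter(words)
    return {w: i for i, (w, _) in enumerate(counts.most_common(nwords))}

words = "the cat sat on the mat".split()
vocab = build_vocab(words)
data = [vocab[w] for w in words]
# 'the' occurs twice, so it is the most frequent token and gets id 0
assert data[0] == 0 and data[4] == 0
```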

When you use the Docker image tensorflow/tensorflow:latest-gpu, you need to install the Python package future:

$ pip install future

If you want to use the --show True option, you also need to install the Python package progress:

$ pip install progress

Usage

To train a model with 6 hops and memory size of 100, run the following command:

$ python main.py --nhop 6 --mem_size 100

To see all training options, run:

$ python main.py --help

which will print:

usage: main.py [-h] [--edim EDIM] [--lindim LINDIM] [--nhop NHOP]
              [--mem_size MEM_SIZE] [--batch_size BATCH_SIZE]
              [--nepoch NEPOCH] [--init_lr INIT_LR] [--init_hid INIT_HID]
              [--init_std INIT_STD] [--max_grad_norm MAX_GRAD_NORM]
              [--checkpoint_dir CHECKPOINT_DIR] [--data_dir DATA_DIR]
              [--data_name DATA_NAME] [--is_test IS_TEST] [--nois_test]
              [--show SHOW] [--noshow]

optional arguments:
  -h, --help            show this help message and exit
  --edim EDIM           internal state dimension [150]
  --lindim LINDIM       linear part of the state [75]
  --nhop NHOP           number of hops [6]
  --mem_size MEM_SIZE   memory size [100]
  --batch_size BATCH_SIZE
                        batch size to use during training [128]
  --nepoch NEPOCH       number of epoch to use during training [100]
  --init_lr INIT_LR     initial learning rate [0.01]
  --init_hid INIT_HID   initial internal state value [0.1]
  --init_std INIT_STD   weight initialization std [0.05]
  --max_grad_norm MAX_GRAD_NORM
                        clip gradients to this norm [50]
  --checkpoint_dir CHECKPOINT_DIR
                        checkpoint directory [checkpoints]
  --data_dir DATA_DIR   data directory [data]
  --data_name DATA_NAME
                        data set name [ptb]
  --is_test IS_TEST     True for testing, False for Training [False]
  --nois_test
  --show SHOW           print progress [False]
  --noshow

(Optional) With progress installed, pass --show True to see a progress bar during training:

$ python main.py --nhop 6 --mem_size 100 --show True

After training is finished, you can evaluate on the test set with:

$ python main.py --is_test True --show True

The training output looks like:

$ python main.py --nhop 6 --mem_size 100 --show True
Read 929589 words from data/ptb.train.txt
Read 73760 words from data/ptb.valid.txt
Read 82430 words from data/ptb.test.txt
{'batch_size': 128,
'data_dir': 'data',
'data_name': 'ptb',
'edim': 150,
'init_hid': 0.1,
'init_lr': 0.01,
'init_std': 0.05,
'lindim': 75,
'max_grad_norm': 50,
'mem_size': 100,
'nepoch': 100,
'nhop': 6,
'nwords': 10000,
'show': True}
I tensorflow/core/common_runtime/local_device.cc:25] Local device intra op parallelism threads: 12
I tensorflow/core/common_runtime/direct_session.cc:45] Direct session inter op parallelism threads: 12
Training |################################| 100.0% | ETA: 0s
Testing |################################| 100.0% | ETA: 0s
{'perplexity': 507.3536108810464, 'epoch': 0, 'valid_perplexity': 285.19489755719286, 'learning_rate': 0.01}
Training |################################| 100.0% | ETA: 0s
Testing |################################| 100.0% | ETA: 0s
{'perplexity': 218.49577035468886, 'epoch': 1, 'valid_perplexity': 231.73457031084268, 'learning_rate': 0.01}
Training |################################| 100.0% | ETA: 0s
Testing |################################| 100.0% | ETA: 0s
{'perplexity': 163.5527845871247, 'epoch': 2, 'valid_perplexity': 175.38771414841014, 'learning_rate': 0.01}
Training |################################| 100.0% | ETA: 0s
Testing |################################| 100.0% | ETA: 0s
{'perplexity': 136.1443535538306, 'epoch': 3, 'valid_perplexity': 161.62522958776597, 'learning_rate': 0.01}
Training |################################| 100.0% | ETA: 0s
Testing |################################| 100.0% | ETA: 0s
{'perplexity': 119.15373237680929, 'epoch': 4, 'valid_perplexity': 149.00768378137946, 'learning_rate': 0.01}
Training |##############                  | 44.0% | ETA: 378s
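The perplexity values in this log are the exponential of the average per-word cross-entropy loss, the standard language-modeling metric. A minimal sketch of that computation (my own illustration, not the repo's code):

```python
import math

def perplexity(losses):
    """exp of the average per-word negative log-likelihood (in nats)."""
    return math.exp(sum(losses) / len(losses))

# an average loss of ln(100) nats per word corresponds to perplexity ~100
val = perplexity([math.log(100.0)] * 3)
assert abs(val - 100.0) < 1e-9
```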

Performance

Perplexity on the Penn Treebank test set.

# of hidden | # of hops | memory size | MemN2N (Sukhbaatar 2015) | This repo
----------- | --------- | ----------- | ------------------------ | -----------
150         | 3         | 100         | 122                      | 129
150         | 6         | 150         | 114                      | in progress

Author

Taehoon Kim / @carpedm20

Contributors

carpedm20, gabrielhuang, j-min, mrchypark


memn2n-tensorflow's Issues

How to choose/calculate context in order to get better results?

In the code of this repo, the context matrix of shape [batch_size, mem_size] is filled from a randomly chosen position, as below:

m = random.randrange(self.mem_size, len(data))
target[b][data[m]] = 1
context[b] = data[m - self.mem_size:m]

My question (sorry, it is not actually an 'issue' but a personal question) is: what approaches can I take to get better results than just random sampling?
Any helpful material is welcome :)
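One simple alternative (a sketch of a standard technique, not code from this repo) is to sweep the corpus with a sliding window, so every position becomes a training example exactly once per epoch instead of being sampled at random:

```python
def sliding_windows(data, mem_size):
    """Yield (context, target) pairs by sweeping the corpus left to right.

    `data` is a list of word ids; each window of `mem_size` words is the
    context used to predict the word that immediately follows it.
    """
    for m in range(mem_size, len(data)):
        yield data[m - mem_size:m], data[m]

pairs = list(sliding_windows([1, 2, 3, 4, 5], 3))
assert pairs == [([1, 2, 3], 4), ([2, 3, 4], 5)]
```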

segmentation fault issue

(tensorflow09GPU)➜  MemN2N-tensorflow git:(master) python main.py --nhop 6 --mem_size 100
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
Read 929589 words from data/ptb.train.txt
Read 73760 words from data/ptb.valid.txt
Read 82430 words from data/ptb.test.txt
{'batch_size': 128,
 'checkpoint_dir': 'checkpoints',
 'data_dir': 'data',
 'data_name': 'ptb',
 'edim': 150,
 'init_hid': 0.1,
 'init_lr': 0.01,
 'init_std': 0.05,
 'is_test': False,
 'lindim': 75,
 'max_grad_norm': 50,
 'mem_size': 100,
 'nepoch': 100,
 'nhop': 6,
 'nwords': 10000,
 'show': False}
[1]    9730 segmentation fault (core dumped)  python main.py --nhop 6 --mem_size 100

Exception: [!] Directory checkpoints not found

envy@ub1404:/media/envy/data1t/os_prj/github/MemN2N-tensorflow$ PYTHONPATH=~/os_prj/github/tensorflow/_python_build python main.py --nhop 6 --mem_size 100
Read 929589 words from data/ptb.train.txt
Read 73760 words from data/ptb.valid.txt
Read 82430 words from data/ptb.test.txt
{'batch_size': 128,
'checkpoint_dir': 'checkpoints',
'data_dir': 'data',
'data_name': 'ptb',
'edim': 150,
'init_hid': 0.1,
'init_lr': 0.01,
'init_std': 0.05,
'is_test': False,
'lindim': 75,
'max_grad_norm': 50,
'mem_size': 100,
'nepoch': 100,
'nhop': 6,
'nwords': 10000,
'show': False}
Traceback (most recent call last):
  File "main.py", line 52, in <module>
    tf.app.run()
  File "/home/envy/os_pri/github/tensorflow/_python_build/tensorflow/python/platform/default/_app.py", line 30, in run
    sys.exit(main(sys.argv))
  File "main.py", line 43, in main
    model = MemN2N(FLAGS, sess)
  File "/media/envy/data1t/os_prj/github/MemN2N-tensorflow/model.py", line 26, in __init__
    raise Exception(" [!] Directory %s not found" % self.checkpoint_dir)
Exception:  [!] Directory checkpoints not found
envy@ub1404:/media/envy/data1t/os_prj/github/MemN2N-tensorflow$

Directory not found

Hi, thanks for the code.

I'm getting the following error when running main

Traceback (most recent call last):
  File "main.py", line 52, in <module>
    tf.app.run()
  File "/home/eders/anaconda/lib/python2.7/site-packages/tensorflow/python/platform/default/_app.py", line 30, in run
    sys.exit(main(sys.argv))
  File "main.py", line 43, in main
    model = MemN2N(FLAGS, sess)
  File "/home/eders/python/MemN2N-tensorflow/model.py", line 26, in __init__
    raise Exception(" [!] Directory %s not found" % self.checkpoint_dir)
Exception:  [!] Directory checkpoints not found

I tried to use --data_dir but that didn't work either.
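From the traceback, model.py only checks that the checkpoint directory exists, so creating it before running (or pointing --checkpoint_dir at an existing directory) is a likely workaround; this is my note, not a confirmed fix from the maintainer:

```shell
# create the directory the model expects before it builds the saver
mkdir -p checkpoints
```

Afterwards re-run `$ python main.py --nhop 6 --mem_size 100`.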

How to use this model?

I don't know how to use this model.
I need code that answers questions, e.g.:

context {
    Sam walks into the kitchen.
    Sam picks up an apple.
    Sam walks into the bedroom.
    Sam drops the apple.
}

     Q: Where is the apple?
     A: Bedroom

Please share example code.

Issue with argument 'adjoint_b'

Hello @carpedm20 , thanks for this project, it is helping me get better insight into the paper :)
Quick question: during re-implementation of the code, I am getting the following error:

Aout = tf.matmul(self.hid3dim, Ain, adjoint_b=True)
TypeError: matmul() got an unexpected keyword argument 'adjoint_b'

Any idea what the cause could be?

Thanks a ton :)
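For what it's worth (my note, not a reply from the thread): adjoint_b requests the conjugate transpose of the second operand, and older TensorFlow releases did not accept that keyword on tf.matmul. For the real-valued tensors used here the adjoint equals the plain transpose, so rewriting with transpose_b=True (or an explicit transpose) should be equivalent. A tiny pure-Python check of that equivalence on a real matrix:

```python
def transpose(m):
    """Swap rows and columns of a 2-D list."""
    return [list(row) for row in zip(*m)]

def adjoint(m):
    """Conjugate transpose; conjugation is a no-op for real entries."""
    return [[complex(x).conjugate().real for x in row] for row in transpose(m)]

B = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
# for a real matrix the adjoint equals the plain transpose, which is why
# adjoint_b=True can be replaced by transpose_b=True in this code
assert adjoint(B) == transpose(B)
```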
