Neural-Turing-Machine

A modular implementation of the Neural Turing Machine introduced by Alex Graves et al.

Currently, two tasks are implemented, the Copy Task and the Associative Recall Task, each as a tf.keras.Model wrapper available in NTM_Model.py.

Use them as shown in the Training Notebooks.

Architecture Implemented

The paper provides only the mathematical operations for generating and using the heads' weightings, not the full architecture, so the complete architecture is an open-ended design problem. I've used the following architecture:

Architecture
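For orientation, the head weighting operations given in the paper (content addressing, interpolation, circular shift, sharpening) can be sketched in NumPy as below. This illustrates the paper's equations only; it is not the code in NTM_Model.py. The function name `address`, the argument names, and the restriction to shifts of -1, 0 and +1 are assumptions made for the example.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def address(memory, w_prev, key, beta, gate, shift, gamma):
    """One addressing step for a single head (illustrative sketch).

    memory: (N, M) memory matrix      w_prev: (N,) previous weighting
    key:    (M,) key vector           beta:   key strength, > 0
    gate:   interpolation gate in [0, 1]
    shift:  distribution over the offsets (-1, 0, +1)
    gamma:  sharpening exponent, >= 1
    """
    eps = 1e-8
    # Content addressing: cosine similarity between the key and each row.
    sim = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + eps)
    w_c = softmax(beta * sim)
    # Interpolation with the previous weighting.
    w_g = gate * w_c + (1.0 - gate) * w_prev
    # Circular convolution: np.roll(w_g, k)[i] == w_g[i - k].
    w_s = sum(s * np.roll(w_g, k) for s, k in zip(shift, (-1, 0, 1)))
    # Sharpening and renormalization.
    w = w_s ** gamma
    return w / w.sum()

memory = np.eye(4)
w = address(memory, w_prev=np.array([1.0, 0.0, 0.0, 0.0]), key=memory[2],
            beta=50.0, gate=1.0, shift=[0.0, 0.0, 1.0], gamma=1.0)
print(np.argmax(w))  # focus found by content, then moved one slot forward
```

With `gate=0` and a shift distribution concentrated on offset 0, the head simply keeps its previous weighting, which is a convenient sanity check when debugging.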

Task Results

1. Copy Task

Training the above NTM on sequences of randomized length between 1 and 20 yields the following results.

1.1. After 10,000 epochs with Cross Entropy Loss.
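As a rough illustration of what a Copy Task example looks like, the sketch below builds one input/target pair with start and end delimiters. The layout (8 data bits plus two delimiter channels, blank inputs during recall) and the name `make_copy_example` are assumptions for the example; the repository's own Inputs Generator may encode things differently.

```python
import numpy as np

def make_copy_example(seq_len, n_bits=8, rng=None):
    """Build one copy-task example (hypothetical layout).

    seq_len counts payload vectors only; the two delimiters are added on top.
    Returns inputs of shape (2*seq_len + 2, n_bits + 2) and
    targets of shape (2*seq_len + 2, n_bits).
    """
    rng = np.random.default_rng() if rng is None else rng
    payload = rng.integers(0, 2, size=(seq_len, n_bits)).astype(np.float32)
    total = 2 * seq_len + 2
    inputs = np.zeros((total, n_bits + 2), dtype=np.float32)
    targets = np.zeros((total, n_bits), dtype=np.float32)
    inputs[0, n_bits] = 1.0                    # start delimiter channel
    inputs[1:seq_len + 1, :n_bits] = payload   # presentation phase
    inputs[seq_len + 1, n_bits + 1] = 1.0      # end delimiter channel
    targets[seq_len + 2:, :] = payload         # recall phase: reproduce payload
    return inputs, targets

rng = np.random.default_rng(0)
length = int(rng.integers(1, 21))              # randomized length in [1, 20]
x, y = make_copy_example(length, rng=rng)
print(x.shape, y.shape)
```

Drawing a fresh length for every example is what lets the trained model be probed later on sequences longer than any it saw during training.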

Test 1:-
Input:

Sequence Length = 9, including the Start Of File and End Of File delimiters.

Input seq_len = 9

Output:

Output

Test 2:-
Input:

Sequence Length = 33, including the Start Of File and End Of File delimiters.

Note that this is longer than any sequence the above NTM was trained on.

Input seq_len = 33

Output:

Output

Test 3:-
Input:

Sequence Length = 73, including the Start Of File and End Of File delimiters.

Input

Output:

Output

1.2. After 20,000 epochs with Cross Entropy Loss.

Input:

Sequence Length = 90, including the Start Of File and End Of File delimiters.

Input

Output:

Output

Error incurred

Error

Memory Matrix for this input after last timestep:

Memory Matrix

Results on more tasks to follow soon...

2. Associative Recall Task

Training the Associative Recall Model for 158,000 episodes, with the number of items randomized between 2 and 6, yields the following results:

Input

Input

Output from NTM

Output

Write Weighting while Reading over time

RWeights

Memory Matrix compared with Read Vectors while Writing

MM RV while W

Memory Matrix Evolution over Time

MM Evolution
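For readers unfamiliar with the task, an Associative Recall episode could be generated roughly as follows: a list of items is presented, then one item is shown again as a query, and the target is the item that followed it. The item shape (3 vectors of 6 bits), the two delimiter channels, and the name `make_recall_example` are assumptions for this sketch and are not taken from the repository.

```python
import numpy as np

def make_recall_example(n_items, item_len=3, n_bits=6, rng=None):
    """Build one associative-recall episode (hypothetical layout).

    Channel n_bits marks the start of an item, channel n_bits + 1 marks
    the query. Returns (inputs, targets, query_index).
    """
    rng = np.random.default_rng() if rng is None else rng
    width = n_bits + 2
    items = rng.integers(0, 2, size=(n_items, item_len, n_bits)).astype(np.float32)
    # The last item has no successor, so it is never used as the query.
    q = int(rng.integers(0, n_items - 1))

    def delimiter(channel):
        row = np.zeros((1, width), dtype=np.float32)
        row[0, channel] = 1.0
        return row

    def pad(item):
        # Append two empty delimiter channels to an item's bit vectors.
        return np.hstack([item, np.zeros((item_len, 2), dtype=np.float32)])

    rows = []
    for item in items:
        rows += [delimiter(n_bits), pad(item)]
    rows += [delimiter(n_bits + 1), pad(items[q]), delimiter(n_bits + 1)]
    rows.append(np.zeros((item_len, width), dtype=np.float32))  # answer phase
    inputs = np.vstack(rows)
    targets = np.zeros((inputs.shape[0], n_bits), dtype=np.float32)
    targets[-item_len:] = items[q + 1]          # expected output: the next item
    return inputs, targets, q

rng = np.random.default_rng(1)
n_items = int(rng.integers(2, 7))               # randomized item count in [2, 6]
x, y, q = make_recall_example(n_items, rng=rng)
print(x.shape, y.shape, q)
```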

Progress Timeline:

  1. Wed, Jan 15:-
  • Completed the NTMCell implementation along with various vector generation tasks.
  • Also tested its results with dynamic_RNN and observed some NaN values in the output; this was fixed by initializing the states to a considerably low value (0.5 in this case).
  2. Sun, Jan 19:-
  • Added a sigmoid layer on Heads_w_t, which produced much better results on single time-step passes (not during training).
  • Random initialization works well now too.
  • In the process of finalizing the training schedule.
  3. Fri, Jan 31:-
  • First complete version: added an Inputs Generator for the Copy Task and some minor bug fixes.
  • The model still needs to be trained; there may be problems during training that will need to be solved.
  4. Sun, Feb 2:-
  • Training with the Cross Entropy Loss function proved difficult, as the loss seems to get stuck between 0.4 and 0.55.
  • Using the Huber Loss function seems to generate much better results: the loss decreases linearly from 1.2 to 0.6 on the maximum sequence length in about 10,000 epochs, after one injection of randomized initial states while preserving the weights.
  5. Wed, Feb 6:-
  • More careful analysis uncovered some subtle bugs that were holding back the model's generalization; with those removed, the model now generalizes much better with Cross Entropy Loss.
  6. Sun, Feb 16:-
  • Added the Associative Recall Task.
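The two loss functions mentioned in the timeline can be written out in NumPy as a reference for comparing them. This is a sketch of the standard definitions, not the training code from the notebooks; the helper names, `delta=1.0`, and the clipping constant `eps` are assumptions for the example.

```python
import numpy as np

def huber_loss(y_true, y_pred, delta=1.0):
    # Quadratic for small errors, linear for errors larger than delta.
    err = np.abs(y_true - y_pred)
    quadratic = 0.5 * err ** 2
    linear = delta * (err - 0.5 * delta)
    return float(np.mean(np.where(err <= delta, quadratic, linear)))

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    # Clip predictions so that log() never sees exactly 0 or 1.
    p = np.clip(y_pred, eps, 1.0 - eps)
    return float(np.mean(-(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))))

y_true = np.array([1.0, 0.0, 1.0, 0.0])
y_pred = np.array([0.9, 0.2, 0.4, 0.6])
print(huber_loss(y_true, y_pred), binary_cross_entropy(y_true, y_pred))
```

If the model's outputs lie in [0, 1] and the targets are binary (an assumption about the setup here), the absolute error never exceeds 1, so Huber loss with `delta=1.0` reduces to half the squared error and penalizes confident mistakes far less sharply than cross-entropy does.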

Contributors

whendustsettles
