
rmn's Introduction

Related Memory Network (RMN)

  • End-to-end neural network architecture exploiting both memory network and relation network structures
  • State-of-the-art results on the jointly trained bAbI-10k story-based question answering tasks

Results

Error rate (%) per task on the jointly trained bAbI-10k story-based QA dataset (a task counts as failed when its error exceeds 5%):

Task          MemN2N   DMN+   RN     RMN
1             0.0      0.0    0.0    0.0
2             0.3      0.3    6.5    0.5
3             9.3      1.1    12.9   14.7
4             0.0      0.0    0.0    0.0
5             0.6      0.5    0.5    0.4
6             0.0      0.0    0.0    0.0
7             3.7      2.4    0.2    0.5
8             0.8      0.0    0.1    0.3
9             0.8      0.0    0.0    0.0
10            2.4      0.0    0.0    0.0
11            0.0      0.0    0.4    0.5
12            0.0      0.0    0.0    0.0
13            0.0      0.0    0.0    0.0
14            0.0      0.0    0.0    0.0
15            0.0      0.0    0.0    0.0
16            0.4      45.3   50.3   0.9
17            40.7     4.2    0.9    0.3
18            6.7      2.1    0.6    2.3
19            66.5     0.0    2.1    2.9
20            0.0      0.0    0.0    0.0
Mean error    6.6      2.8    3.7    1.2
Failed tasks  4        1      3      1

Prerequisites

  • Python 3.6
  • TensorFlow 1.3.0
  • Dependencies:
    • pip install tqdm colorlog

Usage

1. Prepare data

To preprocess the bAbI story-based QA dataset, run:

$ python preprocessor.py --data story

To preprocess the bAbI dialog dataset, run:

$ python preprocessor.py --data dialog

2. Train model

To train RMN on the bAbI story-based QA dataset, run:

$ python ./babi_story/train.py  

To train RMN on task 4 of the bAbI dialog dataset, run:

$ python ./babi_dialog/train.py --task 4 --embedding concat --word_embed_dim 50

To use match, the use_match flag is required:

$ python ./babi_dialog/train.py --task 4 --use_match True --embedding concat --word_embed_dim 50

To test on the OOV dataset, the is_oov flag is required:

$ python ./babi_dialog/train.py --task 4 --is_oov True --embedding concat --word_embed_dim 50

rmn's People

Contributors: goldenaem, inmoonlight


rmn's Issues

Some more work which might be relevant

Hi,

In your paper, one of the important points you make is that your method is O(n) whereas Relation nets are O(n^2).

In fact, if we write both architectures in a (very) simplified way (ignoring certain complications such as the softmax), we can see that RMNs are indeed a slight variation on relation nets, one which avoids this problem:

Relation net:
r(X) = sum_i(sum_j(f_1(f_2(x_i), f_3(x_j))))
For some f_1, f_2, f_3.
(In the case of the paper, f_2 and f_3 are the identity, and f_1 is an MLP on the concatenation of the inputs, which is additionally parameterized by p)

RMN (2-stage):
r(X) = sum_i(g_1(g_2(x_i), sum_j(g_3(x_j)))) i.e. we move the sum inside the bracket so we can re-use it for all i
For some g_1, g_2, g_3.
(In this case g_3 is attention with parameter p, g_1 is another attention, and g_2 is identity)
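To make the cost difference concrete, here is a minimal NumPy sketch of the two forms. The functions f and g are arbitrary stand-in nonlinearities, not the actual networks from either paper, and the pooling uses g_3 = identity:

import numpy as np

rng = np.random.default_rng(0)
n, d = 6, 4
X = rng.standard_normal((n, d))   # a set of n objects, each d-dimensional

def f(a, b):                      # stand-in for the RN pair function f_1
    return np.tanh(a + b)

def g(a, b):                      # stand-in for g_1 in the factored form
    return np.tanh(a + b)

# Relation-net form: f is evaluated on all n^2 ordered pairs.
r_rn = sum(f(X[i], X[j]) for i in range(n) for j in range(n))

# Factored form: the inner sum over j is computed once and reused for
# every i, so only O(n) evaluations of g are needed.
pooled = X.sum(axis=0)            # sum_j g_3(x_j), with g_3 = identity here
r_fac = sum(g(X[i], pooled) for i in range(n))

The two are not the same function in general (g only ever sees the pooled sum, never individual pairs); the point is that the factored form trades full pairwise interaction for linear cost.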

Note: in both cases we can turn the function r(X) into a transformation of the individual elements, rather than a representation of the whole set, simply by not taking the outer sum. This allows us to stack multiple layers on top of each other, as you do with your n-stage RMN.

While this similarity might seem superficial, since very different functions are used for f_1,f_2,f_3 and g_1,g_2,g_3, it turns out a lot of other architectures exist which also fall into this pattern. In particular, the self-attention used in https://arxiv.org/abs/1706.03762 is a better comparison to your architecture, as it uses attention mechanisms.

There have also been a few pieces of work that fall into this second category, in particular this paper: https://arxiv.org/abs/1703.06114 . I've also written a bit about these variants (as well as relation nets) here. Some of the notes in these might be useful, and it may be interesting to compare your architecture to theirs on their problems (or try theirs on yours), since ultimately they solve the same problem with very different architectures.
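For illustration, here is a stripped-down single-head dot-product self-attention in NumPy (no learned projections, masking, or multiple heads, so this is only a sketch of the shared pattern, not the full Transformer layer):

import numpy as np

def self_attention(X):
    # Pairwise scores, then a row-wise softmax over j: each output row is
    # sum_j(softmax_j(x_i . x_j / sqrt(d)) * x_j), i.e. the element-wise
    # version of the pattern above, with attention as the pairing function.
    scores = X @ X.T / np.sqrt(X.shape[1])
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ X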

I hope that all made sense. I have some block diagrams of various architectures that might make things clearer, although they don't really show much on the conceptual level.

Best wishes,

Will

babi_story accuracy is incorrect for path-finding task

From babi_story/module.py:
final_pred = (
      tf.one_hot(tf.argmax(pred[0], axis=1), depth=self.answer_vocab_size) * answer_bool[0]
    + tf.one_hot(tf.argmax(pred[1], axis=1), depth=self.answer_vocab_size) * answer_bool[1]
    + tf.one_hot(tf.argmax(pred[2], axis=1), depth=self.answer_vocab_size) * answer_bool[2]
)

final_answer = a_s[0] * answer_bool[0] + a_s[1] * answer_bool[1] + a_s[2] * answer_bool[2]

The path-finding task depends on the order of the answers, but the code above makes the accuracy measurement independent of the order of pred[0], pred[1], pred[2] and a_s[0], a_s[1], a_s[2]: summing the masked one-hot vectors discards which slot each answer came from, so answers predicted in the wrong order can still be marked correct.
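A hedged sketch of an order-sensitive alternative, reusing the names from the snippet above and assuming pred[k] are logits, a_s[k] are one-hot answers, and answer_bool[k] is a 0/1 mask of shape [batch_size] (exact shapes in the repo may differ):

import tensorflow as tf

# Per-slot, order-sensitive match: the k-th prediction must equal the
# k-th gold answer, in position k.
slot_match = [
    tf.cast(tf.equal(tf.argmax(pred[k], axis=1),
                     tf.argmax(a_s[k], axis=1)), tf.float32)
    for k in range(3)
]
# An inactive slot (answer_bool[k] == 0) counts as trivially correct.
slot_ok = [m * answer_bool[k] + (1.0 - answer_bool[k])
           for k, m in enumerate(slot_match)]
# An example is correct only if every active slot matches in order.
accuracy = tf.reduce_mean(slot_ok[0] * slot_ok[1] * slot_ok[2])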
