songhune / memnet

License: MIT License

Python 100.00%

memnet's Introduction

MemN2N

Implementation of End-To-End Memory Networks with a sklearn-like interface using TensorFlow. Tasks are from the bAbI dataset.

songhune edited

To write my thesis, I have been modifying the original code for experiments. If this causes any inconvenience, including license issues (although I expect it to be used only for academic purposes), please contact me at [email protected]. Thank you!

Get Started

git clone [email protected]:songhune/MemNet.git

mkdir ./memn2n/data/
cd ./memn2n/data/
wget http://www.thespermwhale.com/jaseweston/babi/tasks_1-20_v1-2.tar.gz
tar xzvf ./tasks_1-20_v1-2.tar.gz

cd ../
python single.py

Examples

Running a single bAbI task

Running a joint model on all bAbI tasks

These files are also a good example of usage.

Requirements

tensorflow 1.6
scikit-learn 0.17.1
six 1.10.0

Single Task Results

For a task to pass, it must reach at least 95% testing accuracy. Results are measured on single tasks using the 1k data.

Pass: 1,4,12,15,20

Several other tasks have 80%+ testing accuracy.

A stochastic gradient descent optimizer was used with an annealed learning rate schedule, as specified in Section 4.2 of End-To-End Memory Networks.

The following params were used:

epochs: 100
hops: 3
embedding_size: 20
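
As a rough illustration of the annealed schedule mentioned above: Section 4.2 of the paper starts from a learning rate of 0.01 and halves it every 25 epochs until epoch 100. The sketch below assumes 1-indexed epochs; the function name and its parameters are illustrative and are not taken from this repository.

```python
def annealed_learning_rate(epoch, initial_lr=0.01, anneal_every=25, stop_epoch=100):
    """Return the learning rate for a 1-indexed epoch, halving every `anneal_every` epochs."""
    # Clamp the epoch so the rate stops shrinking once `stop_epoch` is reached.
    capped = min(epoch, stop_epoch)
    halvings = (capped - 1) // anneal_every
    return initial_lr / (2 ** halvings)


if __name__ == "__main__":
    for epoch in (1, 25, 26, 51, 76, 100):
        print(epoch, annealed_learning_rate(epoch))
```

With these defaults the rate is halved after epochs 25, 50, and 75, which covers the 100 epochs listed in the params.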

Task	Training Accuracy	Validation Accuracy	Testing Accuracy
1	1.0	1.0	1.0
2	1.0	0.86	0.83
3	1.0	0.64	0.54
4	1.0	0.99	0.98
5	1.0	0.94	0.87
6	1.0	0.97	0.92
7	1.0	0.89	0.84
8	1.0	0.93	0.86
9	1.0	0.86	0.90
10	1.0	0.80	0.78
11	1.0	0.92	0.84
12	1.0	1.0	1.0
13	0.99	0.94	0.90
14	1.0	0.97	0.93
15	1.0	1.0	1.0
16	0.81	0.47	0.44
17	0.76	0.65	0.52
18	0.97	0.96	0.88
19	0.40	0.17	0.13
20	1.0	1.0	1.0
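
The pass list can be reproduced from the testing accuracy column with a short check. This is a minimal sketch; `PASS_THRESHOLD` and `passed_tasks` are illustrative names, not part of the repository.

```python
PASS_THRESHOLD = 0.95

# Testing accuracy per task, copied from the single-task table above.
single_task_test_accuracy = {
    1: 1.0, 2: 0.83, 3: 0.54, 4: 0.98, 5: 0.87,
    6: 0.92, 7: 0.84, 8: 0.86, 9: 0.90, 10: 0.78,
    11: 0.84, 12: 1.0, 13: 0.90, 14: 0.93, 15: 1.0,
    16: 0.44, 17: 0.52, 18: 0.88, 19: 0.13, 20: 1.0,
}


def passed_tasks(accuracy_by_task, threshold=PASS_THRESHOLD):
    """Return the sorted task ids whose accuracy meets the threshold."""
    return sorted(t for t, acc in accuracy_by_task.items() if acc >= threshold)


print(passed_tasks(single_task_test_accuracy))  # [1, 4, 12, 15, 20]
```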

Joint Training Results

Pass: 1,6,9,10,12,13,15,20

Again, a stochastic gradient descent optimizer was used with an annealed learning rate schedule, as specified in Section 4.2 of End-To-End Memory Networks.

The following params were used:

epochs: 60
hops: 3
embedding_size: 40

Task	Training Accuracy	Validation Accuracy	Testing Accuracy
1	1.0	0.99	0.999
2	1.0	0.84	0.849
3	0.99	0.72	0.715
4	0.96	0.86	0.851
5	1.0	0.92	0.865
6	1.0	0.97	0.964
7	0.96	0.87	0.851
8	0.99	0.89	0.898
9	0.99	0.96	0.96
10	1.0	0.96	0.928
11	1.0	0.98	0.93
12	1.0	0.98	0.982
13	0.99	0.98	0.976
14	1.0	0.81	0.877
15	1.0	1.0	0.983
16	0.64	0.45	0.44
17	0.77	0.64	0.547
18	0.85	0.71	0.586
19	0.24	0.07	0.104
20	1.0	1.0	0.996

Notes

Single task results are from 10 repeated trials of the single task model across all 20 tasks with different random initializations. The performance of the model with the lowest validation accuracy for each task is shown in the table above.

Joint training results are from 10 repeated trials of the joint model across all tasks. The performance of the single model whose validation accuracy passed the most tasks (>= 0.95) is shown in the table above (joint_scores_run2.csv). The scores from all 10 runs are located in the results/ directory.
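
The selection rule for the joint model (pick the run whose validation accuracy passes the most tasks) could be sketched as follows. The scores in `toy_runs` are made-up placeholders for three tasks, not the contents of the results/ directory, and the function names are hypothetical.

```python
def count_passed(validation_accuracies, threshold=0.95):
    """Count how many tasks reach the validation threshold."""
    return sum(1 for acc in validation_accuracies if acc >= threshold)


def best_run(runs, threshold=0.95):
    """Return the name of the run whose validation accuracy passes the most tasks."""
    return max(runs, key=lambda name: count_passed(runs[name], threshold))


# Made-up placeholder scores, NOT real results from this repository.
toy_runs = {
    "run1": [0.99, 0.80, 0.96],
    "run2": [0.97, 0.95, 0.98],
    "run3": [0.90, 0.70, 0.60],
}
print(best_run(toy_runs))  # run2
```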

memnet's Issues

# tf.flags error

TensorFlow1.5.0 absl.flags._exceptions.UnparsedFlagAccessError in Jupyter Notebook #16935
tensorflow/tensorflow#16935

Because of this issue, the code does not run. As a stopgap, I am working with absolute paths and plain variables for now.

# locations between supporting sentence and query

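@rizejin
Professor, you mentioned that you had experimented with a few cases.
I would like to actually measure the positional distance between the supporting sentences (SS) and the query,
and to collect some data for visualization (for example, tf-idf scores).
How about writing separate code that visualizes the validity check?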
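
One way to measure this distance is sketched below. It assumes the bAbI line format "ID text", where question lines carry the question, the answer, and the supporting sentence ids separated by tabs. The function name and the sample story are illustrative only.

```python
def supporting_distances(lines):
    """For each question line, return (question_id, [question_id - supporting_id, ...])."""
    results = []
    for line in lines:
        nid, text = line.strip().split(" ", 1)
        if "\t" in text:  # question lines: question, answer, supporting ids
            _question, _answer, supporting = text.split("\t")
            ids = [int(s) for s in supporting.split()]
            results.append((int(nid), [int(nid) - s for s in ids]))
    return results


sample = [
    "1 Mary moved to the bathroom.",
    "2 John went to the hallway.",
    "3 Where is Mary? \tbathroom\t1",
    "4 Daniel went back to the hallway.",
    "5 Sandra moved to the garden.",
    "6 Where is Daniel? \thallway\t4",
]
print(supporting_distances(sample))  # [(3, [2]), (6, [2])]
```

The resulting distances can then be aggregated per task and plotted with whatever visualization tool is preferred.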

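Algorithm design

When memory < SS:
Run inference (here simply a softmax over ss and q) and return the remaining vectors after excluding the vector with the lowest relevance.
In this way, the result is eventually fitted to the memory size.
Do this process in np rather than in tf.
Limitation noticed: the word correlation is ordered only alphabetically.
Study cosine similarity (tf-idf).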
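
The pruning step above could look roughly like this in NumPy. This is a sketch under assumptions: "memory < SS" is read as "more sentence vectors than memory slots", relevance is taken as a softmax over dot products between each sentence vector and the query, and all names are hypothetical.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def prune_to_memory_size(sentence_vectors, query_vector, memory_size):
    """Drop the least relevant sentence vectors until at most `memory_size` remain."""
    kept = np.asarray(sentence_vectors, dtype=float)
    query = np.asarray(query_vector, dtype=float)
    while len(kept) > memory_size:
        relevance = softmax(kept @ query)
        # Remove the row with the lowest relevance to the query.
        kept = np.delete(kept, np.argmin(relevance), axis=0)
    return kept


vectors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-1.0, 0.0]]
print(prune_to_memory_size(vectors, [1.0, 0.0], memory_size=2))
```

In the toy call, the rows least aligned with the query are removed first and the original order of the remaining rows is preserved.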

What if we restrict the memory size to half?

Then we need a replacement technique.

# zero padding

I have no idea how to implement this. I would like to have a meeting about this part sometime today.
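
As a possible starting point for that discussion, zero padding for a memory network is commonly done in two steps: pad each sentence of word ids to a fixed sentence length, then pad the story with all-zero sentences up to the memory size. The sketch below assumes word id 0 is reserved for padding and that only the most recent sentences are kept when a story is too long; the function name and parameters are hypothetical.

```python
def pad_story(story, sentence_size, memory_size, pad_id=0):
    """Pad word-id sentences to `sentence_size` and the story to `memory_size` rows."""
    # Keep only the most recent `memory_size` sentences if the story is too long.
    recent = story[-memory_size:]
    # Truncate or right-pad every sentence to exactly `sentence_size` ids.
    padded = [s[:sentence_size] + [pad_id] * (sentence_size - len(s)) for s in recent]
    # Fill the remaining memory slots with all-zero sentences.
    padded += [[pad_id] * sentence_size for _ in range(memory_size - len(padded))]
    return padded


print(pad_story([[3, 5], [7, 2, 9]], sentence_size=4, memory_size=3))
# [[3, 5, 0, 0], [7, 2, 9, 0], [0, 0, 0, 0]]
```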

This is an error


Low performance on task 19

@rizejin Professor, what is your view on the issue with this task?

Summary of notes

Regarding thesis writing

There is very little information on the hypothesis and related material.
Embedding Text in Hyperbolic Spaces: Bhuwan Dhingra et al., ACL 2018
Neural Word Embedding as Implicit Matrix Factorization: NIPS 2014
Word Embedding Revisited: A New Representation Learning and Explicit Matrix Factorization Perspective: IJCAI 2015
Linguistic Regularities in Sparse and Explicit Word Representations: CoNLL 2014
I am currently reading these.

As noted below, task 19 suffers from low accuracy because there is no embedding for positional information.
3. Diminishing the word vector: how should the embedding matrix for paragraphs be used?

Data Augmentation

The code that the professor provided has limitations, so I would like to try a separate direction.
First, the code below extracts each story's nid and supporting sentences.

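def supporting_no(train_file):
    """Collect, for every question line, its line id and supporting sentence ids."""
    nids, ss = [], []
    story = []
    with open(train_file) as f:
        for line in f:
            line = line.lower().strip()
            nid, line = line.split(' ', 1)
            nid = int(nid)  # compare as a number, not as a string
            if nid == 1:  # first sentence of a story
                story = []  # start a fresh story
            if '\t' in line:  # a tab marks a question line
                # split into question, answer, and supporting sentence numbers
                q, a, supporting = line.split('\t')
                nids.append(nid)
                ss.append([int(s) for s in supporting.split()])
            else:
                story.append(line)
    return ss, nids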

Using this, I plan to convert the non-supporting sentences with map(int, non-supporting) and join them with the supporting ones. Would that not work?
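
A minimal sketch of that idea, assuming the sentence ids of the story seen so far are already available as integers and the supporting field is the tab-separated string from a question line. The function name and the sample ids are hypothetical.

```python
def partition_story_ids(story_ids, supporting_field):
    """Split story sentence ids into supporting and non-supporting lists."""
    supporting = list(map(int, supporting_field.split()))
    non_supporting = [i for i in story_ids if i not in supporting]
    return supporting, non_supporting


supporting, non_supporting = partition_story_ids([1, 2, 4, 5], "4")
print(supporting, non_supporting)  # [4] [1, 2, 5]

# Joining keeps the supporting ids first, followed by the rest.
joined = supporting + non_supporting
print(joined)  # [4, 1, 2, 5]
```

Keeping the two lists separate until the final join makes it easy to sample or reorder only the non-supporting sentences during augmentation.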