My (slightly modified) Keras implementation of the Recurrent Convolutional Neural Network (RCNN) described here: http://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/view/9745.

License: MIT License

Python 100.00%

classification deep-learning keras natural-language-processing nlp

recurrent-convolutional-neural-network-text-classifier's Introduction

Recurrent Convolutional Neural Network Text Classifier

My (slightly modified) Keras implementation of the Recurrent Convolutional Neural Network (RCNN) described here.

recurrent-convolutional-neural-network-text-classifier's Issues

Confusion about the input

# We shift the document to the right to obtain the left-side contexts.
left_context_as_array = np.array([[MAX_TOKENS] + tokens[:-1]])
# We shift the document to the left to obtain the right-side contexts.
right_context_as_array = np.array([tokens[1:] + [MAX_TOKENS]])

Would you mind explaining why this is done? I think the code should look like this:
left_context_as_array = np.array([tokens[:]]) and
right_context_as_array = np.array([tokens[::-1]]).
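
One possible reading of the shifted arrays, offered as a sketch rather than the author's own explanation: in the paper's recurrence, the left context of word i is built from the word before it and the right context from the word after it. Shifting the document by one position lines up "previous word" and "next word" with each position, whereas reversing the list would not. The helper name `shift_contexts` and the index value 99 are hypothetical.

```python
# Toy values: MAX_TOKENS stands in for the extra index used to pad the ends.
MAX_TOKENS = 99
tokens = [1, 2, 3, 4]


def shift_contexts(tokens, pad_index):
    """Return (left_inputs, right_inputs), aligned position by position with tokens."""
    # Position i of left_inputs holds the word that precedes tokens[i].
    left_inputs = [pad_index] + tokens[:-1]
    # Position i of right_inputs holds the word that follows tokens[i].
    right_inputs = tokens[1:] + [pad_index]
    return left_inputs, right_inputs


left_inputs, right_inputs = shift_contexts(tokens, MAX_TOKENS)
for i, token in enumerate(tokens):
    print(token, left_inputs[i], right_inputs[i])
```

The first word has no predecessor and the last word has no successor, which is why the padding index appears at those two ends.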

About the performance

Hi,

Thank you for sharing the code. I ran your code on the IMDB dataset, and the accuracy was 0.5. I wonder why this happened. Have you run the model on any dataset to test its performance? Thanks.

Performance on different cards

Hello, I ran your code successfully on my PC and noticed something strange: it seems to run well, and faster, on a K80 than on a Titan Xp. Do you know the reason? Looking forward to your answer!
Thanks!

I have one question. When you compute the right context, you reverse the input sequences, but you don't reverse the output sequences. I think the output sequences should be reversed as well, because the LSTM API only reverses the input. Thanks!

Recurrent-Convolutional-Neural-Network-Text-Classifier/recurrent_convolutional_keras.py

Line 39 in 2c4d87d

 backward = LSTM(hidden_dim_1, return_sequences = True, go_backwards = True)(r_embedding) # See equation (2). 
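
To illustrate the concern, here is a toy stand-in for a recurrent layer, not Keras itself. The Keras documentation describes `go_backwards` as processing the input backwards and returning the reversed sequence, so the outputs come out in processing order. The running-sum "layer" and the name `run_recurrence` are hypothetical and only mimic that ordering.

```python
def run_recurrence(sequence, go_backwards=False):
    """Toy recurrent layer whose state is the running sum of the inputs seen so far.

    Outputs are returned in processing order, imitating a layer that returns
    the reversed sequence when go_backwards=True.
    """
    steps = reversed(sequence) if go_backwards else sequence
    state, outputs = 0, []
    for value in steps:
        state += value
        outputs.append(state)
    return outputs


tokens = [1, 2, 3, 4]
backward_outputs = run_recurrence(tokens, go_backwards=True)
# Flip the outputs so that index i again refers to tokens[i].
aligned = backward_outputs[::-1]
print(backward_outputs, aligned)
```

After the flip, `aligned[i]` summarizes `tokens[i:]`, so it can be combined position by position with the forward outputs and the word embeddings.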

Different embeddings must be used

It seems the same embeddings are used for the right context, the left context, and the word embedding. If you look at the training section, it is stated clearly that these are different parameters,

so it should probably be something like


document      = Input(shape=(None,), dtype = "int32")

document_embedding       = Embedding(vocab_size, WORD_EMB_SIZE, weights=[initial_embeddings],
                                     input_length=DOC_SEQ_LEN, trainable=True)(document)
left_context_embedding   = Embedding(vocab_size, WORD_EMB_SIZE, weights=[left_context_embeddings],
                                     input_length=DOC_SEQ_LEN, trainable=True)(document)
right_context_embedding  = Embedding(vocab_size, WORD_EMB_SIZE, weights=[right_context_embeddings],
                                     input_length=DOC_SEQ_LEN, trainable=True)(document)

Which part represents the convolutional layer?

Which part of the code represents the convolutional layer? The code uses Dense, Input, Lambda, LSTM, and TimeDistributed layers, so which of them represents the convolutional layer?

gensim.models.Word2Vec.load("word2vec.gensim")

When I run the code, I cannot get this line to work. Could you please tell me what 'word2vec.gensim' is?

Question about equation (1) and equation (2)

Thanks for your code! However, I have a question about the implementation of equations (1) and (2). Do equations (1) and (2) just describe a plain RNN? And did you replace it with an LSTM?

TypeError: Expected int32, got list containing Tensors of type '_Message' instead

I get this error using TensorFlow as the backend.
Traceback (most recent call last):
File "test1.py", line 35, in <module>
forward = LSTM(hidden_dim_1, return_sequences = True)(l_embedding) # See equation (1).
File "/home/s/anaconda2/lib/python2.7/site-packages/keras/layers/recurrent.py", line 243, in __call__
return super(Recurrent, self).__call__(inputs, **kwargs)
File "/home/s/anaconda2/lib/python2.7/site-packages/keras/engine/topology.py", line 558, in __call__
self.build(input_shapes[0])
File "/home/s/anaconda2/lib/python2.7/site-packages/keras/layers/recurrent.py", line 1012, in build
constraint=self.bias_constraint)
File "/home/s/anaconda2/lib/python2.7/site-packages/keras/legacy/interfaces.py", line 88, in wrapper
return func(*args, **kwargs)
File "/home/s/anaconda2/lib/python2.7/site-packages/keras/engine/topology.py", line 391, in add_weight
weight = K.variable(initializer(shape), dtype=dtype, name=name)
File "/home/s/anaconda2/lib/python2.7/site-packages/keras/layers/recurrent.py", line 1004, in bias_initializer
self.bias_initializer((self.units * 2,), *args, **kwargs),
File "/home/s/anaconda2/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 1681, in concatenate
return tf.concat([to_dense(x) for x in tensors], axis)
File "/home/s/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 1075, in concat
dtype=dtypes.int32).get_shape(
File "/home/s/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 669, in convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "/home/s/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/constant_op.py", line 176, in _constant_tensor_conversion_function
return constant(v, dtype=dtype, name=name)
File "/home/s/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/constant_op.py", line 165, in constant
tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape, verify_shape=verify_shape))
File "/home/s/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/tensor_util.py", line 367, in make_tensor_proto
_AssertCompatible(values, dtype)
File "/home/s/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/tensor_util.py", line 302, in _AssertCompatible
(dtype.name, repr(mismatch), type(mismatch).__name__))
TypeError: Expected int32, got list containing Tensors of type '_Message' instead.

TypeError: maketrans() takes exactly 2 arguments (1 given)

When running the code, I get this error:
TypeError: maketrans() takes exactly 2 arguments (1 given)

at this line of code:
text = text.strip().lower().translate(string.maketrans({key: " {0} ".format(key) for key in string.punctuation}))

How can I resolve this?

AttributeError: type object 'str' has no attribute 'maketrans'

Did you mean string.maketrans?

  File "recurrent_convolutional_keras.py", line 53, in <module>
    text = text.strip().lower().translate(str.maketrans({key: " {0} ".format(key) for key in string.punctuation}))
AttributeError: type object 'str' has no attribute 'maketrans'

Does your code also support labeled data?

Hello!
Your code is really great, but it seems your project doesn't support labeled text classification, so I wonder how I can modify your code to support supervised classification?
Thanks!

NameError: global name 'backend' is not defined

Hi, I tried to run your code and found the following error:

File "recurrent_convolutional_keras.py", line 44, in <lambda>
    pool_rnn = Lambda(lambda x: backend.max(x, axis = 1), output_shape = (hidden_dim_2, ))(semantic) # See equation (5).
NameError: global name 'backend' is not defined

After installing backend by running pip install backend (and importing backend into recurrent_convolutional_keras.py), I got another error

File "recurrent_convolutional_keras.py", line 45, in <lambda>
    pool_rnn = Lambda(lambda x: backend.max(x, axis = 1), output_shape = (hidden_dim_2, ))(semantic) # See equation (5).
AttributeError: 'module' object has no attribute 'max'

Could you provide more information about what the backend module is? Thanks.
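
The name `backend` in that line most likely refers to the `keras.backend` module (imported with `from keras import backend`) rather than the unrelated PyPI package called `backend`. That reading is an assumption based on the call `backend.max(x, axis = 1)`. For a tensor shaped (batch, time, features), a maximum over axis 1 is an element-wise maximum over time. The pure-Python sketch below imitates that pooling step for a single document; the function name and sample values are hypothetical.

```python
def max_over_time(sequence_of_vectors):
    """Element-wise maximum over the time axis for one document.

    sequence_of_vectors: list of time steps, each a list of feature values.
    """
    # zip(*...) regroups the values by feature, so each column spans all time steps.
    return [max(column) for column in zip(*sequence_of_vectors)]


# Three time steps with three features each.
semantic = [
    [0.1, -0.4, 0.3],
    [0.7, 0.2, -0.9],
    [-0.5, 0.6, 0.0],
]
pooled = max_over_time(semantic)
print(pooled)
```

The result has one value per feature regardless of document length, which is what lets a fixed-size output layer follow the pooling step.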

test

Train data

I don't understand the structure of word2vec.gensim. Could you explain it?

input format

Hi,
In your code you don't specify how you embed the sentence or document, and word2vec isn't used aside from indicating the embedding size. I imagine the part
doc_as_array = np.array([[1, 2, 3, 4]])
left_context_as_array = np.array([[MAX_TOKENS, 1, 2, 3]])
right_context_as_array = np.array([[2, 3, 4, MAX_TOKENS]])
is merely an example, but how do you actually go about embedding and feeding a sentence or document to the network?

Thanks
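
A rough sketch of one way to go from raw text to the three integer inputs, assuming a word-to-index vocabulary is available (for example from a pretrained embedding model). The dictionary `vocab`, the function `document_to_inputs`, and the choice to map unknown words to the extra index are assumptions for illustration, not the repository's confirmed pipeline.

```python
# Hypothetical vocabulary; in practice it would come from the embedding model.
vocab = {"the": 0, "movie": 1, "was": 2, "great": 3}
MAX_TOKENS = len(vocab)  # extra index reserved for unknown words and padding


def document_to_inputs(text, vocab, unknown_index):
    """Turn one document into (doc, left_context, right_context) index lists."""
    tokens = [vocab.get(word, unknown_index) for word in text.lower().split()]
    doc = [tokens]
    left = [[unknown_index] + tokens[:-1]]
    right = [tokens[1:] + [unknown_index]]
    return doc, left, right


doc, left, right = document_to_inputs("The movie was surprisingly great", vocab, MAX_TOKENS)
print(doc, left, right)
```

Each nested list can then be wrapped in `np.array(...)` as in the quoted example. The network receives word indices only, and its embedding layer performs the vector lookup.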

Getting an error while loading word2vec.gensim

Hello,
I tried to run your code but got this error:
FileNotFoundError: [Errno 2] No such file or directory: 'word2vec.gensim'

while running word2vec = gensim.models.Word2Vec.load("word2vec.gensim").

Please help me fix this error.

Where is the convolutional layer?

I saw that others raised the same issue, but I couldn't find the author's answer.

training with multiple documents at once

Thank you for posting this code on GitHub. It is functional for me as is, but I was looking into performing batch training with this network. Do you know how to approach this if the documents being classified have variable lengths? I was considering padding the inputs to the same size, but since my documents vary hugely in length, many documents would be heavily padded. I am looking for a better approach.
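
One common way to limit padding, sketched here as a general technique rather than anything the repository provides, is to bucket documents by length so that each batch is padded only to its own longest member. The function name `bucket_batches` and the padding index 0 are hypothetical.

```python
def bucket_batches(documents, batch_size, pad_index):
    """Group documents of similar length and pad each batch to its own maximum."""
    ordered = sorted(documents, key=len)
    batches = []
    for start in range(0, len(ordered), batch_size):
        batch = ordered[start:start + batch_size]
        width = len(batch[-1])  # longest document in this batch (sorted ascending)
        batches.append([doc + [pad_index] * (width - len(doc)) for doc in batch])
    return batches


PAD = 0  # hypothetical padding index
documents = [[5, 6], [7], [1, 2, 3, 4, 5, 6], [8, 9, 3], [4, 4, 4, 4, 4]]
batches = bucket_batches(documents, batch_size=2, pad_index=PAD)
for batch in batches:
    print(batch)
```

In this toy case bucketing adds 3 padding positions, compared with 13 if every document were padded to the global maximum length of 6. The padded positions would still need to be masked or otherwise kept out of the pooling step, which this sketch does not address.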

What is the file named 'word2vec.gensim'?

model performance

I tried to reproduce the paper's reported performance for this network on the 20 Newsgroups dataset. I used Google's pretrained word2vec embeddings with a vector dimensionality of 300, which was the only difference from the paper apart from using LSTMs instead of the paper's bidirectional RNNs.

During training, I tracked the ROC AUC on the validation set.

epoch 1: roc-auc on val - 0.840
epoch 2: roc-auc on val - 0.995
epoch 3: roc-auc on val - 0.996
epoch 4: roc-auc on val - 0.998
epoch 5: roc-auc on val - 0.997
epoch 6: roc-auc on val - 0.998
epoch 7: roc-auc on val - 0.997

At test time, after 7 epochs, my macro F1 score was 0.91, compared to 0.9649 reported by the paper.

Have you run into issues with the model matching the paper? If so, any ideas what could be causing the discrepancy? And do you think I am training this network for too many epochs?

Much appreciated!

‘recurrent’ in the paper doesn't seem to involve LSTM

Hi, thanks for sharing your implementation of the paper "RCNN for text classification". I cloned this repo and experimented on my text classification task, but the performance was not as expected. I am not sure whether this is due to my data pre-processing or the model implementation.

After looking through the paper, I found that Equations (1) and (2) give the computation of cl and cr, where cr and cl are the result of a simple matrix multiplication followed by an activation function. But in this code, the matrix multiplications are replaced with LSTM cells. Has that been shown to be more effective than the original formulation?

thanks.
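
For reference, the two recurrences under discussion can be written as follows. This is a reconstruction from memory of the paper's notation and should be checked against the paper itself: $e(w_i)$ is the embedding of word $w_i$, $f$ is a non-linear activation, and the $W$ matrices are learned parameters.

```latex
\begin{align}
c_l(w_i) &= f\left(W^{(l)} c_l(w_{i-1}) + W^{(sl)} e(w_{i-1})\right) \tag{1} \\
c_r(w_i) &= f\left(W^{(r)} c_r(w_{i+1}) + W^{(sr)} e(w_{i+1})\right) \tag{2}
\end{align}
```

Read this way, each equation is a plain (non-gated) recurrent update running in one direction, and the code swaps that update for an LSTM cell.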

airalcorn2 / recurrent-convolutional-neural-network-text-classifier
