Used 25k reviews from standard IMDB dataset on various models to predict if a given review is positive OR negative.
- Simple 1D convolution model (
1Dconv_simple_noatt
): Passes whole review as a sequence to match a binary output. - Simple RNN model (
RNN_simple_noatt
): Passes whole review as a sequence to learn a review embedding which predicts a binary output. - Attention RNN model (
RNN_simple_att
): This time, the model learns which part of review embedding contributes how much to predict output. - Hierarchial RNN model (
RNN_hier_noatt
): Breaks the review into sentences and learns sentence embedding first and then uses them to learn review embedding. Also tries to predict a binary output. - Hierarchial Attention RNN model (
RNN_hier_att
): Understands Reviews as list of sentences (sentence embeddings) and learns what sentences weigh how much to the final score of predicting whether if a review is positive or negative.
Model | Dropout | Input Sequence Dims. | Best Validation Accuracy | Time To Train | Optimizer | Attention | Hierarchy | Memory-Dims. |
---|---|---|---|---|---|---|---|---|
1Dconv_simple_noatt | No | (21250, 1000) | 88.19% | 181s | rmsprop |
None |
None |
|
1Dconv_simple_noatt | Yes | (21250, 1000) | 85.39% | 183s | rmsprop |
None |
None |
|
RNN_simple_noatt | No | (21250,100) | 85.79% | 287s | rmsprop |
None |
None |
|
RNN_simple_noatt | No | (20000,100) | 82.74% | 183s | ADAM |
None |
None |
|
RNN_simple_noatt | Yes | (20000,100) | 84.22% | 131s | ADAM |
None |
None |
|
RNN_simple_noatt | No | (20000,500) | 87.18% | 415s | ADAM |
None |
None |
|
RNN_simple_noatt | Yes | (20000,500) | 87.34% | 608s | ADAM |
None |
None |
|
RNN_simple_noatt | Yes | (20000,1000) | 86.56% | 1234s | ADAM |
None |
None |
|
RNN_simple_att | No | (20000, 500) | 84.42% | 99s | ADAM |
word_level |
None |
|
RNN_simple_att | Yes | (20000, 500) | 84.40% | 101s | ADAM |
word_level |
None |
|
RNN_hier_noatt | No | (20000, 5, 100) | 81.80% | 375s | ADAM |
None |
5X100 |
LSTM-64 |
RNN_hier_noatt | No | (20000, 20, 50) | 88.38% | 739s | ADAM |
None |
20X50 |
LSTM-64 |
RNN_hier_noatt | No | (20000, 20, 50) | 88.04% | 1233s | ADAM |
None |
20X50 |
LSTM-100 |
RNN_hier_noatt | No | (20000, 20, 50) | 88.62% | 854s | ADAM |
None |
20X50 |
GRU-100 |
RNN_hier_noatt | No | (20000, 20, 20) | 87.12% | 1604s | ADAM |
None |
20X20 |
LSTM-300 |
RNN_hier_att | No | (20000, 20, 50) | 89.52% | 1258s | ADAM |
sentence_level |
20X50 |
GRU-100 |
RNN_hier_att | No | (20000, 20, 50) | 88.54% | 1473s | ADAM |
sentence_level |
20X50 |
LSTM-100 |
Poke and see what the model is actually learning. I hope it is something wonderful!