Deep-learning system presented in "EmoSense at SemEval-2019 Task 3: Bidirectional LSTM Network for Contextual Emotion Detection in Textual Conversations" at SemEval-2019.

deep-learning semeval semeval-2019 sentiment-analysis emotion-detection lstm keras keras-models tensorflow neural-network

emosense-semeval2019-task3-emocontext's Introduction

EmoSense at SemEval-2019 Task 3: Bidirectional LSTM Network for Contextual Emotion Detection in Textual Conversations

Overview

This repository contains the source code of the models used for EmoSense submissions for SemEval-2019 Task 3 “EmoContext: Contextual Emotion Detection in Text”. The model is described in the paper "EmoSense at SemEval-2019 Task 3: Bidirectional LSTM Network for Contextual Emotion Detection in Textual Conversations".

The proposed approach achieved a micro-average F1 score of 72.59% for the emotion classes on the test dataset, significantly outperforming the officially released baseline by 14%.
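
To make the metric concrete, here is a minimal sketch of a micro-averaged F1 score restricted to the emotion classes. It is not the official evaluation script: the label names (`happy`, `sad`, `angry`, `others`), the function name `micro_f1` and the toy predictions are assumptions made for illustration.

```python
from typing import Sequence

# Label names are assumptions for illustration; "others" is treated as the
# non-emotion class and is excluded from the score.
EMOTION_CLASSES = ("happy", "sad", "angry")


def micro_f1(y_true: Sequence[str], y_pred: Sequence[str],
             classes: Sequence[str] = EMOTION_CLASSES) -> float:
    """Micro-averaged F1 restricted to `classes`."""
    tp = fp = fn = 0
    # Pool true positives, false positives and false negatives over all
    # emotion classes before computing precision and recall.
    for cls in classes:
        for truth, pred in zip(y_true, y_pred):
            if pred == cls and truth == cls:
                tp += 1
            elif pred == cls:
                fp += 1
            elif truth == cls:
                fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels, not task data.
y_true = ["happy", "sad", "others", "angry", "others", "sad"]
y_pred = ["happy", "others", "others", "angry", "sad", "sad"]
score = micro_f1(y_true, y_pred)
print(f"{score:.4f}")  # 0.7500
```

In the toy data there are three true positives, one false positive and one false negative across the emotion classes, so precision and recall are both 0.75. A model that predicts only `others` scores 0.0 under this metric regardless of its overall accuracy.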

We designed a specific LSTM architecture that learns not only semantic and sentiment feature representations but also user-specific conversation features. In this work, we did not use any traditional NLP features such as sentiment lexicons or hand-crafted linguistic features; instead, we substituted them with word embeddings computed automatically from text corpora after an advanced pre-processing stage.

Citation:

@inproceedings{smetanin-2019-emosense,
    title = "{E}mo{S}ense at {S}em{E}val-2019 Task 3: Bidirectional {LSTM} Network for Contextual Emotion Detection in Textual Conversations",
    author = "Smetanin, Sergey",
    booktitle = "Proceedings of the 13th International Workshop on Semantic Evaluation",
    year = "2019",
    address = "Minneapolis, Minnesota, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/S19-2034",
    pages = "210--214",
}

The architecture of a smaller version of the proposed model. The LSTM units for the first turn and the third turn share weights.

Source Code of the Model

The source code of the model is provided in EmoSense at SemEval2019 Task 3 EmoContext.ipynb.
The models were trained using Keras 2.2.2 with TensorFlow 1.7.1 backend.

Pre-trained Word Embeddings

The emotion detection models were trained on top of pre-trained DataStories word embeddings, which were additionally fine-tuned on the automatically collected emotional dataset.

Texts were pre-processed with Ekphrasis. This tool performs spell correction, word normalization and segmentation, and lets you specify which tokens should be omitted, normalized or annotated with special tags.

Pre-trained 300-dimensional embeddings may be downloaded at the following link: emosense.300d.txt. Place the embeddings file in the root directory of the repository so that the program can find it.
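
As a rough illustration of how such a file can be turned into an embedding matrix, the sketch below assumes that emosense.300d.txt uses the common plain-text layout of one token per line followed by its space-separated vector values. The helper names `load_embeddings` and `build_embedding_matrix` and the `word_index` mapping are hypothetical and are not taken from the notebook.

```python
import io
from typing import Dict, List, TextIO


def load_embeddings(handle: TextIO, dim: int = 300) -> Dict[str, List[float]]:
    """Parse lines of the form `token v1 v2 ... v_dim` into a dict."""
    vectors: Dict[str, List[float]] = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip header or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def build_embedding_matrix(word_index: Dict[str, int],
                           vectors: Dict[str, List[float]],
                           dim: int = 300) -> List[List[float]]:
    """Row i holds the vector of the word with index i; unknown words stay zero."""
    rows = max(word_index.values(), default=0) + 1
    matrix = [[0.0] * dim for _ in range(rows)]
    for word, idx in word_index.items():
        if word in vectors:
            matrix[idx] = vectors[word]
    return matrix


# Tiny in-memory stand-in for the real file (3 dimensions instead of 300).
sample = io.StringIO("happy 0.1 0.2 0.3\nsad -0.1 -0.2 -0.3\n")
vectors = load_embeddings(sample, dim=3)
# Hypothetical word index; "angry" is absent from the sample, so its row stays zero.
embeddings_matrix = build_embedding_matrix({"happy": 1, "angry": 2}, vectors, dim=3)
print(embeddings_matrix)
```

For the real file, the handle would come from `open("emosense.300d.txt", encoding="utf-8")` with the default `dim=300`, and the resulting list of rows would need converting to an array (for example with `numpy.asarray`) before being passed to an embedding layer.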

Documentation and How to report bugs

Keras documentation: https://keras.io/documentation/.
Ekphrasis documentation: https://github.com/cbaziotis/ekphrasis.
Scikit-learn documentation: http://scikit-learn.org/stable/documentation.html.
If you find any issues, please open a bug here on GitHub.

emosense-semeval2019-task3-emocontext's Issues

KeyError:'metrics'

121/121 [==============================] - ETA: 0s - loss: 0.7658 - acc: 0.6902
Traceback (most recent call last):
File "Jarvis.py", line 208, in
history = model.fit([message_first_message_train, message_second_message_train, message_third_message_train],
File "C:\Users\DELL\anaconda3\lib\site-packages\tensorflow\python\keras\engine\training.py", line 108, in _method_wrapper
return method(self, *args, **kwargs)
File "C:\Users\DELL\anaconda3\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1137, in fit
callbacks.on_epoch_end(epoch, epoch_logs)
File "C:\Users\DELL\anaconda3\lib\site-packages\tensorflow\python\keras\callbacks.py", line 416, in on_epoch_end
callback.on_epoch_end(epoch, numpy_logs)
File "C:\Users\DELL\anaconda3\lib\site-packages\kutilities\callbacks.py", line 103, in on_epoch_end
self.add_predictions(data if len(data) > 1 else data[0], name=name,
File "C:\Users\DELL\anaconda3\lib\site-packages\kutilities\callbacks.py", line 98, in add_predictions
self.params['metrics'].append(entry)
KeyError: 'metrics'

@sismetanin

I am getting this error, please help me.

unable to open hdf5 file

Hello!

In your code for the emotion detection task (link: https://github.com/sismetanin/emosense-semeval2019-task3-emocontext/blob/master/EmoSense%20at%20SemEval2019%20Task%203%20EmoContext.ipynb)

Did you encounter this error:

OSError: Unable to open file (unable to open file: name = 'models/bidirectional_LSTM_best_weights_0010-0.9125.hdf5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

While running this line [29]:
model.load_weights("models/bidirectional_LSTM_best_weights_0010-0.9125.hdf5")

I've tried !pip install h5py and import h5py but neither helped.

Although I'm getting this error, it doesn't stop the classification report from being created. However, the report I'm getting is entirely different from yours (see below). I suspect the differences are due to the above OSError.

f1_e 0.0
precision_e 0.0
recoll_e 0.0
              precision    recall  f1-score   support

           0       0.85      1.00      0.92      2338
           1       0.00      0.00      0.00       142
           2       0.00      0.00      0.00       125
           3       0.00      0.00      0.00       150

    accuracy                           0.85      2755
   macro avg       0.21      0.25      0.23      2755
weighted avg       0.72      0.85      0.78      2755

UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use zero_division parameter to control this behavior.

To solve this, I tried setting zero_division=0 or 1 and neither helped.

Please help if you can! Thanks :)
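
In the OSError above, errno = 2 means that no file exists at that path, so installing or importing h5py does not change anything. The file name looks like one produced by a checkpoint callback during training, with the epoch and a score encoded in it, so a fresh training run may well write a differently named file. A minimal sketch for picking up whatever checkpoint is actually on disk, assuming that naming scheme and assuming a higher score is better (`find_best_checkpoint` is a hypothetical helper, not part of the notebook):

```python
import re
from pathlib import Path
from typing import Optional

# Assumed naming scheme, inferred from the file name in the error message:
# bidirectional_LSTM_best_weights_<epoch>-<score>.hdf5
CHECKPOINT_RE = re.compile(r"best_weights_(\d+)-(\d+\.\d+)\.hdf5$")


def find_best_checkpoint(models_dir: str = "models") -> Optional[Path]:
    """Return the checkpoint with the highest score in its name, or None."""
    directory = Path(models_dir)
    if not directory.is_dir():
        return None
    best_path, best_score = None, float("-inf")
    for path in directory.glob("*.hdf5"):
        match = CHECKPOINT_RE.search(path.name)
        if match and float(match.group(2)) > best_score:
            best_path, best_score = path, float(match.group(2))
    return best_path


checkpoint = find_best_checkpoint("models")
if checkpoint is None:
    print("No checkpoint found; run training first so that a weights file is written.")
else:
    # The path could then be passed on, e.g. model.load_weights(str(checkpoint)).
    print(f"Load weights from: {checkpoint}")
```

If the function returns None, the models directory is missing or empty, which would point to the training cell not having been run or not having saved any weights.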

Where is the dataset?

Hello,

Where are the training and testing sets?

NameError: name 'embeddings_matrix' is not defined

NameError Traceback (most recent call last)
in
45 return model
46
---> 47 model = buildModel(embeddings_matrix, MAX_SEQUENCE_LENGTH, lstm_dim=64, hidden_layer_dim=30, num_classes=4)

NameError: name 'embeddings_matrix' is not defined

Please help me with this. @sismetanin

How long did training phase take on your model?

I saw 197,824,434 parameters and 20 epochs in your Jupyter notebook. Also, the paper "SemEval-2019 Task 3: EmoContext Contextual Emotion Detection in Text" mentions that the training set contains around 30,000 rows.
These seem like large values, so how long did training take for you, and on what hardware?

kutilities module!!!

ModuleNotFoundError: No module named 'kutilities'
Collecting kutilities
ERROR: Could not find a version that satisfies the requirement kutilities (from versions: none)
ERROR: No matching distribution found for kutilities

Also, the dev and test files are missing.
Please help!

Is there the SemEval 2019 EmoContext baseline source code?

Hi
I'm looking for the source code of the baseline for this competition.
