declare-lab / dialogue-understanding

This repository contains PyTorch implementation for the baseline models from the paper Utterance-level Dialogue Understanding: An Empirical Study

License: MIT License

Python 100.00%
dialogue-systems dialogue-understanding emotion-recognition-in-conversation dialogue-act conversational-ai conversational-agents bert-embeddings bert pretrained-models emotion-recognition

dialogue-understanding's Introduction

Utterance-level Dialogue Understanding

This repository contains PyTorch implementations of the models from the papers Utterance-level Dialogue Understanding: An Empirical Study and Exploring the Role of Context in Utterance-level Emotion, Act and Intent Classification in Conversations: An Empirical Study.

[Figure: example conversation with utterance-level labels]

Task Definition

Given the transcript of a conversation along with the speaker information of each constituent utterance, the utterance-level dialogue understanding task aims to identify the label of each utterance from a set of pre-defined labels, which can be a set of emotions, dialogue acts, intents, etc. The figures above and below illustrate such conversations between two people, where each utterance is labeled with the underlying emotion and intent. Formally, given an input sequence of N utterances [(u_1, p_1), (u_2, p_2), ..., (u_N, p_N)], where each utterance u_i = [u_{i,1}, u_{i,2}, ..., u_{i,T}] consists of T words u_{i,j} and is spoken by party p_i, the task is to predict the label e_i of each utterance u_i. In this process, the classifier can also make use of the conversational context. There are also cases where not all the utterances in a dialogue have corresponding labels.

[Figures: example conversations labeled with emotion and intent]

Data Format

The models are all trained in an end-to-end fashion. The utterances, labels, loss masks, and speaker-specific information are thus read directly from tab separated text files. All data files follow the common format:

Utterances: Each line contains tab separated dialogue id and the utterances of the dialogue.

train_1    How do you like the pizza here?    Perfect. It really hits the spot.
train_2    Do you have any trains into Cambridge today?    There are many trains, when would you like to depart?    I would like to leave from Kings Lynn after 9:30.

Labels: Each line contains tab separated dialogue id and the encoded emotion/intent/act/strategy labels of the dialogue.

train_1    0    1
train_2    2    0    2

Loss masks: Each line contains tab separated dialogue id and the loss mask information of the utterances of the dialogue. For contextual models, the whole sequence of utterances is passed as input, but utterances with a loss mask of 0 are excluded from both the loss calculation and the classification metrics. This is required for tasks where we want to use the full contextual information as input but don't want to classify a subset of the utterances in the output. For example, in MultiWOZ intent classification, we pass the full dialogue sequence as input but don't classify the utterances coming from the system side.

train_1    1    1
train_2    1    0    1

Speakers: Each line contains tab separated dialogue id and the speakers of the dialogue.

train_1    0    1
train_2    0    1    0
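
The four file types above can be read and combined along the following lines. This is a minimal sketch, not the repository's own data loader: the helper names read_tsv and masked_accuracy are hypothetical, the train_2 rows are rewritten with explicit tab characters, and the predictions list is made up for illustration. It shows how an utterance with a loss mask of 0 stays in the input but is left out of a classification metric.

```python
from typing import Dict, List


def read_tsv(lines: List[str]) -> Dict[str, List[str]]:
    """Map each dialogue id to the tab-separated fields that follow it."""
    table = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        dialogue_id, *fields = line.split("\t")
        table[dialogue_id] = fields
    return table


def to_int(table: Dict[str, List[str]]) -> Dict[str, List[int]]:
    """Convert the string fields of a label, mask, or speaker file to ints."""
    return {key: [int(value) for value in values] for key, values in table.items()}


def masked_accuracy(predictions: List[int], labels: List[int], mask: List[int]) -> float:
    """Accuracy over utterances whose loss mask is 1; the rest are ignored."""
    kept = [(p, y) for p, y, m in zip(predictions, labels, mask) if m == 1]
    if not kept:
        return 0.0
    return sum(p == y for p, y in kept) / len(kept)


# The train_2 rows from the examples above, with explicit tabs.
utterances = read_tsv([
    "train_2\tDo you have any trains into Cambridge today?"
    "\tThere are many trains, when would you like to depart?"
    "\tI would like to leave from Kings Lynn after 9:30."
])
labels = to_int(read_tsv(["train_2\t2\t0\t2"]))
masks = to_int(read_tsv(["train_2\t1\t0\t1"]))
speakers = to_int(read_tsv(["train_2\t0\t1\t0"]))

# Hypothetical model output: the second prediction is wrong but masked out.
predictions = [2, 1, 2]
print(masked_accuracy(predictions, labels["train_2"], masks["train_2"]))  # prints 1.0
```

All three utterances remain available as context; only the two with a mask of 1 contribute to the score.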

Datasets

The original datasets can be found in the glove-end-to-end/datasets/_original and roberta-end-to-end/datasets/_original directories. The following datasets are included in this project:

 IEMOCAP (Emotion Recognition in Conversations)
 DailyDialog (Emotion Recognition in Conversations)
 DailyDialog (Act Classification)
 MultiWOZ (Intent Recognition)
 Persuasion for Good (Persuader's act classification)
 Persuasion for Good (Persuadee's act classification)

Utterance-level minibatch vs Dialogue-level minibatch

We experiment with the original datasets using two distinct minibatch formation techniques. The corresponding data can be found in the directories glove-end-to-end/datasets/dialogue_level_minibatch, glove-end-to-end/datasets/utterance_level_minibatch, and roberta-end-to-end/datasets/dialogue_level_minibatch. Please refer to the paper for more information.

glove-end-to-end/datasets/dialogue_level_minibatch : This folder contains the dataset that can be used to prepare minibatches where all the utterances in the context having a valid label are classified.

glove-end-to-end/datasets/utterance_level_minibatch : This folder contains the dataset that can be used to prepare minibatches where only one utterance in the context is classified and the rest of the utterances are used as context.
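
The difference between the two formations can be illustrated as follows. This is a sketch under assumptions, not the code used to build the released folders: the function names are hypothetical, and it assumes each utterance-level sample keeps the whole dialogue as context and marks its single target with a one-hot loss mask.

```python
from typing import List, Tuple

# (utterances, labels, loss mask)
Sample = Tuple[List[str], List[int], List[int]]


def dialogue_level_sample(utterances: List[str], labels: List[int], mask: List[int]) -> Sample:
    """One sample per dialogue: every utterance with mask 1 is classified."""
    return (list(utterances), list(labels), list(mask))


def utterance_level_samples(utterances: List[str], labels: List[int], mask: List[int]) -> List[Sample]:
    """One sample per labeled utterance: only the target keeps a mask of 1."""
    samples = []
    for target, valid in enumerate(mask):
        if valid != 1:
            continue  # utterances without a valid label never become targets
        one_hot = [1 if i == target else 0 for i in range(len(mask))]
        samples.append((list(utterances), list(labels), one_hot))
    return samples


dialogue = (["u1", "u2", "u3"], [2, 0, 2], [1, 0, 1])
print(dialogue_level_sample(*dialogue)[2])                # [1, 0, 1]
print([s[2] for s in utterance_level_samples(*dialogue)])  # [[1, 0, 0], [0, 0, 1]]
```

Under this reading, a dialogue with k labeled utterances yields one dialogue-level sample and k utterance-level samples.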

Context Perturbations of the Original Data

Along with the original datasets above, we also provide different variations of these datasets with context perturbations such as style transfer, paraphrasing, label augmentation, etc. Readers can find context perturbed versions of the original datasets in the following folders and refer to the paper to find relevant information.

glove-end-to-end/datasets/inter_speaker
glove-end-to-end/datasets/label_augmentation
glove-end-to-end/datasets/paraphrase_attack
glove-end-to-end/datasets/spelling_attack
glove-end-to-end/datasets/style_transfer

Models

We provide implementations of the end-to-end classifier without context, bcLSTM, and DialogueRNN models. For bcLSTM and DialogueRNN, we also provide a training argument that lets you specify whether to use residual connections.

RoBERTa Based Models

Navigate to roberta-end-to-end. We also provide training arguments with which you can alternately use BERT or Sentence Transformers models as feature extractors.

GloVe Based Models

Navigate to glove-end-to-end. We have also released scripts with which you can run different analysis experiments that we report in the paper.

Execution

Once you have navigated to the roberta-end-to-end or glove-end-to-end directory to use RoBERTa or GloVe based feature extractors for the models, run the following commands to execute the different models explained in the paper. Note that some of the models present in the glove-end-to-end folder are not available in the roberta-end-to-end folder. However, it should not be difficult to adapt these models to use RoBERTa embeddings.

Main Model (Dialogue-Level Minibatch)

To train and evaluate the without context classifier model and the bcLSTM/DialogueRNN model with full context and residual connections:

python train.py --dataset [iemocap|dailydialog|multiwoz|persuasion] --classify [emotion|act|intent|er|ee] --cls-model [logreg|lstm|dialogrnn] --residual

The --cls-model logreg corresponds to the without context classifier.

Main Model (Utterance-level Minibatch)

python train_utt_level.py --dataset [iemocap|dailydialog|multiwoz|persuasion] --classify [emotion|act|intent|er|ee] --cls-model [logreg|lstm|dialogrnn] --residual

Speaker Level Models

w/o inter : Trained at dialogue-level minibatch. To train and evaluate bcLSTM model in this setting i.e. only with context from the same speaker:

python train_intra_speaker.py --dataset [iemocap|dailydialog|multiwoz|persuasion] --classify [emotion|act|intent|er|ee] --residual

w/o intra : Trained at utterance-level minibatch. To train and evaluate bcLSTM model in this setting i.e. only with context from the other speaker:

python train_inter_speaker.py --dataset [iemocap|dailydialog|multiwoz|persuasion] --classify [emotion|act|intent|er|ee] --residual
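
The two speaker-level settings restrict which context utterances a target utterance can see. The sketch below illustrates that idea only; filter_context is a hypothetical helper and is not taken from train_intra_speaker.py or train_inter_speaker.py, whose internals may differ.

```python
from typing import List


def filter_context(utterances: List[str], speakers: List[int], target: int, mode: str) -> List[str]:
    """Keep the target utterance plus context from one side of the conversation.

    mode="intra" keeps only context from the target's own speaker (the "w/o inter" setting).
    mode="inter" keeps only context from the other speaker(s) (the "w/o intra" setting).
    """
    if mode not in ("intra", "inter"):
        raise ValueError("mode must be 'intra' or 'inter'")
    kept = []
    for i, (utterance, speaker) in enumerate(zip(utterances, speakers)):
        same_speaker = speaker == speakers[target]
        if i == target or (mode == "intra" and same_speaker) or (mode == "inter" and not same_speaker):
            kept.append(utterance)
    return kept


utterances = ["a0", "b0", "a1"]
speakers = [0, 1, 0]
print(filter_context(utterances, speakers, target=2, mode="intra"))  # ['a0', 'a1']
print(filter_context(utterances, speakers, target=2, mode="inter"))  # ['b0', 'a1']
```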

Shuffled Context and Shuffled Context with Order Prediction Models

Trained at dialogue-level minibatch. To train and evaluate bcLSTM model with various shuffling strategies in train, val, test:

python train_shuffled_context.py --dataset [iemocap|dailydialog|multiwoz|persuasion] --classify [emotion|act|intent|er|ee] --residual --shuffle [0|1|2]

--shuffle 0 : Shuffled context in train, val, test.

--shuffle 1 : Shuffled context in train, val; original context in test.

--shuffle 2 : Original context in train, val; shuffled context in test.
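
The mapping from the --shuffle value to the affected splits can be written down as a small sketch. The names SHUFFLED_SPLITS and maybe_shuffle are hypothetical and this is not the logic of train_shuffled_context.py; the sketch also assumes that labels are permuted together with their utterances (speakers and loss masks would need the same permutation).

```python
import random
from typing import List, Tuple

# Splits that see shuffled context for each --shuffle value, as listed above.
SHUFFLED_SPLITS = {
    0: {"train", "val", "test"},
    1: {"train", "val"},
    2: {"test"},
}


def maybe_shuffle(
    utterances: List[str],
    labels: List[int],
    split: str,
    shuffle: int,
    rng: random.Random,
) -> Tuple[List[str], List[int]]:
    """Permute the utterance order if this split is shuffled under the given setting."""
    if split not in SHUFFLED_SPLITS[shuffle]:
        return list(utterances), list(labels)
    order = list(range(len(utterances)))
    rng.shuffle(order)
    # Apply the same permutation to utterances and labels so pairs stay aligned.
    return [utterances[i] for i in order], [labels[i] for i in order]


rng = random.Random(0)
# With --shuffle 2 the training split keeps its original order.
print(maybe_shuffle(["u1", "u2", "u3"], [2, 0, 2], "train", 2, rng))
```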

Context Control Models

Trained at utterance-level minibatch. The script is train_context_control.py. You can specify training arguments to determine how to control the context.

Baseline Results

Note

If you are running GloVe-based end-to-end models, please run the scripts multiple times and average the test scores of those runs.
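
Averaging over runs can be done with a few lines like the following. This is a sketch: summarize_runs is a hypothetical helper, the scores are placeholder values rather than results from the paper or this repository, and reporting the standard deviation alongside the mean is an addition that the note above does not ask for.

```python
from statistics import mean, stdev
from typing import List, Tuple


def summarize_runs(test_scores: List[float]) -> Tuple[float, float]:
    """Return the mean and sample standard deviation of test scores from repeated runs."""
    if len(test_scores) < 2:
        raise ValueError("need at least two runs to summarize")
    return mean(test_scores), stdev(test_scores)


# Placeholder values only; substitute the test scores of your own runs.
scores = [60.0, 62.0, 61.0]
average, spread = summarize_runs(scores)
print(f"{average:.2f} +/- {spread:.2f}")  # prints 61.00 +/- 1.00
```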

Citation

Utterance-level Dialogue Understanding: An Empirical Study. Deepanway Ghosal, Navonil Majumder, Rada Mihalcea, Soujanya Poria. arXiv preprint arXiv:2009.13902 (2020).

Exploring the Role of Context in Utterance-level Emotion, Act and Intent Classification in Conversations: An Empirical Study. Deepanway Ghosal, Navonil Majumder, Rada Mihalcea, Soujanya Poria. Findings of ACL (2021).

dialogue-understanding's People

Contributors

deepanwayx, nmder, soujanyaporia

dialogue-understanding's Issues

About DialogueRNN performance Prob.

Your DialogueRNN performance (in your end-to-end implementation, the bidirectional and attention-on-emotion strategies are used):
[Screenshot 2020-10-12, 10:41:22 AM]
Original DialogueRNN performance:
[image]

From the above, is your implementation of DialogueRNN worse than the original paper's?

A question about DialogRNN

Hi. First, I would like to express my appreciation for your outstanding work in multimodal sentiment analysis, which has greatly inspired me.
But I have one question about the "Party GRU" in your DialogRNN code:
[image]

As shown in the figure above, my understanding is that Speaker A's party state at time t-1 is the input to the Party GRU. Is that right?

the Label Shift part is quite confusing to me and the code does not contain this part

Hi, I read the An Empirical Study paper and don't understand what you did during the Intra- or Inter-Speaker Shift experiment. Did you select only the specific samples where shifts happen, or use some other technique?
I wanted to look at the details in the code, but apparently it is not published. Would you please kindly explain a little?

I have a question during the training process.

First of all, thank you for doing interesting research.
I have a question in the process of training the roberta-end-to-end model (DialogueRNN).

  1. How can I use multi-GPU?

  2. Is it possible to use the latest transformers version (4.3.2)?

A problem.

Hi, I encounter an error when running the code with the command "python train.py --dataset iemocap --classify emotion --cls-model lstm --residual".
[image]
Is there something wrong with the code?

CNN extractor Prob.

In your paper, the convoluted features are then max-pooled with a window size of 2 (consistent with the annotations). But in your code, the max-pooling window size is the length of dimension 2 of the Conv1d output. Is that actually what you want?

pooled = [F.max_pool1d(c, c.size(2)).squeeze() for c in convoluted]

Performance for DailyDialog

Hi. I have a question about reproducing the performance on DailyDialog.

[image: performance_dailydialog]

  1. In the photo, are the @best Valid F1 values the Test F1 values at the point where validation F1 is highest?

  2. I trained the model with batch size = 1 due to limited computing power.
    Could this be the cause of the difference between the performance in the paper (59.50) and my performance (57.5)?

How to run NN and model on novel input data?

Hi,

This looks to be a great system!

First, I am relatively new to NNs, and I've noticed that often times models are not distributed with code, nor is there an easy entry point for using the model on novel data. I am assuming I am missing something involved in the process, but I don't know what that would be. Could you please explain to me why oftentimes ML codebases don't come with pre-trained models or evaluation capabilities?

Second, I have used the following command to begin training a model:

cd roberta-end-to-end && python train.py --dataset dailydialog --classify act --cls-model lstm --residual --batch-size 2

I don't know Python very well. I am wondering is there an easy way to use this trained model, when completed, in order to process texts? I can definitely preprocess the text into any format using Perl. I just don't know how to run/evaluate the NN on specific input data for use in classifying texts (in my case I'm trying to extract commissives from multi-party dialogues (however it looked like these dialogues were two-person dialogues, so perhaps I should have trained the utterance model)).

Thank you for making this software available, it may be a great help to us,

Andrew
