
Intent Induction from Conversations for Task-Oriented Dialogue

This repository contains data, relevant scripts and baseline code for the DSTC11 summer track on Intent Induction from Conversations for Task-Oriented Dialogue.

This track aims to evaluate methods for the automatic induction of customer intents in the realistic setting of customer service interactions between human agents and customers. As complete conversations will be provided, participants can make use of information in both agent and customer turns. The track includes two tasks: (1) intent clustering, which requires participants to assign labels to turns in the dialogues where customers express intents, and (2) open intent induction, in which participants must induce a set of intents from dialogues, with each intent defined by a list of sample utterances to be used as training data for an intent classifier.
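
To make this concrete (a rough sketch only: the Task 2 format below matches the sample format shown in the issues later on this page, while the Task 1 field names and labels are invented for illustration), a Task 1 prediction assigns a cluster label to each intent-bearing turn:

{"utterance_id": "banking_0042", "utterance": "I'd like to open a new checking account", "predicted_label": "cluster_3"}

and a Task 2 induced intent is an intent ID paired with sample utterances:

{"intent_id": "intent_1", "utterances": ["I want to book a flight", "I need a flight"]}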

Organizers: James Gung, Raphael Shu, Jason Krone, Salvatore Romeo, Arshit Gupta, Yassine Benajiba, Saab Mansour and Yi Zhang

Contact: dstc11-intent-induction (AT) amazon (DOT) com

Timeline

  • Development data release: June 13th, 2022
  • Test data release: September 26th, 2022
  • Entry submission deadline: October 3rd, 2022
  • Final result announcement: October 24th, 2022
  • Paper submission: December 2nd, 2022
  • Paper acceptance notification: January 20th, 2023
  • Camera-ready submission deadline: January 27th, 2023
  • DSTC11 Workshop: SIGDIAL x INLG 2023, September 11-15th, 2023

DSTC11 Track 2 Tasks

  • Task 1 (Intent Clustering): assign labels to turns in the dialogues where customers express intents.
  • Task 2 (Open Intent Induction): induce a set of intents from the dialogues, each defined by a list of sample utterances.

Running Baselines

Python 3 (>=3.7) is required. Using a conda/virtual environment is recommended.
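
For example, a fresh environment can be created with the standard venv module (conda works equally well):

# create and activate a virtual environment
python3 -m venv .venv
source .venv/bin/activate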

# install dependencies
pip3 install -r requirements.txt

# run intent clustering (Task 1) baselines and evaluation
python3 -m sitod.run_experiment \
--data_root_dir dstc11 \
--experiment_root_dir results \
--config configs/run-intent-clustering-baselines.jsonnet

# run open intent induction (Task 2) baselines and evaluation
python3 -m sitod.run_experiment \
--data_root_dir dstc11 \
--experiment_root_dir results \
--config configs/run-open-intent-induction-baselines.jsonnet
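
Conceptually, an intent clustering baseline embeds the candidate utterances with a sentence encoder and clusters the embeddings; the actual baselines and their settings are defined by the jsonnet configs passed via --config. Below is a minimal self-contained sketch of that idea, not the repository's implementation; the encoder model and cluster count are assumptions:

# Illustrative embedding + k-means clustering pipeline (not the repo's code).
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

utterances = [
    "I want to book a flight",
    "I need a flight to Boston",
    "I'd like to reserve a hotel room",
    "Can you book me a room for tonight?",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model choice
embeddings = encoder.encode(utterances)

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embeddings)
for label, utterance in zip(labels, utterances):
    print(label, utterance)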

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.

Please cite the following papers if using the tasks, code, or data from this track in your work:

@misc{gung2023natcs,
      title={NatCS: Eliciting Natural Customer Support Dialogues}, 
      author={James Gung and Emily Moeng and Wesley Rose and Arshit Gupta and Yi Zhang and Saab Mansour},
      year={2023},
      eprint={2305.03007},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
@misc{gung2023intent,
      title={Intent Induction from Conversations for Task-Oriented Dialogue Track at DSTC 11}, 
      author={James Gung and Raphael Shu and Emily Moeng and Wesley Rose and Salvatore Romeo and Yassine Benajiba and Arshit Gupta and Saab Mansour and Yi Zhang},
      year={2023},
      eprint={2304.12982},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}


Issues

Development dataset: empty utterances

Some utterances in the development dataset contain no words. Examples of the utterance content: '', '.', '...'.

The proposal describes the datasets as "spoken conversations with manual transcriptions and annotations".

The questions are:

  1. What is the reason for including empty transcriptions?
  2. Can they affect turns with intents in the evaluation datasets?

Additional info:
The development dataset contains no turns that have an Intent or InformIntent label together with an empty utterance.
The total number of empty utterances is 142.
The full list of turns with empty utterances: empty_utterances.txt

Development dataset: Agent turns with intent labels

Some of the Agent's turns contain intent labels.

The main task of the track is to identify the Customer's intents. However, the development dataset contains intent labels for Agent turns as well.

The questions are:

  1. Should we use only the Customer's turns for clustering? Can Agent turns with intent labels affect the results on the evaluation datasets?
  2. Was the classifier for the InformIntent label trained on all turns with intent labels, or only on Customer intents?
  3. Was the classifier for the InformIntent label trained on data from all domains (development and evaluation), or only on the particular domain?

Additional info:
The total number of Agent turns with intent labels is 78, including 1 that also carries an InformIntent label.
The full list of Agent turns with intent labels: agent_intent.txt

How will the submissions be evaluated?

Hi. The proposal clarifies that ACC and NMI will be used to evaluate models for Task 1, and that F1 and coverage will be used to evaluate models for Task 2. However, in the released raw results, only ACC is used to rank the submissions. I want to confirm which metrics will ultimately be used for these two tasks.

Thanks!
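
As background on the metrics in question: NMI is available directly in scikit-learn, and clustering accuracy (ACC) is conventionally computed by aligning predicted clusters to reference labels with the Hungarian algorithm. A small sketch of that convention, not the track's official scorer:

import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import normalized_mutual_info_score

def clustering_accuracy(y_true, y_pred):
    # Count co-occurrences of predicted cluster p and gold label t.
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    n = max(y_true.max(), y_pred.max()) + 1
    counts = np.zeros((n, n), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        counts[p, t] += 1
    # Hungarian algorithm finds the best one-to-one cluster-to-label mapping.
    rows, cols = linear_sum_assignment(counts, maximize=True)
    return counts[rows, cols].sum() / len(y_true)

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [1, 1, 0, 0, 2, 2]  # a pure relabeling of y_true
print(normalized_mutual_info_score(y_true, y_pred))  # 1.0
print(clustering_accuracy(y_true, y_pred))           # 1.0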

About the clustering run condition in experiment.py

https://github.com/amazon-research/dstc11-track2-intent-induction/blob/0312cbcd666dc3efb20a32a0658ca5e723081b78/sitod/experiment.py#L133

I think the intention is to skip running clustering when prediction results already exist and you don't want to overwrite them. The conditions would be:

skip condition: path.exists() and not self._overwrite
run condition: not (path.exists() and not self._overwrite), i.e. not path.exists() or self._overwrite

So it seems the "and not" should be changed to "or".
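
For clarity, the intended logic as a small self-contained sketch (the function and argument names here are illustrative, not the repository's):

from pathlib import Path

def should_run_clustering(output_path: Path, overwrite: bool) -> bool:
    # Skip when predictions already exist and overwriting is disabled:
    #   skip = output_path.exists() and not overwrite
    # By De Morgan's law, the run condition is the negation of that:
    return not output_path.exists() or overwrite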

A list of intents

You said you would provide a list of intents. Where is it? I hope you can answer. Thank you!

Why do the test-utterances.jsonl files have only two sentences?

Hi, I'm participating in DSTC11 Track 2.

I ran some experiments and got strange scores like:

| RunID | NMI | ARI | ACC | Precision | Recall | F1 | Example Coverage | Reference K | K | # Intents | # Utterances | # Utterances per Intent |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| test_model | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 1.0 | 1.0 | 42.0 | 3684.0 | 87.7 |

So I checked the test-banking and test-finance datasets, but they contain only two sentences each:

{"utterance": "Dummy test utterance 1", "utterance_id": "finance_0000", "intent": "DummyLabel"}
{"utterance": "Dummy test utterance 2", "utterance_id": "finance_0001", "intent": "DummyLabel"}

Did I get the wrong datasets?

About ground-truth dialogue act labels

In the dev set, only predicted dialogue act labels are provided, so where can I find the ground-truth dialogue act labels? I think the utterances with intent labels include all of the intent-informing samples in the dev set; is that correct?

Thanks!

When will the test set be released?

Hello, excuse me: when will the test set be released, how will it be distributed, and how do I know if I've signed up successfully?

Thanks!

About submission

Is there a requirement for the submitted file name? There is also a field for an email address. Do we have to use a Google Mail address? (I don't mean the email address bound to the form.)

Is it right to use the word 'open' when explaining Task 1?

Hello, thank you for organizing this track! :)
I'm participating in Task 1 and have some questions.
In Task 1, there is no in-domain training data for the test set.
For this reason, I think Task 1 is also an 'open' intent clustering task.
Is it right to use the word 'open' to describe Task 1 in the workshop paper or other reports?

About the test set samples

I notice that in test-banking/dialogues.jsonl and test-finance/dialogues.jsonl, there are cases where "dialogue_acts" is ["InformIntent"] but the intents are [], and also cases where "dialogue_acts" is ["InformIntent"] and the intents are ["DummyLabel"]. Does this mean we only need to cluster the sentences with the ["DummyLabel"] intents?
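
For reference, the kind of filtering being asked about can be sketched as follows, assuming the dialogues.jsonl schema quoted above (the file path and the "turns" field name are assumptions):

import json

candidate_utterances = []
with open("test-banking/dialogues.jsonl") as f:  # path assumed
    for line in f:
        dialogue = json.loads(line)
        for turn in dialogue.get("turns", []):
            # Keep turns flagged with the InformIntent dialogue act.
            if "InformIntent" in turn.get("dialogue_acts", []):
                candidate_utterances.append(turn["utterance"])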

Results of the workshop paper

Hello, we saw "Camera-ready submission deadline: January 27th, 2023", but we have not received any news about the workshop. May I ask when you will announce it?

No Google account

Hello. It seems the submission is done via a Google Form, but I do not have a Google account and am having difficulty creating one. Are there other ways of submitting the files?

Submission code issue

Should we submit our complete code (how we trained the model, etc.) by this deadline ("Entry submission deadline: October 3rd, 2022"), or just submit one schema JSON file?
Thanks!

About dialogue acts in Task 2

In dstc11/task2-open-intent-induction.md, it is written that "Participants will be able to use the provided automatic dialog act classifier predictions as inputs to their system."
But there seems to be no related code.
What do we have to do to use that automatic dialogue act classifier?

Question about the test dataset in DSTC11

The test dataset contains only utterances labeled with InformIntent.
We wonder whether the input at evaluation time has the same format as the dev dataset on GitHub, e.g.:

{"intent_id": "intent_1", "utterances": ["I want to book a flight", "I need a flight", ...]}
{"intent_id": "intent_2", "utterances": ["I want to reserve a room", "I need a hotel room", ...]}
...

Thank you.
