
nlu-evaluation-corpora's Introduction

README

This project is a collection of three corpora which can be used for evaluating chatbots or other conversational interfaces. Two of the corpora were extracted from StackExchange, one from a Telegram chatbot.

If you use the data and publish, please let us know and cite our SIGdial 2017 paper:

@InProceedings{braun-EtAl:2017:SIGDIAL,
  author    = {Braun, Daniel  and  Hernandez-Mendez, Adrian  and  Matthes, Florian  and  Langen, Manfred},
  title     = {Evaluating Natural Language Understanding Services for Conversational Question Answering Systems},
  booktitle = {Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue},
  month     = {August},
  year      = {2017},
  address   = {Saarbrücken, Germany},
  publisher = {Association for Computational Linguistics},
  pages     = {174--185},
  url       = {http://www.aclweb.org/anthology/W17-3622}
}

Errata

There is an error in Table 5 of the paper. In the "true +" column, the overall sum should be 573, not 820, and accordingly precision, recall, and f-score are 0.92, 0.85, and 0.88.

[This error originated in the Excel evaluation sheet: the total number of "true +" (573) was stored as the number of "true +" for the chatbot corpus. Adding the results for the other corpora (77, 170) to it yields 820.]
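
The arithmetic behind the erratum can be checked with a short sketch. It only uses the figures quoted above. Two things are assumptions rather than statements from the paper: the implied chatbot count treats the corrected total as the sum of the three corpora, and the f-score is taken to be the harmonic mean (F1) of the rounded precision and recall.

```python
# Figures quoted in the erratum; nothing here is recomputed from the corpora.
correct_total = 573          # overall "true +" after the correction
other_corpora = [77, 170]    # "true +" of the two other corpora

# The sheet stored the overall total in the chatbot cell, so adding the
# other corpora on top of it produced the wrong sum.
erroneous_total = correct_total + sum(other_corpora)

# Implied chatbot count, ASSUMING the correct total is the sum of all three corpora.
implied_chatbot = correct_total - sum(other_corpora)


def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1, assumed here)."""
    return 2 * precision * recall / (precision + recall)


print(erroneous_total)                 # 820
print(implied_chatbot)                 # 326
print(round(f_score(0.92, 0.85), 2))   # 0.88
```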

License

All three corpora are released under the CC BY-SA 3.0 license.

Content

Ask Ubuntu Corpus

162 questions and answers from https://askubuntu.com.

Five intents (MakeUpdate, SetupPrinter, ShutdownComputer, SoftwareRecommendation, None) and three entity types (Printer, Software, Version).

Web Applications Corpus

89 questions and answers from https://webapps.stackexchange.com.

Eight intents (ChangePassword, DeleteAccount, DownloadVideo, ExportData, FilterSpam, FindAlternative, SyncAccounts, None) and three entity types (WebService, OS, Browser).

Chatbot Corpus

206 questions from a Telegram chatbot for public transport in Munich.

Two intents (Departure Time, Find Connection) and five entity types (StationStart, StationDest, Criterion, Vehicle, Line).

Evaluation Scripts

Python scripts for automated evaluation are provided here.

Contact Information

If you have any questions, please contact:

Daniel Braun (Technical University of Munich)

nlu-evaluation-corpora's Issues

"text": "Xubuntu 12.04 LTS?"

There's an error in line 909 of the Ask Ubuntu Corpus: the text of an intent's entity is "Xubuntu 12.04 LTS?" instead of "Xubuntu 12.04 LTS". The error is probably caused by the tokenization of "LTS?", as the value of the stop property looks OK.

Line 357: the "SoftwareType" intent's entity is "Project" instead of "Microsoft Project".

WebApplicationsCorpus, line 826: the web service is annotated as "Google" instead of "Google Apps". Note that "Google Apps" is annotated as a service in other examples. It may be related to a tokenization problem again ("...Google Apps?").
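
Annotations like the first one could be found automatically with a check along these lines. This is only a sketch: the function name is hypothetical, the sample sentence text is made up for illustration (only the entity text is quoted in this issue), and the field names "text" and "entities" are taken from the corpus excerpts shown elsewhere on this page. It cannot detect truncated entities such as "Project" or "Google".

```python
import string


def trailing_punctuation_entities(sentences):
    """Return (sentence text, entity text) pairs whose entity ends in punctuation."""
    flagged = []
    for sentence in sentences:
        for entity in sentence.get("entities", []):
            text = entity["text"]
            if text and text[-1] in string.punctuation:
                flagged.append((sentence["text"], text))
    return flagged


# Hypothetical sample entries, not copied from the corpus files.
sample = [
    {
        "text": "Should I upgrade to Xubuntu 12.04 LTS?",
        "entities": [{"text": "Xubuntu 12.04 LTS?", "entity": "UbuntuVersion"}],
    },
    {
        "text": "How to upgrade to 11.10",
        "entities": [{"text": "11.10", "entity": "UbuntuVersion"}],
    },
]

print(trailing_punctuation_entities(sample))
```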

Corpora in format for other systems

Hi, could you please provide the corpora in the formats you used to test the other systems? Depending on how the data is converted, the results may differ.

Data size for Web Applications Corpus

Hey, you mentioned that the Web Applications Corpus has 100 data sets, but only 89 (30 for training and 59 for testing) are included in your JSON file. Can you explain why there is a difference?
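
For reference, a split like the one quoted here can be tallied from the boolean "training" flag that each corpus entry carries. The sketch below is an assumption-laden illustration: the function name is hypothetical, the sample entries are made up, and how the list of entries is obtained from the JSON file is left open because the top-level layout of the file is not shown on this page.

```python
def count_split(sentences):
    """Return (training, test) counts based on each entry's "training" flag."""
    train = sum(1 for sentence in sentences if sentence["training"])
    return train, len(sentences) - train


# Hypothetical entries standing in for the corpus data.
sample = [
    {"text": "hypothetical question 1", "training": True},
    {"text": "hypothetical question 2", "training": False},
    {"text": "hypothetical question 3", "training": False},
]

train_count, test_count = count_split(sample)
print(train_count, test_count, train_count + test_count)
```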

thanks

greg

Possible wrong intent classification

There are some utterances that perhaps do not have the correct intent:

AskUbuntuCorpus

  • Utterance "Is there a Document scanning and archiving software?" is classified as "None", perhaps should be classified as "Software Recommendation"
  • Utterance "Is it recommended to upgrade to Lubuntu 15.04?" is classified as "Software Recommendation", perhaps should be classified as "Make Update"
  • Utterance "On really old Ubuntu 6.06 - How to upgrade" is classified as "Software Recommendation", perhaps should be classified as "Make Update"

WebApplicationsCorpus

  • The utterance "Good wireframing apps?" is classified as "None", but in my opinion a better classification would be "Find Alternative". Basically, if the utterance were "Good wireframing apps like balsamiq", it would clearly be a "Find Alternative". Omitting the entity should not change the intent.
  • The utterance "What are alternatives to Postini for spam filtering?" is classified as "Filter Spam", perhaps should be classified as "Find Alternative"

Unable to find the logic in the start and stop integers for entities

Below are two sentences from AskUbuntuCorpus.json. For the entity in the first sentence, start and stop have the same value and point to the same word. In the second sentence, start and stop should therefore also be equal for each entity, but they differ by two. One increment can be explained by the dot; the other cannot.

{
    "author": "Tom Brito",
    "url": "http://askubuntu.com/questions/102675/is-there-a-project-management-software-for-ubuntu-like-microsoft-project",
    "text": "Is there a project management software for Ubuntu like Microsoft Project?",
    "entities": [
        {
            "text": "Project",
            "entity": "SoftwareName",
            "stop": 10,
            "start": 10
        }
    ],
    "intent": "Software Recommendation",
    "answer": {
        "text": "<p>I can also suggest <a href=\"http://apt.ubuntu.com/p/planner\" rel=\"nofollow noreferrer\">planner</a> <a href=\"http://apt.ubuntu.com/p/planner\" rel=\"nofollow noreferrer\"><img src=\"https://hostmar.co/software-small\" alt=\"Install planner\"></a>.  It's available in Ubuntu software-center.</p>\n\n<p><a href=\"http://www.taskjuggler.org/\" rel=\"nofollow noreferrer\">TaskJuggler</a> is really powerful but also a bit harder to use and is not available in Software Center.</p>\n",
        "author": "roadmr"
    },
    "training": false
},
{
    "author": "C.S Oren",
    "url": "http://askubuntu.com/questions/95328/how-to-partially-upgrade-ubuntu-11-10-from-ubuntu-11-04",
    "text": "How to partially upgrade Ubuntu 11.10 from Ubuntu 11.04?",
    "entities": [
        {
            "text": "11.10",
            "entity": "UbuntuVersion",
            "stop": 7,
            "start": 5
        },
        {
            "text": "11.04",
            "entity": "UbuntuVersion",
            "stop": 12,
            "start": 10
        }
    ],
    "intent": "Make Update",
    "answer": {
        "text": "<p>Upgrade using alternate-ubuntu iso \nit is better for slower connections\nit may solve your problem</p>\n",
        "author": "Tachyons"
    },
    "training": false
},
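
One reading that is consistent with both entries above: start and stop are inclusive token indices, counted after a tokenizer that splits every punctuation mark off as its own token. Under that reading "11.10" becomes the three tokens "11", ".", "10", which accounts for both increments. The regular expression below is an assumption that happens to reproduce these examples, not the tokenizer the authors are known to have used, and `span_text` is a hypothetical helper.

```python
import re

# Assumed tokenizer: runs of word characters, or single punctuation marks.
TOKEN = re.compile(r"\w+|[^\w\s]")


def span_text(text, start, stop):
    """Return the substring covered by tokens start..stop (both inclusive)."""
    tokens = list(TOKEN.finditer(text))
    return text[tokens[start].start():tokens[stop].end()]


q1 = "Is there a project management software for Ubuntu like Microsoft Project?"
q2 = "How to partially upgrade Ubuntu 11.10 from Ubuntu 11.04?"

print(span_text(q1, 10, 10))   # Project
print(span_text(q2, 5, 7))     # 11.10
print(span_text(q2, 10, 12))   # 11.04
```

Slicing the original string by character offsets, rather than joining tokens, keeps the spacing of multi-word entities intact.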
