
open-eval's Introduction

OpenEval


About

Data scientists working on machine learning problems have historically had several issues relating to evaluating their systems: spending time individually developing evaluation frameworks for tasks, comparing results over time, and keeping evaluations consistent among teams. OpenEval is a system designed to address these problems.

In developing this system we set out to build a centralized, easy-to-use platform for groups to evaluate their models. All the user needs to do to evaluate their solver is host it on a thin server, which we provide. Then, on the web interface, they need to select their desired task and dataset to test their solver. After their solver finishes processing the dataset, the user can view the results.

Modules

The project contains two main modules.

  • The OpenEval core, which contains the main functionality and the web app.
  • A Learner, which acts as a toy system to be evaluated by the core.

Quick Guide on Running the Apps

You will need Java 8 in order to run the apps; the OpenJDK build on Ubuntu seems to have issues.

You will also need sbt.

First, run sbt from the parent directory.

  • projects will list the names of the existing modules.
    • project core will take you inside the core package.
    • project learner will take you inside the examples package.
  • Inside each project you can compile it or run it.

If you run the core, you can browse to localhost:9000. To run on a specific port, simply add the port number after run (e.g. run 8080). You should not have to restart it after code changes: just save your code and refresh the page.

Note: the OpenEval server needs to store its backend information in a SQL database. If you want to run it on your machine, first create a SQL database, rename core/conf/application.conf.sample to core/conf/application.conf, and add the DB information (URL, username, and password).
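Assuming the sample file follows Play Framework's standard JDBC configuration keys, the DB section of core/conf/application.conf would look roughly like the sketch below; the driver, URL, and credentials are placeholders for your own setup.

```
# Placeholder values -- replace with your own database's details
db.default.driver   = com.mysql.jdbc.Driver
db.default.url      = "jdbc:mysql://localhost/openeval"
db.default.username = "openeval"
db.default.password = "changeme"
```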

open-eval's People

Contributors

dhruvvajpeyi, dshine2, heglertissot, joshuacamp, paultgibbons, ryannk


open-eval's Issues

readme for core

I think we should have a README for the core that explains how to start using it (e.g. setting up the DB), as well as its internal structure: evaluators, DB connections, the Learner, etc.

change team name

change "team name" to "configuration name" on add configuration page

Fix TextAnnotation clone

    @Test
    public void cloneTest() {
        String[] views = new String[] {ViewNames.POS, ViewNames.SENTENCE};
        TextAnnotation ta = DummyTextAnnotationGenerator.generateAnnotatedTextAnnotation(views, false);
        Assert.assertTrue(ta.hasView(ViewNames.SENTENCE));
        Assert.assertTrue(ta.hasView(ViewNames.POS));
        TextAnnotation copy = null;
        try {
            copy = (TextAnnotation) ta.clone();
        } catch (CloneNotSupportedException te) {
            return;
        }

        copy.removeView(ViewNames.SENTENCE);
        Assert.assertFalse(copy.hasView(ViewNames.SENTENCE));
        Assert.assertTrue(ta.hasView(ViewNames.SENTENCE)); // FAILS ASSERTION
    }
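The failing assertion suggests that clone() shares the underlying views collection between the original and the copy, so removing a view from the copy also removes it from the original. A minimal, self-contained sketch of the fix pattern, using a toy stand-in class (the real TextAnnotation lives in cogcomp's core-utilities): a correct clone() must copy the views container itself.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for TextAnnotation; names here are illustrative, not the real API.
class Annotation implements Cloneable {
    private final Map<String, Object> views = new HashMap<>();

    void addView(String name, Object view) { views.put(name, view); }
    void removeView(String name) { views.remove(name); }
    boolean hasView(String name) { return views.containsKey(name); }

    @Override
    public Annotation clone() {
        Annotation copy = new Annotation();
        copy.views.putAll(this.views); // fresh map: views are no longer shared
        return copy;
    }
}

public class CloneDemo {
    public static void main(String[] args) {
        Annotation ta = new Annotation();
        ta.addView("SENTENCE", new Object());
        Annotation copy = ta.clone();
        copy.removeView("SENTENCE");
        System.out.println(ta.hasView("SENTENCE"));   // true: original unaffected
        System.out.println(copy.hasView("SENTENCE")); // false
    }
}
```

Note this copies the map, not the views themselves; if the views hold mutable state that must also be independent, a deeper copy of each view is needed.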

ordering configurations

Is there a way to sort the configurations shown on the landing page? Say, by the last time they were used or updated?

Initiate the core

Proposal:
Add a class (in app/controllers, with a package name like edu.illinois.cs.cogcomp.core) for the core.
The core is supposed to contain the interface of the solver and the evaluation system. The internal data structure of the system is supposed to be a [TextAnnotation](https://github.com/IllinoisCogComp/illinois-cogcomp-nlp/blob/master/core-utilities/src/main/java/edu/illinois/cs/cogcomp/core/datastructures/textannotation/TextAnnotation.java#L21). So we need to add core-utilities as a dependency ([example here](https://github.com/IllinoisCogComp/saul/blob/master/build.sbt#L14)).

The evaluation framework is supposed to send a TextAnnotation object to the solver, and will receive a modified TextAnnotation.
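A minimal sketch of that round trip, with a toy stand-in for TextAnnotation and an illustrative Solver interface (both are hypothetical names, not the actual OpenEval API):

```java
import java.util.ArrayList;
import java.util.List;

public class RoundTrip {
    // Toy stand-in: the real TextAnnotation comes from cogcomp's core-utilities.
    static class TextAnnotation {
        final String text;
        final List<String> views = new ArrayList<>();
        TextAnnotation(String text) { this.text = text; }
    }

    // Illustrative interface: the evaluator sends a TextAnnotation and
    // receives it back with the solver's view(s) added.
    interface Solver {
        TextAnnotation solve(TextAnnotation input);
    }

    public static void main(String[] args) {
        Solver posTagger = input -> {
            input.views.add("POS"); // a real solver would add a populated POS view
            return input;
        };
        TextAnnotation ta = new TextAnnotation("The cat sat.");
        System.out.println(posTagger.solve(ta).views); // [POS]
    }
}
```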

Define CHUNK size

Retrieving and cleansing the TextAnnotations is very computationally expensive, so I propose we define a variable CHUNK_SIZE in a config file specifying the number of TAs to send to the solver at a time.
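A hedged sketch of what that batching could look like; CHUNK_SIZE's value and the helper method are illustrative, not existing OpenEval code.

```java
import java.util.ArrayList;
import java.util.List;

public class Chunker {
    // Illustrative value; the proposal is to read this from a config file.
    static final int CHUNK_SIZE = 100;

    /** Split a list into sublists of at most `size` elements each. */
    static <T> List<List<T>> chunk(List<T> items, int size) {
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < items.size(); i += size) {
            chunks.add(items.subList(i, Math.min(i + size, items.size())));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> taIds = new ArrayList<>();
        for (int i = 0; i < 250; i++) taIds.add(i);
        // 250 TAs in chunks of 100 -> batches of 100, 100, and 50
        System.out.println(chunk(taIds, CHUNK_SIZE).size()); // 3
    }
}
```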

Storing / Reading datasets

@mssammon @christos-c @cogcomp-dev and the whole open-eval team!

We have discussed a couple of times (beyond open-eval) how to access datasets programmatically without adding them directly to our code repositories. This is something that could be very useful for any project, but it is essential for open-eval, since we want to load the data programmatically inside the evaluation system.

I want to bring AI2's datastore to your attention. This package saves all data files/folders on Amazon S3 (which is very cheap). I did a little experimentation with this package (a deploy is here).

Using it is very easy; for example, here I am uploading the dataset we use for training/testing POS tagging from my computer:

    Datastore.publishDirectory(
      "/home/daniel/ideaProjects/saul/data/POS",
      "edu.illinois.cs.cogcomp",
      "POS-tagging-data",
      1,
      false)

Later, to access the data (on any machine, without having it locally), I can do:

    val dataPath: java.nio.file.Path =
      Datastore.directoryPath(
        "edu.illinois.cs.cogcomp",
        "POS-tagging-data",
        1)

Running the above will download the data from S3 to a folder (in the home dir) and return its path (subsequent runs just read it from the cache). Building on top of this, I can evaluate the Saul POS tagger in a few simple steps:

  def testPOSTagger() = {
    val dataPath: java.nio.file.Path =
      Datastore.directoryPath(
        "edu.illinois.cs.cogcomp",
        "POS-tagging-data",
        1)

    /** Read your data from datastore. */
    lazy val testData = {
      val testDataReader = new PennTreebankPOSReader("testData")
      testDataReader.readFile(dataPath.toString + "/22-24.br")

      var sentenceId = 0
      testDataReader.getTextAnnotations.flatMap(p => {
        val cons = commonSensors.textAnnotationToTokens(p)
        sentenceId += 1
        //      Adding a dummy attribute so that hashCode is different for each constituent
        cons.foreach(c => c.addAttribute("SentenceId", sentenceId.toString))
        cons
      }).toList
    }

    /** Populate your data in the model */
    POSDataModel.tokens.populate(testData, train = false)

    /** Load the models for the POS classifier */
    POSClassifiers.loadModelsFromPackage()

    testPOSTagger(testData)
  }


  def testPOSTagger(testData: List[Constituent]): Unit = {
    val tester = new TestDiscrete
    val testReader = new LBJIteratorParserScala[Constituent](testData)
    testReader.reset()

    testReader.data.foreach(cons => {
      val gold = POSDataModel.POSLabel(cons)
      val predicted = POSClassifiers.POSClassifier(cons)
      tester.reportPrediction(predicted, gold)
    })

    tester.printPerformance(System.out)
  }

which gives me the following results:

...
[info] VBG         91.483  92.240  91.860   1933   1949
[info] VBN         86.041  90.395  88.164   2707   2844
[info] VBP         93.894  91.374  92.617   1565   1523
[info] VBZ         96.941  96.059  96.498   2639   2615
[info] WDT         97.967  90.753  94.222    584    541
[info] WP          98.596  99.293  98.944    283    285
[info] WP$        100.000 100.000 100.000     37     37
[info] WRB         99.671  99.671  99.671    304    304
[info] ``         100.000 100.000 100.000   1074   1074
[info] ------------------------------------------------
[info] Accuracy    96.439    -       -      -    129654

So what do you think about adopting this project for our work?

Side note: here are the same steps repeated in Java:

        java.nio.file.Path dataPath = Datastore$.MODULE$.directoryPath(
                "edu.illinois.cs.cogcomp", "POS-tagging-data", 1);

        PennTreebankPOSReader testDataReader = new PennTreebankPOSReader("testData");
        testDataReader.readFile(dataPath.toString() + "/22-24.br");

        java.util.List<Constituent> testData = testDataReader.getTextAnnotations()
                .get(0).getView(ViewNames.TOKENS).getConstituents();

        scala.collection.Iterable<Constituent> testDataInScalaCollection = scala.collection.JavaConversions.asScalaBuffer(testData);

        /* Populate your data in the model */
        POSDataModel.tokens().populate(testDataInScalaCollection, false);

        /* Load the models for the POS classifier */
        POSClassifiers.loadModelsFromPackage();

        /* Make prediction on the input instances */
        for(Constituent constituent : testData ) {
            String predicted = POSClassifiers.POSClassifier(constituent);
            System.out.println(constituent + "  ->  " +  predicted);
        }

Renaming module names

When we deploy the project as jar files, the name will be learner_xx.jar, which is not very intuitive.
Let's rename the module to something else, say openeval-learner?

Following that, for consistency, let's rename the others as well, say to openeval-core.

Also, let's make sure only the learner gets published, not the other modules.

Fixing Tasks and Task Variants

Task                       -> Task Variants
Part of Speech Tagging     -> Raw Text, Gold Token, Sentence Boundaries
Named Entity Recognition   -> Raw Text, Gold Token, Sentence Boundaries
Parsing                    -> Raw Text, Gold Token, Sentence Boundaries
Co-reference               -> Raw Text, Gold Token, Sentence Boundaries

Import POS tagging dataset

Since we are setting our learner to work based on POS tagging, let's import the POS tagging dataset into our DB so that we can run our system against this data.
Since the data is proprietary, I am not sharing it here; instead, I will share it personally.

Here is how to read the data via a reader inside CoreUtils:

    PennTreebankPOSReader testDataReader = new PennTreebankPOSReader("testData");
    testDataReader.readFile(dataPath.toString() + "/22-24.br");
