
ner's Introduction

Project: TestNER
Authors: Yana Ponomarova and Selsabil Gaied
Creation Date: 12/01/2016
Last updated: 18/01/2016 by Yana Ponomarova

Objective: To train a model based on the NER component of the Stanford NLP library to recognize supply chain entities in the Oil & Gas industry:
PROD - product (e.g., crude oil, gas, heavy crude, diesel),
TERM - terminal (e.g., Torshamnen terminal),
OFI - oil field (e.g., Western Siberian Oil Fields),
REF - refinery (e.g., Lisichansk oil refinery),
PIPE - pipeline (or other transportation mode) (e.g., Eilat Ashkelon Crude Oil Pipeline),
or O - other (everything else).

The trained model is then applied to identify these entities in unseen text.

Files:

1- main.ner.Train.scala: Trains the classifier via CrfTrainer.trainClassifier("FileName","SerializeTo")
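As a rough illustration of what a helper like CrfTrainer.trainClassifier might set up, the sketch below builds the properties a Stanford CRF trainer usually reads. The property names trainFile, serializeTo, map, useClassFeature and useWord are standard Stanford NER settings; the object name TrainSketch, the helper trainingProps and the choice of feature flags are assumptions made for this example, not the project's actual code.

```scala
import java.util.Properties

object TrainSketch {
  // Hypothetical helper: collects the settings needed to train a CRF model.
  // Only the property names come from Stanford NER; everything else is illustrative.
  def trainingProps(trainFile: String, serializeTo: String): Properties = {
    val props = new Properties()
    props.setProperty("trainFile", trainFile)       // annotated training data
    props.setProperty("serializeTo", serializeTo)   // where the trained model is saved
    props.setProperty("map", "word=0,answer=1")     // column 0 = token, column 1 = label
    props.setProperty("useClassFeature", "true")
    props.setProperty("useWord", "true")
    props
  }

  // With Stanford NER on the classpath, training would then look roughly like:
  //   val crf = new CRFClassifier[CoreLabel](trainingProps(in, out))
  //   crf.train()
  //   crf.serializeClassifier(out)
}
```

A real training run would normally enable more feature flags than the two shown here.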

2- main.ner.interactiveTraining.scala:
  An interactive app that presents the user with a document and allows them to annotate each token as a named entity
  type.  By convention, an annotation of "O" means the token is not a named entity.

  Documents are presented to the user line-by-line.  For convenience, hitting "ENTER" indicates the current token is
  not a named entity and is automatically annotated with an "O".  Also for convenience, if the user sees there are no
  named entities on the current line, they may type "next", in which case all remaining tokens on the current line are
  automatically annotated with "O" and the next line is presented.

  All annotations are written to file in a format that can be read by the Stanford NER tool for training.

  Input: (Required)

    args(0): Path and name of the file to write annotations to
    val annotationsWriteFile = args(0)
    args(1): Directory containing the documents to annotate
    val documentDirectory = args(1)
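The annotation rules described for interactiveTraining can be sketched as a pure function over the tokens of one line and the inputs typed by the user. This is a minimal sketch, not the project's implementation: the names AnnotationSketch, annotateLine and toTsv are hypothetical, and the output layout is assumed to be the usual two-column, tab-separated format (token, then label) that Stanford NER reads with map = word=0,answer=1.

```scala
object AnnotationSketch {
  // Resolves user inputs into labels for the tokens of one line.
  // Empty input (ENTER) means "O"; "next" labels the current and all
  // remaining tokens on the line with "O".
  def annotateLine(tokens: List[String], inputs: List[String]): List[(String, String)] = {
    @annotation.tailrec
    def loop(ts: List[String], in: List[String],
             acc: List[(String, String)]): List[(String, String)] =
      ts match {
        case Nil => acc.reverse
        case t :: rest =>
          in match {
            case "next" :: _   => acc.reverse ++ ts.map(_ -> "O")
            case "" :: more    => loop(rest, more, (t -> "O") :: acc)
            case label :: more => loop(rest, more, (t -> label) :: acc)
            // Assumption: if the input runs out, the rest of the line is "O".
            case Nil           => acc.reverse ++ ts.map(_ -> "O")
          }
      }
    loop(tokens, inputs, Nil)
  }

  // One token per row, separated from its label by a tab.
  def toTsv(annotated: List[(String, String)]): String =
    annotated.map { case (tok, label) => s"$tok\t$label" }.mkString("\n")
}
```

For example, annotating the tokens "Lisichansk oil refinery processes crude" with the inputs REF, REF, REF, next labels the first three tokens REF and the last two O.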

3- main.ner.NERDemo.scala: Stanford NER Demo file rewritten in Scala

    Use:
    - With no arguments, the classifier defaults to classifiers/english.muc.7class.distsim.crf.ser.gz
    and some hardcoded sample text is used.
    - With one argument: args(0): classifier (applied to the hardcoded sample text)
    - With two arguments: args(0): classifier
                          args(1): textFile
        E.g., classifiers/CrudeSupply2.classifier.ser.gz src/main/ressources/SupplyExample.txt
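The argument handling described for NERDemo could be expressed with a pattern match on the argument array. The sketch below is an assumption about how such defaults might be resolved; the names DemoArgs and resolve are hypothetical, and a result of None for the text file stands for "use the hardcoded sample text".

```scala
object DemoArgs {
  val DefaultClassifier = "classifiers/english.muc.7class.distsim.crf.ser.gz"

  // Returns (classifier path, optional text file).
  // None as the second element means the hardcoded sample text is used.
  def resolve(args: Array[String]): (String, Option[String]) = args match {
    case Array()                         => (DefaultClassifier, None)
    case Array(classifier)               => (classifier, None)
    case Array(classifier, textFile, _*) => (classifier, Some(textFile))
  }
}
```

Any arguments beyond the first two are ignored in this sketch.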

4- main.ner.NERMultiClassifier.scala: Same as main.ner.NERDemo.scala, except that it combines the results of the Oil & Gas classifier with those of the Stanford classifier.
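One simple way to combine two classifiers is to merge their per-token labels. The README does not say which strategy NERMultiClassifier uses, so the rule below is purely an assumption for illustration: the domain-specific label wins whenever it is not "O", and the general Stanford label is used otherwise. The names CombineSketch and combine are hypothetical.

```scala
object CombineSketch {
  // Merges per-token labels produced by two classifiers on the same tokens.
  // Assumed rule: keep the domain label unless it is "O", else fall back
  // to the general-purpose label.
  def combine(domain: Seq[String], general: Seq[String]): Seq[String] = {
    require(domain.length == general.length,
      "label sequences must align token by token")
    domain.zip(general).map { case (d, g) => if (d != "O") d else g }
  }
}
```

With this rule, a token tagged PROD by the domain model keeps that tag, while a token the domain model leaves as O can still receive a tag such as LOCATION from the general model.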

5- run.sh (to be completed)

