lightner's Introduction

LightNER

Check Our New NER Toolkit🚀🚀🚀

  • Inference:
    • LightNER: efficient inference with models pre-trained or trained with any of the following tools.
  • Training:
    • LD-Net: train NER models with efficient contextualized representations.
    • VanillaNER: train vanilla NER models with pre-trained embeddings.
  • Distant Training:
    • AutoNER: train NER models without line-by-line annotations and get competitive performance.

This package supports inference with models pre-trained by:

  • Vanilla_NER: vanilla sequence labeling models.
  • LD-Net: sequence labeling models with efficient contextualized representations.
  • AutoNER: distantly supervised named entity recognition models (no line-by-line annotations for training).

We are in an early-release beta. Expect some adventures and rough edges.

Installation

To install from PyPI:

pip install lightner

To build from source:

pip install git+https://github.com/LiyuanLucasLiu/LightNER

or

git clone https://github.com/LiyuanLucasLiu/LightNER.git
cd LightNER
python setup.py install

Usage

Pre-trained Models

Model               | Task                                                                      | Performance
LD-Net pner1.th     | NER for (PER, LOC, ORG & MISC)                                            | F1 92.21
LD-Net pnp0.th      | Chunking                                                                  | F1 95.79
Vanilla_NER         | NER for (PER, LOC, ORG & MISC)                                            |
Vanilla_NER         | Chunking                                                                  |
AutoNER autoner0.th | Distant NER trained without line-by-line annotations (Disease, Chemical) | F1 85.30

Decode API

The decode API can be called as follows:

from lightner import decoder_wrapper
model = decoder_wrapper()
model.decode(["Ronaldo", "won", "'t", "score", "more", "than", "30", "goals", "for", "Juve", "."])

The decode() method can also decode at the document level (taking a list of lists of str as input) or at the corpus level (taking a list of lists of lists of str as input).

The decoder_wrapper function can be customized by choosing a different pre-trained model or by passing an additional configs file:
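To make the three input shapes concrete, here is a minimal sketch in plain Python. The helper `input_level` is hypothetical and not part of lightner; it only illustrates how the nesting depth distinguishes a sentence, a document, and a corpus. The second and third example sentences are made up for illustration.

```python
def input_level(tokens):
    """Classify an input by nesting depth (illustrative helper, not a lightner API)."""
    depth = 0
    current = tokens
    # Walk down the first element of each nested list until a non-list is reached.
    while isinstance(current, list):
        depth += 1
        if not current:
            break
        current = current[0]
    names = {1: "sentence", 2: "document", 3: "corpus"}
    if depth not in names or not isinstance(current, str):
        raise ValueError("expected a list of str nested one to three levels deep")
    return names[depth]


# Sentence level: list of str.
sentence = ["Ronaldo", "won", "'t", "score", "more", "than", "30", "goals", "for", "Juve", "."]
# Document level: list of list of str.
document = [sentence, ["He", "plays", "for", "Juve", "."]]
# Corpus level: list of list of list of str.
corpus = [document, [["Another", "document", "."]]]

print(input_level(sentence), input_level(document), input_level(corpus))
```

With lightner installed, each of `sentence`, `document`, and `corpus` could then be passed to `model.decode(...)` as described above.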

model = decoder_wrapper(URL_OR_PATH_TO_CHECKPOINT, configs)

You can list the config options with:

lightner decode -h

Console

After installing the package and downloading the pre-trained models, run inference with:

lightner decode -m MODEL_FILE -i INPUT_FILE -o OUTPUT_FILE

You can find more options by:

lightner decode -h
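If you want to drive the console command from a Python script, a minimal sketch could look like the following. It uses only the `-m`, `-i`, and `-o` flags shown above; the helper names `build_decode_command` and `run_decode` and the file names are hypothetical, and the sketch assumes the `lightner` executable is on your PATH.

```python
import subprocess


def build_decode_command(model_file, input_file, output_file):
    """Assemble the argument list for `lightner decode` (hypothetical helper)."""
    return [
        "lightner", "decode",
        "-m", str(model_file),
        "-i", str(input_file),
        "-o", str(output_file),
    ]


def run_decode(model_file, input_file, output_file):
    """Run the command; raises CalledProcessError on a non-zero exit status."""
    cmd = build_decode_command(model_file, input_file, output_file)
    subprocess.run(cmd, check=True)


# Placeholder file names, for illustration only.
print(" ".join(build_decode_command("pner1.th", "input.txt", "output.txt")))
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.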

The currently accepted input format is shown below (one token per line; -DOCSTART- is optional):

-DOCSTART-

Ronaldo
won
't
score
more
than
30
goals
for
Juve
.

The output would be:

<PER> Ronaldo </PER> won 't score more than 30 goals for <ORG> Juve </ORG> . 
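The input and output formats above can be handled with a few lines of plain Python. The sketch below is not part of lightner: `format_input` and `parse_tagged_output` are hypothetical helpers. It assumes that sentences are separated by a blank line (the README only states that tokens are separated by line breaks) and that entity tags are never nested in the output.

```python
import re

# Matches "<LABEL> some tokens </LABEL>" with the same label on both sides.
TAG_PATTERN = re.compile(r"<(?P<label>[^/<>\s]+)>\s+(?P<text>.*?)\s+</(?P=label)>")


def format_input(sentences, docstart=True):
    """Render tokenized sentences as one token per line (hypothetical helper)."""
    lines = []
    if docstart:
        lines += ["-DOCSTART-", ""]
    for sentence in sentences:
        lines.extend(sentence)
        lines.append("")  # assumption: a blank line ends each sentence
    return "\n".join(lines)


def parse_tagged_output(line):
    """Extract (entity text, label) pairs from one line of tagged output."""
    return [(m.group("text"), m.group("label")) for m in TAG_PATTERN.finditer(line)]


tokens = ["Ronaldo", "won", "'t", "score", "more", "than", "30", "goals", "for", "Juve", "."]
print(format_input([tokens]))

tagged = "<PER> Ronaldo </PER> won 't score more than 30 goals for <ORG> Juve </ORG> . "
print(parse_tagged_output(tagged))
```

The parser returns the tagged spans in the order they appear, which is usually enough for building a quick entity list from an output file.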

lightner's People

Contributors

liyuanlucasliu, shangjingbo1226

lightner's Issues

Language support?

Do LightNER and AutoNER support other languages, such as Chinese, French, or Spanish?

Thanks!
