
lapos's Introduction

About

This is an unofficial fork of the Lapos tagger, based on version 0.1.2. The official source is available here.

The goal of this fork is to add Unicode support for use in the Classical Language Toolkit (CLTK). Once Unicode support is working, the CLTK hopes that these changes will be merged upstream.

Build

There are two branches: master, for Linux, and apple, for Mac OS (some changes were made for Clang, see below).

Use

For full instructions, see README. The CLTK's Latin model (based on Perseus treebanks) was made with the following command:

$ ./lapos-learn -m ./model latin_training_set.pos

Note: You can download this training set with curl -O https://raw.githubusercontent.com/cltk/latin_treebank_perseus/master/latin_training_set.pos.

To tag text, use echo to pipe one sentence at a time to lapos:

$ echo "He opened the window." | ./lapos -t -m ./model_wsj02-21
He/PRP opened/VBD the/DT window/NN ./.

Changes

To compile with Clang, a few changes were needed, namely removing tr1 from the unordered_map header and namespace (e.g., <tr1/unordered_map> becomes <unordered_map>, and std::tr1::unordered_map becomes std::unordered_map).
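As a rough illustration of that tr1 removal, the sketch below shows the pattern on a small, self-contained function. The helper count_tags is hypothetical and is not part of Lapos; only the header and namespace change comes from the description above. It assumes a compiler in C++11 mode or later (e.g., -std=c++11).

```cpp
#include <string>
#include <unordered_map>   // previously: #include <tr1/unordered_map>
#include <vector>

// Hypothetical helper for illustration only; not taken from the Lapos source.
// Counts how often each tag occurs in a list of tags.
// Previously the map type would have been spelled
// std::tr1::unordered_map<std::string, int>.
std::unordered_map<std::string, int> count_tags(const std::vector<std::string>& tags) {
    std::unordered_map<std::string, int> counts;
    for (const std::string& tag : tags) {
        ++counts[tag];  // operator[] value-initializes missing keys to 0
    }
    return counts;
}
```

The interface used here (operator[], size) is the same under both spellings, so call sites of this kind should not need further changes.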

We also increased the maximum number of tags from 50 to 2000 (in crf.h, by commenting out enum { MAX_LABEL_TYPES = 50 }; and uncommenting const static int MAX_LABEL_TYPES = 2000;). In addition, we removed the unnecessary empty-input-line warning in crf.cpp ("warning: empty sentence").
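The sketch below illustrates the two spellings of the tag limit. Only the name MAX_LABEL_TYPES and the values 50 and 2000 come from the text above. The struct LabelTableSketch, its score array, and the fits helper are hypothetical, and the assumption that the constant sizes a fixed array is ours, not something confirmed from crf.h.

```cpp
#include <cstddef>

// Hypothetical stand-in for a class in crf.h; names other than
// MAX_LABEL_TYPES are invented for illustration.
struct LabelTableSketch {
    // Original form, now commented out:
    // enum { MAX_LABEL_TYPES = 50 };
    const static int MAX_LABEL_TYPES = 2000;

    // Assumption: the constant is used as a compile-time array bound.
    double score[MAX_LABEL_TYPES];

    // Returns true if a tag set of this size fits within the limit.
    bool fits(std::size_t num_labels) const {
        return num_labels <= static_cast<std::size_t>(MAX_LABEL_TYPES);
    }
};
```

Both forms yield a compile-time constant usable as an array bound. One difference worth knowing: before C++17, a const static int member that is bound to a reference (for example when passed to std::min) also needs an out-of-class definition, whereas an enum constant does not.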

License

Lapos was created by Yoshimasa Tsuruoka, Yusuke Miyao, and Jun'ichi Kazama. For technical details, see README; for the license, see LICENSE.
