Texty

Texty is a collection of simple script frontends to NLP toolkits.

In the early stages of text analysis, you often want to run some quick-and-dirty experiments, trying various methods to see what works for your particular application. Many excellent open source NLP tools exist to help with this. However, each tool has its own peculiarities, requirements, and data formats, and just getting your data into the right format can mean a long dig through documentation and Java APIs.

This is where Texty comes in. Texty is a collection of scripts intended to provide a common command-line frontend for NLP tasks: POS tagging, classification, topic modeling, etc. Under the hood, it relies on a number of existing open source tools for document conversion and algorithm implementations. The project grew out of my frustration with having to drive all these tools in a nonuniform manner.

What Texty is not:

  1. It does not contain any new NLP algorithms. Texty relies on underlying libraries to perform NLP tasks.
  2. It probably won't give you the state-of-the-art answer. Texty is intended for rapid prototyping purposes.

Usage

Run the texty script to see available commands.

Currently integrated:

  1. Document categorization using OpenNLP: texty cat, texty cat-train
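The subcommand layout listed above (texty cat, texty cat-train) can be pictured as a thin dispatcher that maps each command to one backend call. The following is only a minimal sketch of that idea, written in Python for illustration; it is not Texty's actual code. The argument names (training_data, model, documents) and the dispatch function are hypothetical, and the real scripts may expect different arguments.

```python
import argparse


def build_parser():
    """Build a parser with one subcommand per task (command names from the list above)."""
    parser = argparse.ArgumentParser(prog="texty")
    sub = parser.add_subparsers(dest="command")

    # Hypothetical arguments: the real texty scripts may take different ones.
    train = sub.add_parser("cat-train", help="train a document categorizer")
    train.add_argument("training_data")
    train.add_argument("model")

    cat = sub.add_parser("cat", help="categorize documents with a trained model")
    cat.add_argument("model")
    cat.add_argument("documents", nargs="+")
    return parser


def dispatch(argv):
    """Return a (command, options) pair; a real frontend would invoke the backend tool here."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.command is None:
        # No subcommand given: return the help text listing the available commands.
        return ("help", {"usage": parser.format_help()})
    options = {key: value for key, value in vars(args).items() if key != "command"}
    return (args.command, options)


if __name__ == "__main__":
    import sys

    command, options = dispatch(sys.argv[1:])
    print(options["usage"] if command == "help" else (command, options))
```

Keeping the dispatcher separate from the backend calls means a new task only needs a new subparser and a new backend wrapper, which is the kind of uniformity the project is aiming for.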

Requirements

If you're on Windows, you'll need Cygwin. Otherwise, everything should be bundled with Texty.

TODO

  • Integrate more functionality
  • Test on Linux

Contributors

mikelieberman