neural_BOW_toolkit

This code implements the neural bag-of-words models proposed in the COLING 2016 paper [Weighted Neural Bag-of-n-grams Model: New Baselines for Text Classification](http://www.aclweb.org/anthology/C/C16/C16-1150.pdf). The toolkit achieves state-of-the-art results on a range of text classification and sentiment analysis tasks. We think it is a good starting point for NLP beginners. We also recommend it for real-world tasks, since it is much more efficient than complex deep neural models.

project

The project includes three components: (1) datasets: nine binary classification datasets; (2) liblinear: logistic regression; (3) src: the entire source code.

datasets

IMDB RT2k RTs subj AthR BbCrypt XGraph MPQA CR

The datasets can be downloaded at datasets. Put the downloaded file in the neural_BOW_toolkit/ directory.

liblinear

liblinear provides the logistic regression classifier.
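liblinear's command-line tools read training and test data in the LIBSVM sparse text format: one instance per line, written as `<label> <index>:<value> ...`, with 1-based feature indices in ascending order. As a minimal sketch of what that looks like for a dense feature vector, the following formats one instance as such a line. The class name `LiblinearFormat` and the method `toLine` are hypothetical and are not taken from this toolkit's source.

```java
import java.util.Locale;

public class LiblinearFormat {
    // Formats one instance as a line in the LIBSVM/liblinear sparse text format:
    // "<label> <index>:<value> ...", with 1-based, ascending feature indices.
    // Zero-valued features are skipped, which the sparse format permits.
    // NOTE: this helper is an illustrative assumption, not part of the toolkit.
    public static String toLine(int label, double[] features) {
        StringBuilder sb = new StringBuilder();
        sb.append(label);
        for (int i = 0; i < features.length; i++) {
            if (features[i] == 0.0) {
                continue;
            }
            sb.append(' ')
              .append(i + 1)
              .append(':')
              .append(String.format(Locale.ROOT, "%.6f", features[i]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        double[] embedding = {0.5, 0.0, -1.25};
        // Prints: 1 1:0.500000 3:-1.250000
        System.out.println(toLine(1, embedding));
    }
}
```

A file of such lines can be passed to liblinear's `train` program, where `-s 0` selects L2-regularized logistic regression. Which solver options this toolkit actually passes is not stated here.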

source code

The source code consists of three parts. (1) The dataset classes: each .java file corresponds to one dataset and contains the dataset's metadata and file-loading operations. (2) The model part, which contains the entire training process. (3) The utility part: the Sample class encapsulates the samples (or instances) in the datasets. The LightSample class is a simplified version of Sample that stores only the information used during training. The Classifier class calls the liblinear tool.
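To illustrate the split between a full sample record and a training-only view, here is a minimal sketch. Every field name below (`rawText`, `ngramIds`, `label`, `isTrain`) and the `from` helper are assumptions made for illustration; the toolkit's actual Sample and LightSample classes may store different information.

```java
import java.util.Arrays;

public class SampleSketch {
    // Full record for one instance. All fields are hypothetical.
    static final class Sample {
        final String rawText;   // original text of the instance
        final int[] ngramIds;   // vocabulary ids of the n-grams in the text
        final int label;        // class label (binary: 0 or 1)
        final boolean isTrain;  // whether the instance is in the training split

        Sample(String rawText, int[] ngramIds, int label, boolean isTrain) {
            this.rawText = rawText;
            this.ngramIds = ngramIds;
            this.label = label;
            this.isTrain = isTrain;
        }
    }

    // Slimmed-down view that keeps only what a training loop would read.
    static final class LightSample {
        final int[] ngramIds;
        final int label;

        LightSample(int[] ngramIds, int label) {
            this.ngramIds = ngramIds;
            this.label = label;
        }

        // Copies the training-relevant fields and drops the rest.
        static LightSample from(Sample s) {
            return new LightSample(s.ngramIds.clone(), s.label);
        }
    }

    public static void main(String[] args) {
        Sample full = new Sample("a good movie", new int[]{3, 7, 12}, 1, true);
        LightSample light = LightSample.from(full);
        System.out.println(Arrays.toString(light.ngramIds) + " -> " + light.label);
    }
}
```

Keeping the training loop on the lighter type means the raw text does not need to stay in memory once the n-gram ids have been extracted.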

Acknowledgements

Three projects helped us a lot. The first is the work of Mesnil et al. at the ICLR 2015 workshop, who implement the Paragraph Vector model in C at https://github.com/mesnilgr/iclr15. The second is the work of Li et al. at the ICLR 2016 workshop, who implement the n-gram Paragraph Vector model in Java at https://github.com/libofang/DV-ngram. The last is the work of Wang and Manning, who propose NBSVM and publish their code at https://github.com/sidaw/nbsvm. In fact, our models are the neural counterparts of NBSVM, and we use exactly the same datasets as NBSVM.
