
nlp-sentiment-analysis's Introduction

Aspect-based Sentiment Analysis

Requirements

The sentiment analysis software is written in Python and supports Python 3.7 and above. In addition, the data and models folders must contain certain files; see the README files in the respective folders for details.

To install the required dependencies, run pip install -r requirements.txt.

Running with Docker

This project is also available as a Docker container. We recommend running it with Docker and docker-compose. The Usage chapter gives the commands for running each phase inside Docker containers, alongside the regular commands for running the project locally. All steps use the same docker-compose volumes, so the intermediate data persists as long as you do not delete these volumes (e.g. by running docker-compose down).

The Docker container contains all data, models and code required to run this project. Its size is around 3.8 GB.

Usage

1. Preprocessing & Feature Extraction

Before any sentiment prediction can be done on the SentiCoref corpus, the data has to be preprocessed. This can be done by running

python -m nlp_code.preprocessing

# Docker
docker-compose run preprocessing

from the root folder of the repository. This performs lemmatisation and POS tagging on the corpus using the Stanza library. The results of this step are saved to data/cache.

Next, feature extraction is performed using FeaturePipeline with a set of feature extractors, which are defined in the module nlp_code.features. Features extracted from the preprocessed text are saved in data/features. These TSV files contain one row of extracted features for each word in a coreference chain.
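
As a rough illustration of the "one row per word in a coreference chain" layout, the sketch below reads such a TSV and groups its rows by chain. The column names (chain_id, word, lemma, pos) and the sample rows are assumptions made for this example; the actual files in data/features may use different columns.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample; the real column names in data/features may differ.
SAMPLE_TSV = (
    "chain_id\tword\tlemma\tpos\n"
    "1\tVlada\tvlada\tNOUN\n"
    "1\tje\tbiti\tAUX\n"
    "2\tPodjetje\tpodjetje\tNOUN\n"
)


def rows_by_chain(tsv_file):
    """Group feature rows by the (assumed) chain_id column."""
    chains = defaultdict(list)
    for row in csv.DictReader(tsv_file, delimiter="\t"):
        chains[row["chain_id"]].append(row)
    return dict(chains)


# In practice, tsv_file would be an open file from data/features.
chains = rows_by_chain(io.StringIO(SAMPLE_TSV))
for chain_id, rows in chains.items():
    print(chain_id, [r["lemma"] for r in rows])
```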

We recommend only ever adding feature extractors to the FeaturePipeline. Removing one might break previously implemented models that depend on the features it produced.
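
The reason for this add-only recommendation can be shown with a small stand-in. SketchFeaturePipeline and both extractor functions below are hypothetical and are not the project's FeaturePipeline API; the sketch only illustrates that appending an extractor keeps every previously available feature, which is what existing models rely on.

```python
from typing import Callable, Dict, List

# An extractor maps one preprocessed token to a dict of named features.
Extractor = Callable[[Dict[str, str]], Dict[str, object]]


class SketchFeaturePipeline:
    """Hypothetical stand-in for FeaturePipeline, for illustration only."""

    def __init__(self) -> None:
        self._extractors: List[Extractor] = []

    def add(self, extractor: Extractor) -> "SketchFeaturePipeline":
        self._extractors.append(extractor)
        return self

    def run(self, token: Dict[str, str]) -> Dict[str, object]:
        # Merge the output of every registered extractor into one feature row.
        features: Dict[str, object] = {}
        for extractor in self._extractors:
            features.update(extractor(token))
        return features


def pos_feature(token: Dict[str, str]) -> Dict[str, object]:
    return {"pos": token["pos"]}


def lemma_length(token: Dict[str, str]) -> Dict[str, object]:
    return {"lemma_len": len(token["lemma"])}


token = {"lemma": "vlada", "pos": "NOUN"}
pipeline = SketchFeaturePipeline().add(pos_feature)
before = pipeline.run(token)

pipeline.add(lemma_length)
after = pipeline.run(token)

# Every feature that existed before the addition is still present afterwards.
assert set(before) <= set(after)
```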

2. Model Evaluation

Classical models

After preprocessing is completed, model evaluation can be started by running

python -m nlp_code.models

# Docker
docker-compose run models

from the root folder of the repository. Models can use any of the features stored in data/features.

Neural models

To run the CustomSentiCorefModel, you can run

python -m nlp_code.models_neural

# Docker
docker-compose run models_neural

from the root folder of the repository. This performs the required preprocessing, training and evaluation of this model. Make sure you have downloaded the required BERT model, described here.

To run BertEmbeddingsSentiCoref (this takes quite some time), you can run

python -m nlp_code.bert_embeddings

# Docker
docker-compose run bert_embeddings

To run the pretrained model with a balanced training set, either run the Docker command (see the instructions for CustomSentiCorefModel) or download the trained model from here, unzip it into models/ and run

python -m nlp_code.pretrained_bert_embeddings_balanced

# Docker
docker-compose run pretrained_bert_embeddings_balanced
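
If you prefer to script the unzip step for the downloaded model, a minimal sketch using only the standard library could look like the following. The archive name example_model.zip and its contents are placeholders invented for the demonstration; substitute the path of the archive you actually downloaded and the repository's models/ folder.

```python
import tempfile
import zipfile
from pathlib import Path


def unzip_model(archive_path, models_dir):
    """Extract a zip archive into models_dir and list the extracted files."""
    models_dir = Path(models_dir)
    models_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(models_dir)
    return sorted(
        p.relative_to(models_dir).as_posix()
        for p in models_dir.rglob("*")
        if p.is_file()
    )


# Demonstration with a throwaway archive (placeholder name and contents).
with tempfile.TemporaryDirectory() as tmp:
    archive_path = Path(tmp) / "example_model.zip"
    with zipfile.ZipFile(archive_path, "w") as archive:
        archive.writestr("example_model/config.json", "{}")
    extracted = unzip_model(archive_path, Path(tmp) / "models")
    print(extracted)
```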
