cnntweets

Deep Ensemble for Sentiment Analysis

Installation

  • clone the repository
git clone git@github.com:bgshin/cnntweets.git
  • create a virtual environment (mkvirtualenv is provided by virtualenvwrapper)
mkvirtualenv sent
  • install dependencies (Python 2.7)
pip install -r requirements.txt
  • requirements
    • boto==2.40.0
    • bz2file==0.98
    • gensim==0.12.4
    • numpy==1.11.0
    • protobuf==3.0.0b2
    • requests==2.10.0
    • scipy==0.17.1
    • six==1.10.0
    • smart-open==1.3.3
    • tensorflow==0.8.0

Usage

  • WITHOUT pre-trained w2v

    • Train

       cd cnn
       nohup python cnn_train.py > out.txt &
    • Test

      • In cnn/cnn_test.py, set savepath to the saved model to evaluate (model-xxxx is a placeholder)

         savepath = 'model_path/model-xxxx'
      • Run test script

         cd cnn
         python cnn_test.py
  • WITH pre-trained w2v

    • Download the compressed pre-trained w2v file (see Reference) and extract it to obtain the bin file

    • In w2v_cnn/cnn_train.py, set model_path to the extracted bin file

       model_path = 'path_to_w2v_bin/word2vec_twitter_model.bin'
    • Train

       cd w2v_cnn
       nohup python cnn_train.py > out.txt &
    • Test

      • In w2v_cnn/cnn_test.py, set savepath to the saved model to evaluate (model-xxxx is a placeholder)

         savepath = 'model_path/model-xxxx'
      • Run test script

         cd w2v_cnn
         python cnn_test.py

Dataset

Semeval 2016

  • Dev (semeval16_T4A_devtest_npo)

    • number of examples: 1588
  • Test (semeval16_T4A_test_npo)

    • number of examples: 20632
  • Train (semeval13_T2B_16T4A_train_dev_npo)

    • number of examples: 15385
  • Data files

    • semeval13_T2B_16T4A_train_dev_devtest_npo = 1588 + 15385 = 16973
    • semeval16_T4A_devtest_npo = 1588
    • semeval13_T2B_16T4A_train_dev_npo = 15385
    • semeval16_T4A_dev_npo = 1595
    • semeval16_T4A_test_npo = 20632
    • semeval16_T4A_train_npo = 4796
  • Format of data (TAB separated)

    no sentiment sentences
    1 objective I may be the ...
    2 positive TGIF folks! ...
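
A minimal sketch of reading one of these files. It assumes each line holds exactly the three TAB-separated columns shown above, that the files are UTF-8 encoded, and that there is no header row. The helper names parse_line and read_dataset are hypothetical and are not part of the repository.

```python
import io


def parse_line(line):
    """Split one TAB-separated line into (number, sentiment, sentence)."""
    # Split at most twice so a stray TAB inside the sentence is preserved.
    no, sentiment, sentence = line.rstrip("\r\n").split("\t", 2)
    return int(no), sentiment, sentence


def read_dataset(path):
    """Read a whole data file into a list of tuples, skipping blank lines."""
    # Assumption: UTF-8 encoding and no header row.
    with io.open(path, encoding="utf-8") as f:
        return [parse_line(line) for line in f if line.strip()]
```

For example, read_dataset could be called with the path to semeval16_T4A_devtest_npo; if the columns are laid out as assumed, the returned list would have one tuple per example.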

Preprocessing

  • Label definition (one-hot vector, class index)
    • 'objective': [0, 1, 0], 1
    • 'positive': [0, 0, 1], 2
    • 'negative': [1, 0, 0], 0
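
The label table above can be sketched as a small lookup. This is an illustration only: the names LABELS, encode_label and decode_index are hypothetical, and the repository's preprocessing code may organize the mapping differently.

```python
# Label table copied from the list above: name -> (one-hot vector, class index).
LABELS = {
    "negative": ([1, 0, 0], 0),
    "objective": ([0, 1, 0], 1),
    "positive": ([0, 0, 1], 2),
}


def encode_label(sentiment):
    """Return (one-hot vector, class index) for a sentiment string."""
    one_hot, index = LABELS[sentiment]
    return list(one_hot), index  # copy so callers cannot mutate the table


def decode_index(index):
    """Map a class index (e.g. the argmax of a prediction) back to its name."""
    for name, (_, i) in LABELS.items():
        if i == index:
            return name
    raise ValueError("unknown class index: %r" % (index,))
```

In each row the position of the 1 in the one-hot vector equals the class index, so either representation can be recovered from the other.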

Reference

Pre-trained word2vec model (word2vec_twitter_model.bin) by Fréderic Godin
