Giter VIP home page Giter VIP logo

sentimentanalysis-chinese's Introduction

基于中文词典的中文情感分析,主要针对电商网站的商品评论:
1.词典来源:ntusd情感词典+HowNet情感词典 http://www.datatang.com/data/44317 http://www.keenage.com/html/e_index.html
 positive.txt 正面情感词典。
 negative.txt 负面情感词典。
 deny_adv.txt 否定副词词典。
 level_adv.txt 程度副词词典(一共有6类程度副词:“极其”,“很”,“较”,“稍稍”,“不足,稍欠”,“超”)。

2.词典构造
 BuildDict.py 有两个class, sentiDictionary类用于构造情感词典,featureDictionary类用于构造特征维度词典。
                   设置了10个特征维度:quality(质量),color(颜色),material(材质),style(风格),function(功能)
                                   package(包装),price(价格),service(服务),logistic(物流),description(图片与实物)
                                   不涉及以上10类的在最后都归到others。
                   对维度的修改可以直接在该部分修改。

 SentiAnalysis.py 输入一段文本,计算该文本分别对应11个维度的词(在句中的位置以及词本身)和情感值。
                       可以在该部分设置正面和负面情感的分值,否定词的分值,以及各类程度副词的权值。

 MAIN.py 主函数,参数为ReviewID,即表Review中的ID。
              包含三部分功能:(1)从Reviews表中读取ReviewID对应的ReviewContent。如果Reviews表中不存在该ID或者SentiValues
                            表中已经有该ReviewID,则不用再继续操作。
                           (2)计算从Reviews表中读取的ReviewContent的情感值。
                           (3)将ReviewID, ReviewContent, 11大维度对应的内容和情感值, 以及ReviewScore写入数据库。
              需要修改服务器 self.ms = MSSQL(host="localhost", user="sa", pwd="Robin_go1992", db="database_JdMASU")
              计算综合情感值的时候,默认11类特征的权重均为1,但可以手动设置。

sentimentanalysis-chinese's People

Contributors

codelegant92 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.