sentimen-bahasa's Introduction

Sentiment Analysis of Indonesian Text: Evaluating Lexicons & Feature Extraction Methods

Python 3.7 | 3.8 | 3.9

A sentiment analysis implementation, written in Python and run in JupyterLab, for evaluating the performance of lexicons and feature extraction methods. It is primarily intended for social media text in Indonesian (bahasa Indonesia).

Note: Jupyter notebooks with English documentation can be found in the ipynb-en folder.

tl;dr

This repository evaluates the performance of lexicons and feature extraction methods for sentiment analysis of Indonesian-language text about the handling of Covid-19, using a Support Vector Machine (SVM). The approach is semi-supervised: it combines a lexicon-based approach with a machine-learning-based approach. Each Jupyter notebook (ipynb) includes instructions. The algorithms are built with Python's built-in RegEx module (re) and the NLTK, Scikit-learn, and imbalanced-learn libraries. Validation uses k-fold cross-validation after the data has been synthesized (oversampled) with borderline SMOTE SVM, also known as SVM-SMOTE.

Included algorithms

  • Text cleaning/preprocessing
  • Replacement of nonstandard (slang) words
  • Stop word removal
  • Lexicon labeling: InSet, sentiwords_id (from sentistrength_id)
  • Feature extraction: term presence, BoW, TF-IDF
  • Data synthesis: SVM-SMOTE
  • Classification: SVM
  • Plotting
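A minimal, standard-library-only sketch of the text steps in the list above (cleaning, slang replacement, stop word removal, lexicon labeling, and the three feature representations). Everything here is illustrative: the function names are hypothetical, the SLANG, STOP_WORDS, and LEXICON entries and their weights are made up rather than taken from InSet or sentiwords_id, and the TF-IDF uses the plain log(N/df) weighting rather than scikit-learn's smoothed default.

```python
import math
import re
from collections import Counter

# Tiny hypothetical resources; the real project loads full word lists.
SLANG = {"bgt": "banget", "gk": "tidak"}
STOP_WORDS = {"yang", "di", "dan", "ini"}
LEXICON = {"bagus": 3, "senang": 4, "buruk": -4, "lambat": -2}  # made-up weights


def preprocess(text):
    """Lowercase, strip URLs/mentions/hashtags/non-letters, map slang, drop stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+|[@#]\w+", " ", text)
    text = re.sub(r"[^a-z\s]", " ", text)
    tokens = [SLANG.get(tok, tok) for tok in text.split()]
    return [tok for tok in tokens if tok not in STOP_WORDS]


def lexicon_label(tokens):
    """Sum the lexicon weights of the tokens; the sign of the total is the label."""
    score = sum(LEXICON.get(tok, 0) for tok in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def extract_features(docs):
    """Return the vocabulary plus term presence, BoW, and TF-IDF rows per document."""
    vocab = sorted({tok for doc in docs for tok in doc})
    counts = [Counter(doc) for doc in docs]
    bow = [[c[term] for term in vocab] for c in counts]
    presence = [[int(v > 0) for v in row] for row in bow]
    doc_freq = [sum(col) for col in zip(*presence)]
    idf = [math.log(len(docs) / df) for df in doc_freq]
    tfidf = [[tf * w for tf, w in zip(row, idf)] for row in bow]
    return vocab, presence, bow, tfidf


tweets = [
    "Penanganan ini bagus bgt! https://t.co/x",
    "Vaksin lambat dan pelayanan buruk @user #covid",
]
docs = [preprocess(t) for t in tweets]
labels = [lexicon_label(d) for d in docs]
vocab, presence, bow, tfidf = extract_features(docs)

print(docs)
print(labels)
print(vocab)
```

The lexicon labels produced this way can then serve as training targets for the classifier, which is the semi-supervised idea the tl;dr describes. Summing weights ignores negation, so a phrase like "tidak bagus" would still score as positive in this sketch.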

Prerequisites

  • pandas >= 0.25.0
  • numpy >= 1.16.6
  • nltk
  • scikit-learn
  • imbalanced-learn
  • jupyterlab
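A quick way to see which of these packages are importable in the current environment is sketched below. The helper name missing_packages is hypothetical, and the check only tests for presence; it does not verify the minimum pandas and numpy versions listed above. Note that scikit-learn and imbalanced-learn are imported as sklearn and imblearn.

```python
import importlib.util

# pip package name -> import name
PREREQUISITES = {
    "pandas": "pandas",
    "numpy": "numpy",
    "nltk": "nltk",
    "scikit-learn": "sklearn",
    "imbalanced-learn": "imblearn",
    "jupyterlab": "jupyterlab",
}


def missing_packages(prerequisites=PREREQUISITES):
    """Return the pip names of the prerequisites that cannot be imported."""
    return [
        pip_name
        for pip_name, module in prerequisites.items()
        if importlib.util.find_spec(module) is None
    ]


print("missing:", missing_packages())
```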

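Installation

Prerequisites

Install the packages one at a time:

pip3 install --user --upgrade [package name]

or install them all at once from requirements.txt after cloning the repository (next step).

Clone the Repository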

git clone https://github.com/onpilot/sentimen-bahasa.git
cd sentimen-bahasa
pip3 install -r requirements.txt
jupyter-lab

FAQ

I have an older Python installation for another project. Do I need to uninstall it first?

Yes. Alternatively, use a tool that can manage multiple Python installations, such as Conda or Scoop.

Error: Microsoft Visual C++ 14.0 or greater is required!

Windows users need the Visual C++ 14.0 Build Tools compiler, or a later version, for the scikit-learn package.

Publication

The publication about this project is available at: http://jurnal.umus.ac.id/index.php/intech/article/view/556

If you use this repository in an academic publication, we would greatly appreciate a citation of the following paper:

@article{j.ilm.intech:v03:02-556,
author  = {Wildan Fariq Abdillah and Agyztia Premana and Raden Mohamad Herdian Bhakti},
title   = {Analisis Sentimen Penanganan Covid-19 dengan Support Vector Machine: Evaluasi Leksikon dan Metode Ekstraksi Fitur},
journal = {Jurnal Ilmiah Intech: Information Technology Journal of UMUS},
year    = {2021},
volume  = {03},
issue   = {02},
pages   = {160-170},
issn    = {2685-4902 (online)},
doi     = {10.46772/intech.v3i02.556},
url     = {http://jurnal.umus.ac.id/index.php/intech/article/download/556/373}
}

sentimen-bahasa's Issues

Label

I need the positive/negative labels and the weights for the following words:

hanya

pandang
bagus
staf
layan
ngeri
alami
luar
biasa
akan
kembali
kamar
sangat
kagum
hotel
kurang
senang
