cbd.ml's Introduction

CBD.ML

MSc Dissertation on Cyber Bullying Detection using Machine Learning, NLP and Python.

CBD ML Development

CBD ML Process



  • CBD.ML Dataset Overview

  • CBD.ML Cyberbullying WordCloud

  • CBD.ML NOT Cyberbullying WordCloud

  • CBD.ML Barchart of Top Words Frequency

  • CBD.ML RandomForest Train/Test Accuracy

  • CBD.ML RandomForest Train/Test Classification Report

  • CBD.ML RandomForest ConfusionMatrix

cbd.ml's People

Contributors

b4k0

cbd.ml's Issues

Support Vector Machine (SVM): Accuracy Train/ Test

  • CyberBullying8020.ipynb : 0.7342498450804434/ 0.7342329936656569
  • CyberBullying7525.ipynb : 0.7342587152369761/ 0.7342097532314924
  • CyberBullying7030.ipynb : 0.7342356520826776/ 0.7342717258261934
  • CyberBullying6535.ipynb : 0.7342447953447643/ 0.7342495934532864
  • CyberBullying6040.ipynb : 0.7342554623905991/ 0.7342329936656569
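The notebook names appear to encode the train/test ratio (8020 for 80/20, 7525 for 75/25, and so on). Assuming that reading is right, the splits can be sketched as below. The `split_dataset` helper and the placeholder rows are hypothetical and are not taken from the notebooks; scikit-learn's `train_test_split` offers the same functionality.

```python
import random

def split_dataset(rows, test_fraction, seed=42):
    """Shuffle rows reproducibly and return (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Placeholder rows standing in for the tweets (assumption: not the real data).
rows = [f"tweet {i}" for i in range(1000)]

# Ratios inferred from the notebook names.
split_sizes = {}
for name, test_fraction in [("8020", 0.20), ("7525", 0.25), ("7030", 0.30),
                            ("6535", 0.35), ("6040", 0.40)]:
    train, test = split_dataset(rows, test_fraction)
    split_sizes[name] = (len(train), len(test))
    print(name, len(train), len(test))
```

Fixing the seed keeps the split reproducible, so accuracy differences between runs come from the ratio rather than from a different shuffle.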

Random Forest: Accuracy Train/ Test

  • CyberBullying8020.ipynb : 0.7342498450804434/ 0.7342329936656569
  • CyberBullying7525.ipynb : 0.7342831962397179/ 0.7342097532314924
  • CyberBullying7030.ipynb : 0.7342356520826776/ 0.7342717258261934
  • CyberBullying6535.ipynb : 0.7342447953447643/ 0.7342495934532864
  • CyberBullying6040.ipynb : 0.7342554623905991/ 0.7342329936656569
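The train and test accuracies listed above are almost identical in every split. One pattern worth ruling out (an assumption here, not something the source confirms) is a model that predicts the majority class for every input, because its accuracy simply equals the majority class share. A minimal sketch with made-up labels shows the comparison:

```python
from collections import Counter

def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most common label."""
    _, majority_count = Counter(labels).most_common(1)[0]
    return majority_count / len(labels)

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up labels (1 = cyberbullying, 0 = not); not taken from the project data.
y_true = [1] * 73 + [0] * 27
y_pred = [1] * 100  # a degenerate model that always predicts class 1

baseline = majority_baseline_accuracy(y_true)
model_accuracy = accuracy(y_true, y_pred)
print(baseline, model_accuracy)
```

If a model's accuracy matches the baseline on both train and test data, the confusion matrix is the quickest way to confirm whether it ever predicts the minority class.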

Dataset CyberTroll & IEEE

CyberBullying8020CyberTrollIEEEDataset.ipynb

Length: 22141
Dataset:

  1. cybertroll_dataset.csv
  2. CyberBullyingTypesDataset.csv

Logistic Regression (LR): Accuracy Train/ Test

55K (54464)

  • CyberBullying8020.ipynb : 0.8591723853021506/ 0.8122647571835123
  • CyberBullying7525.ipynb : 0.8608499804151978/ 0.8116186839012925
  • CyberBullying7030.ipynb : 0.8621078585667821/ 0.8096695226438189
  • CyberBullying6535.ipynb : 0.864495353238609/ 0.808372239416671
  • CyberBullying6040.ipynb : 0.8648020074667973/ 0.808409070044983

Final Dataset Information

Final Dataset:

  • cyberbullying_tweets.csv
  • CyberBullyingTypesDataset.csv
  • cybertroll_dataset.csv
  • classified_tweets.csv

XGBoost: Accuracy Train/ Test

  • CyberBullying8020.ipynb : 0.7342498450804434/ 0.7342329936656569
  • CyberBullying7525.ipynb : 0.7342831962397179/ 0.7342097532314924
  • CyberBullying7030.ipynb : 0.7342356520826776/ 0.7342717258261934
  • CyberBullying6535.ipynb : 0.7342447953447643/ 0.7342495934532864
  • CyberBullying6040.ipynb : 0.7342554623905991/ 0.7342329936656569

Decision Tree: Accuracy Train/ Test

  • CyberBullying8020.ipynb : 0.9446191274012531/ 0.7636096575782613
  • CyberBullying7525.ipynb : 0.9481737171954563/ 0.7660105757931844
  • CyberBullying7030.ipynb : 0.9511856048683244/ 0.7678702570379436
  • CyberBullying6535.ipynb : 0.9540408463037767/ 0.7668782458165032
  • CyberBullying6040.ipynb : 0.9573107289307791/ 0.7656293032222529
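The Decision Tree train accuracy sits well above its test accuracy in every split. A gap of this size is commonly read as overfitting, although the source does not say so. The sketch below computes the gap from the figures listed above:

```python
# Train/test accuracies copied from the Decision Tree list above.
decision_tree = {
    "8020": (0.9446191274012531, 0.7636096575782613),
    "7525": (0.9481737171954563, 0.7660105757931844),
    "7030": (0.9511856048683244, 0.7678702570379436),
    "6535": (0.9540408463037767, 0.7668782458165032),
    "6040": (0.9573107289307791, 0.7656293032222529),
}

# Generalization gap: how much accuracy is lost going from train to test.
gaps = {name: train - test for name, (train, test) in decision_tree.items()}
for name, gap in gaps.items():
    print(f"{name}: gap = {gap:.4f}")
```

Limiting tree depth or requiring more samples per leaf are common ways to narrow such a gap.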

ML Model

Creating the final ML Model for Cyberbullying Detection

Naive Bayes: Accuracy Train/ Test

  • CyberBullying8020.ipynb : 0.7987652337563976/ 0.7805930414027357
  • CyberBullying7525.ipynb : 0.7966118292205249/ 0.7791568742655699
  • CyberBullying7030.ipynb : 0.7964274472773056/ 0.7768665850673194
  • CyberBullying6535.ipynb : 0.7949775430072596/ 0.7746944342443477
  • CyberBullying6040.ipynb : 0.7942346532835547/ 0.7732029743872212

K-Folds

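K-fold cross-validation splits the data into k folds and evaluates each algorithm k times, each time training on k-1 folds and validating on the remaining one. Averaging the k scores gives a more reliable performance estimate than a single train/test split.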
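A minimal sketch of the procedure in plain Python follows. The `k_fold_indices` helper, the toy labels, and the majority-class "model" are illustrative assumptions, not the notebooks' code; scikit-learn provides `KFold` and `cross_val_score` for real use.

```python
from collections import Counter

def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        yield indices[:start] + indices[stop:], indices[start:stop]
        start = stop

# Made-up labels (1 = cyberbullying, 0 = not); no shuffling, for clarity.
labels = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]

def score_fold(train_idx, val_idx):
    """Toy model: predict the majority training label, return accuracy."""
    majority = Counter(labels[i] for i in train_idx).most_common(1)[0][0]
    return sum(labels[i] == majority for i in val_idx) / len(val_idx)

fold_scores = [score_fold(tr, va) for tr, va in k_fold_indices(len(labels), k=5)]
mean_score = sum(fold_scores) / len(fold_scores)
print(fold_scores, mean_score)
```

Every sample is used for validation exactly once, which is what makes the averaged score less dependent on one particular split.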

Dataset CyberTroll

CyberBullying8020CyberTrollDataset.ipynb

Length: 20001
Dataset:

  1. cybertroll_dataset.csv

Grid Search

Grid search evaluates every combination of values in a predefined hyperparameter grid and keeps the best-scoring combination for each algorithm.
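The idea can be sketched in a few lines of plain Python. The grid values and the `evaluate` stand-in below are hypothetical; in practice `evaluate` would train a model and return a cross-validated score, and scikit-learn's `GridSearchCV` packages the whole loop.

```python
from itertools import product

def grid_search(param_grid, evaluate):
    """Try every combination in param_grid and return (best_params, best_score)."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Illustrative grid for a decision tree (assumption: not the project's grid).
param_grid = {"max_depth": [5, 10, 20], "min_samples_leaf": [1, 5, 10]}

def evaluate(params):
    """Stand-in scorer that peaks at max_depth=10, min_samples_leaf=5."""
    return (1.0
            - abs(params["max_depth"] - 10) / 100
            - abs(params["min_samples_leaf"] - 5) / 100)

best_params, best_score = grid_search(param_grid, evaluate)
print(best_params, best_score)
```

The cost grows with the product of the list lengths (nine evaluations here), so grids are usually kept small.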

Dataset IEEE

CyberBullying8020IEEEDataset.ipynb

Length: 2140
Dataset:

  1. CyberBullyingTypesDataset.csv
