ml-final-project's Introduction

ML-Final-Project

Final Project Machine Learning

Problem Statement

The goal of this project is to classify a statement as sexist, not sexist, or neutral. The motivation is to identify sexist statements of the kind commonly observed in workplaces. We classify these statements using several ML algorithms and compare them to see which one gives the best accuracy.

Dataset

The dataset is obtained from Kaggle and contains statements labeled as sexist or not sexist.

Preprocessing

In the preprocessing step, we cleaned the data and removed missing values. We then used BOW (bag of words) and TF-IDF (term frequency-inverse document frequency) for feature extraction, and finally trained the models on the resulting features.
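
The preprocessing steps above could look roughly like the following sketch. It assumes scikit-learn's `CountVectorizer` and `TfidfVectorizer` for the BOW and TF-IDF features. The toy rows and the column names `text` and `label` are hypothetical stand-ins, since the actual dataset layout is not described here, and the notebook's real cleaning steps may differ.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Hypothetical toy rows standing in for the Kaggle dataset;
# the real file and its column names may differ.
df = pd.DataFrame({
    "text": [
        "Women are bad at driving ",
        None,  # a missing value, to be removed
        "The meeting starts at noon",
        "She is a great engineer",
    ],
    "label": ["sexist", "sexist", "not sexist", "not sexist"],
})

# Remove rows with missing values, then apply light cleaning
# (lowercasing and trimming whitespace; an assumed minimal cleanup).
df = df.dropna(subset=["text", "label"]).reset_index(drop=True)
df["text"] = df["text"].str.lower().str.strip()

# Bag of words: each column holds the raw count of one vocabulary term.
bow_vectorizer = CountVectorizer()
X_bow = bow_vectorizer.fit_transform(df["text"])

# TF-IDF: the same vocabulary, with counts reweighted so that terms
# appearing in many documents contribute less.
tfidf_vectorizer = TfidfVectorizer()
X_tfidf = tfidf_vectorizer.fit_transform(df["text"])

print("BOW matrix shape:", X_bow.shape)
print("TF-IDF matrix shape:", X_tfidf.shape)
```

Both vectorizers return sparse matrices with one row per statement, so either matrix can be passed directly to a scikit-learn classifier.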

Models

Logistic Regression on BOW and TF-IDF
Naive Bayes on TF-IDF Data
Random Forest on TF-IDF Data
SVM with various kernels on TF-IDF Data
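
As a rough illustration of how the models listed above could be trained and compared on TF-IDF features, here is a minimal sketch. The toy statements, the choice of `MultinomialNB` as the Naive Bayes variant, the three SVM kernels, and all hyperparameters are assumptions made for the example, not the settings used in the notebook. The BOW variant of logistic regression would swap `TfidfVectorizer` for `CountVectorizer`.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import SVC

# Hypothetical toy corpus; the real project uses the Kaggle dataset.
texts = [
    "women are too emotional to lead",
    "girls cannot do math",
    "a woman's place is in the kitchen",
    "she only got the job because she is a woman",
    "men should not have to listen to female managers",
    "women are bad at driving",
    "the report is due on friday",
    "she presented the quarterly results clearly",
    "our team hired two new engineers",
    "he and she both lead the project well",
    "the meeting was moved to the afternoon",
    "everyone on the team deserves equal pay",
]
labels = ["sexist"] * 6 + ["not sexist"] * 6

# Hold out a stratified test split so both classes appear in it.
train_texts, test_texts, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=0
)

# Fit the vectorizer on training text only, then reuse it for the test text.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

# Assumed model settings; the notebook's hyperparameters may differ.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": MultinomialNB(),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}
for kernel in ("linear", "rbf", "poly"):
    models[f"SVM ({kernel})"] = SVC(kernel=kernel)

# Train each model and record its accuracy on the held-out split.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

# Report models from most to least accurate.
for name, acc in sorted(accuracies.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.2f}")
```

On a corpus this small the accuracies are not meaningful; the point is only the shape of the comparison loop, which stays the same when the full dataset is used.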

Requirements

matplotlib==3.2.0
seaborn==0.10.0
nltk==3.5
numpy==1.17.4
pandas==0.25.3
scikit_learn==1.0.1

How to Run

Download the ipynb file and load the dataset folder. Then run each cell in order to obtain the output.

vahsek300501 / ml-final-project