Social media now dominates online communication, with roughly 192 million daily active users on Twitter alone. As the number of people online has grown, individuals inclined towards racism, misogyny, etc. have led to the spread of hate speech online. It is high time that proper steps are taken to curb this issue, and one major step is to identify the people who spread hate speech on Twitter. We address this task using natural language processing techniques for two languages, English and Spanish, on the two datasets provided by PAN @ CLEF 2021. Four machine learning classifiers, (i) multinomial naive Bayes, (ii) K-Nearest Neighbors (KNN), (iii) logistic regression and (iv) linear SVM, along with three deep learning models, (i) Long Short-Term Memory (LSTM), (ii) Bidirectional Long Short-Term Memory (bi-LSTM) and (iii) Bidirectional Encoder Representations from Transformers (BERT), were implemented for the identification of hate speech spreaders. Experiments with all of these models on the training dataset provided by PAN (split into training and testing subsets) showed that multinomial naive Bayes was the best model, with an accuracy of 74% for the English dataset and 82% for the Spanish dataset. On the private dataset used by the organizers for the final evaluation, the multinomial naive Bayes model achieved an accuracy of 66% for the English dataset and 80% for the Spanish dataset.
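To make the best-performing approach concrete, here is a minimal pure-Python sketch of a multinomial naive Bayes classifier with Laplace smoothing. It is not the authors' implementation: the function names `train_multinomial_nb` and `predict`, the bag-of-words token counts used as features, and the toy placeholder documents are all assumptions made for illustration, and the real experiments used the PAN @ CLEF 2021 data.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit multinomial naive Bayes on tokenized documents (hypothetical helper)."""
    vocab = {tok for doc in docs for tok in doc}
    class_sizes = Counter(labels)
    token_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        token_counts[label].update(doc)

    model = {"log_prior": {}, "log_likelihood": {}}
    for label, n_docs in class_sizes.items():
        # Class prior: fraction of training documents with this label.
        model["log_prior"][label] = math.log(n_docs / len(docs))
        # Laplace (add-alpha) smoothing over the shared vocabulary.
        total = sum(token_counts[label].values()) + alpha * len(vocab)
        model["log_likelihood"][label] = {
            tok: math.log((token_counts[label][tok] + alpha) / total)
            for tok in vocab
        }
    return model


def predict(model, doc):
    """Return the label with the highest posterior log-score for one document."""
    scores = {}
    for label, log_prior in model["log_prior"].items():
        log_lik = model["log_likelihood"][label]
        # Tokens unseen during training are ignored.
        scores[label] = log_prior + sum(log_lik[t] for t in doc if t in log_lik)
    return max(scores, key=scores.get)


# Toy placeholder data (assumption): one token list per author,
# label 1 = hate speech spreader, label 0 = not a spreader.
train_docs = [
    ["insult", "attack", "insult"],
    ["attack", "threat"],
    ["thanks", "greeting", "news"],
    ["news", "thanks"],
]
train_labels = [1, 1, 0, 0]

# Held-out examples standing in for the testing subset.
test_docs = [["insult", "threat"], ["thanks", "news"]]
test_labels = [1, 0]

model = train_multinomial_nb(train_docs, train_labels)
predictions = [predict(model, doc) for doc in test_docs]
accuracy = sum(p == y for p, y in zip(predictions, test_labels)) / len(test_labels)
print(f"held-out accuracy: {accuracy:.2f}")
```

The sketch scores each class by adding the log prior to the summed log-likelihoods of the document's tokens, which is the standard multinomial naive Bayes decision rule. The accuracy it prints on the toy data says nothing about the figures reported above.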
raksh543 / profiling-hate-speech-spreaders-on-twitter
Research paper published in CLEF Working Notes 2021 - http://ceur-ws.org/Vol-2936/paper-175.pdf