yarakyrychenko / mumin-classifier

Fine-tuning multilingual models (XLM-RoBERTa, mT5) and English-only models (RoBERTa, T5, applied to text translated into English) to classify misinformation on Twitter.

Python 9.16% Shell 0.15% Jupyter Notebook 90.69%

Multilingual Transformer Models for Twitter Misinformation Detection

The spread of misinformation on social media is increasingly concerning. Anyone with a smart device and an internet connection can reach millions of users with false claims. Machine learning infrastructure is crucial to preventing the dissemination of misleading information, since human fact-checking of the immense volume of online statements is not feasible. This paper uses MuMiN, a new multilingual, multi-topical Twitter misinformation dataset, to fine-tune and evaluate four language classification models tasked with flagging misinformation posts: XLM-RoBERTa, mT5, and RoBERTa and T5 with translation. A RoBERTa model trained on translated texts outperformed all the others and beat the top entry on the MuMiN text-only leaderboard by 5.4%, with a 59.9% macro-average F1 score.

The models were trained on a Google Colaboratory GPU.
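
The headline result is reported as a macro-average F1 score, which averages the per-class F1 scores without weighting by class size, so a rare class counts as much as a common one. The following is a minimal sketch of that metric in plain Python, not the repository's evaluation code. The function name `macro_f1`, the label strings, and the toy predictions are illustrative assumptions.

```python
def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        denom = 2 * tp + fp + fn
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical toy data: four misinformation posts and two factual posts.
y_true = ["misinformation"] * 4 + ["factual"] * 2
y_pred = ["misinformation"] * 3 + ["factual", "factual", "misinformation"]

score = macro_f1(y_true, y_pred)
print(f"macro-average F1: {score:.3f}")
```

In this toy example the majority class scores an F1 of 0.75 and the minority class 0.5. The macro average treats them equally, which is why the metric suits imbalanced misinformation data better than plain accuracy.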
