The spread of misinformation on social media is increasingly concerning. Anyone with a smart device and an internet connection can reach millions of users with false claims. Machine learning infrastructure is crucial to preventing the dissemination of misleading information, since human fact-checking of the immense volume of online statements is not feasible. This paper uses a new multilingual, multi-topical Twitter misinformation dataset to fine-tune and evaluate four language classification models (XLM-RoBERTa, mT5, and RoBERTa and T5 with translation) tasked with flagging misinformation posts. A RoBERTa model trained on translated texts outperformed all the others and beat the top score on the MuMiN text-only leaderboard by 5.4%, with a macro-average F1 score of 59.9%.
The models were trained on GPU in Google Colaboratory.