noaimabari / project-text-classification
Analysed a dataset of 20,000 text messages drawn from 20 different newsgroups. Cleaned the data, processed the words, and built the vocabulary. Implemented Multinomial Naive Bayes from scratch to classify the messages into the 20 classes, then compared the results with those of sklearn's built-in Multinomial Naive Bayes classifier. Achieved an accuracy of nearly 70%. Python libraries used: sklearn, nltk, matplotlib, numpy.
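
The pipeline described above (tokenise, build a vocabulary, fit Multinomial Naive Bayes, predict) can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the class name `MultinomialNaiveBayes`, the helper `tokenize`, the Laplace smoothing parameter `alpha`, and the toy documents and labels are all assumptions made for the example. It uses a plain regular expression for tokenising where the project used nltk, and only the standard library so that it runs on its own.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical cleaning step: lowercase and keep alphabetic runs only.
    return re.findall(r"[a-z]+", text.lower())


class MultinomialNaiveBayes:
    """Multinomial Naive Bayes over word counts (illustrative sketch)."""

    def __init__(self, alpha=1.0):
        # alpha is the Laplace smoothing strength (an assumption of this sketch).
        self.alpha = alpha

    def fit(self, documents, labels):
        # documents: list of token lists; labels: one class name per document.
        self.vocabulary = sorted({tok for doc in documents for tok in doc})
        self.classes = sorted(set(labels))
        docs_per_class = Counter(labels)
        word_counts = defaultdict(Counter)
        for doc, label in zip(documents, labels):
            word_counts[label].update(doc)

        vocab_size = len(self.vocabulary)
        self.log_prior = {}
        self.log_likelihood = {}
        for c in self.classes:
            # log P(c): fraction of training documents in class c.
            self.log_prior[c] = math.log(docs_per_class[c] / len(documents))
            # log P(w | c) with add-alpha smoothing over the vocabulary.
            total = sum(word_counts[c].values()) + self.alpha * vocab_size
            self.log_likelihood[c] = {
                w: math.log((word_counts[c][w] + self.alpha) / total)
                for w in self.vocabulary
            }
        return self

    def predict_one(self, doc):
        scores = {}
        for c in self.classes:
            score = self.log_prior[c]
            for tok in doc:
                # Tokens never seen in training are skipped.
                if tok in self.log_likelihood[c]:
                    score += self.log_likelihood[c][tok]
            scores[c] = score
        return max(scores, key=scores.get)

    def predict(self, documents):
        return [self.predict_one(doc) for doc in documents]


def accuracy(predicted, actual):
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Toy data standing in for the newsgroup messages (invented for this example).
train_texts = [
    "Engine, car, wheel",
    "Car on the road: engine",
    "Orbit, rocket, moon",
    "Moon launch rocket",
]
train_labels = ["autos", "autos", "space", "space"]

model = MultinomialNaiveBayes(alpha=1.0)
model.fit([tokenize(t) for t in train_texts], train_labels)

test_docs = [tokenize("Rocket to the moon"), tokenize("car wheel")]
predictions = model.predict(test_docs)
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied together, and the smoothing term keeps a word that never appeared in a class from driving that class's probability to zero. The same token counts could be passed to sklearn's `MultinomialNB` to reproduce the comparison mentioned above.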