The aim of text classification is to automatically assign text documents to predefined categories.
In this part we tackle one such classification problem, SMS spam classification, using two machine learning algorithms: Naive Bayes (NB) and a Linear Model (LM).
We use an open-source dataset from the Kaggle website, available at the following URL: https://www.kaggle.com/uciml/sms-spam-collection-dataset#spam.csv.
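As a rough illustration of getting the data into memory, the sketch below reads (label, text) pairs from a CSV file using only the Python standard library. The helper name `load_messages` is hypothetical, and the column names `v1` (label) and `v2` (message text) are assumptions about the layout of `spam.csv`; check them against the downloaded file. The rows used here are made up so the example runs without the download.

```python
import csv
import io


def load_messages(file_obj, label_col="v1", text_col="v2"):
    """Read (label, text) pairs from an open CSV file object.

    The default column names are assumptions about spam.csv,
    not something confirmed by the text above.
    """
    reader = csv.DictReader(file_obj)
    return [(row[label_col], row[text_col]) for row in reader]


# Tiny in-memory stand-in for the real file (made-up rows, illustration only).
sample = io.StringIO(
    "v1,v2\n"
    "ham,See you at lunch\n"
    'spam,"WIN a free prize, call now"\n'
)
rows = load_messages(sample)
print(rows)

# With the downloaded file, something like this should work; the encoding
# is an assumption and may need adjusting:
# with open("spam.csv", encoding="latin-1", newline="") as f:
#     rows = load_messages(f)
```

Passing an open file object rather than a path keeps the helper easy to try on in-memory samples before pointing it at the real dataset.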
Project Outline:
1. Understanding the dataset
2. Text Preprocessing
3. Feature Engineering
4. Model training & evaluation
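The four steps above can be sketched end to end on a toy example. This is a minimal sketch, not the implementation used in this project: the messages are made up, the function names `tokenize`, `fit` and `predict` are hypothetical, and only a multinomial Naive Bayes classifier with add-one (Laplace) smoothing is shown (the linear model is not covered here). It uses only the Python standard library so that it is self-contained.

```python
import math
import re
from collections import Counter, defaultdict

# Step 1: understand the dataset. Here, a toy set of (label, text) pairs
# (made-up messages, for illustration only).
train = [
    ("spam", "Win a free prize now"),
    ("spam", "Free entry, claim your prize"),
    ("ham", "Are we still meeting for lunch"),
    ("ham", "Call me when you get home"),
]
test = [
    ("spam", "Claim your free prize"),
    ("ham", "See you at lunch"),
]


# Step 2: text preprocessing. Lowercase and split into word tokens.
def tokenize(text):
    return re.findall(r"[a-z0-9']+", text.lower())


# Step 3: feature engineering. Bag-of-words counts, accumulated per class.
def fit(samples):
    word_counts = defaultdict(Counter)  # label -> word -> count
    class_counts = Counter()            # label -> number of messages
    for label, text in samples:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocab


# Step 4a: model. Multinomial Naive Bayes scored in log space.
def predict(model, text):
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)  # log prior
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen during training
            # Add-one smoothing avoids log(0) for words unseen in this class.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)


# Step 4b: evaluation. Accuracy on the held-out toy messages.
model = fit(train)
predictions = [predict(model, text) for _, text in test]
accuracy = sum(p == y for p, (y, _) in zip(predictions, test)) / len(test)
print(predictions, accuracy)
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied together, which matters once messages get longer than a few words.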