- Naive Bayes Classifier (simple, no stopwords)
- Bag of Words
- Maximum Entropy
- Support Vector Machine
To clean a dataset and generate a wordlist from it, we use the script `generate_wordlist.py`, which writes its output to a file named `wordlist`.
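As a rough illustration of what cleaning and wordlist generation can look like, here is a minimal sketch. It is not the contents of generate_wordlist.py: the function names (`clean`, `build_wordlist`, `write_wordlist`), the small stopword set, the tokenization rule, and the one-word-per-line output format are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stopword set; the list used by the real script is not shown here.
STOPWORDS = {"a", "an", "the", "and", "or", "of", "to", "in", "is", "it"}


def clean(text):
    """Lowercase the text and return its alphabetic tokens, minus stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def build_wordlist(documents, min_count=1):
    """Return words seen at least min_count times, most frequent first.

    Ties are broken alphabetically so the output is deterministic.
    """
    counts = Counter()
    for doc in documents:
        counts.update(clean(doc))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, count in ranked if count >= min_count]


def write_wordlist(words, path="wordlist"):
    """Write one word per line (assumed output format)."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(words) + "\n")
```

Calling `write_wordlist(build_wordlist(docs))` on a list of raw strings would produce a file named wordlist in the current directory.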
To use this repo, you need to set up your data properly. Place custom test data under data/test and training data under data/train. One quarter of the training data is held out and used for evaluation.
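The one-quarter evaluation split can be pictured with the sketch below. Whether this repo shuffles before splitting or takes a contiguous slice is not stated, so the shuffle, the fixed seed, and the name `split_train_eval` are assumptions for illustration only.

```python
import random


def split_train_eval(examples, eval_fraction=0.25, seed=0):
    """Shuffle the examples and hold out eval_fraction of them for evaluation.

    Returns a (train, eval) pair of lists. The seed keeps the split reproducible.
    """
    items = list(examples)
    random.Random(seed).shuffle(items)
    n_eval = int(len(items) * eval_fraction)
    return items[n_eval:], items[:n_eval]
```

With 100 training examples this yields 75 for training and 25 for evaluation.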
Copyright BASED Systems 2017. Do not reproduce.