Predicting the possibility of cancer relapse from gene expression features.
We use gene expression data to predict whether the cancer will relapse. Data files are in /data/.
Use datareader.py to load the data, then train your own models and compare them with our results.
The following models were tested:
1- Support vector machines with different kernels.
2- Logistic Regression.
3- Random Forests.
4- AdaBoost.
5- XGBoost.
6- Gaussian Processes with different Kernels.
7- Naive Bayes.
8- PCA.
9- Extra Trees.
A voting classifier was chosen in the end.
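
As a rough illustration of the voting approach, the sketch below combines three of the model families listed above with scikit-learn's VotingClassifier. The choice of base models, the soft voting mode, the hyperparameters, and the synthetic data are all assumptions made for the example; they are not the configuration used for our results, and the real data should be loaded with datareader.py instead.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a (samples x genes) expression matrix and
# binary relapse labels. Assumption: replace with the real data.
X, y = make_classification(
    n_samples=200, n_features=50, n_informative=10, random_state=0
)

# Hypothetical ensemble: three base models, each a candidate from the list.
# SVC needs probability=True so it can take part in soft voting.
voting_clf = VotingClassifier(
    estimators=[
        ("svm", make_pipeline(StandardScaler(),
                              SVC(kernel="rbf", probability=True, random_state=0))),
        ("logreg", make_pipeline(StandardScaler(),
                                 LogisticRegression(max_iter=1000))),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ],
    voting="soft",  # average predicted class probabilities
)

# 5-fold cross-validated accuracy of the ensemble.
scores = cross_val_score(voting_clf, X, y, cv=5)
print(f"Mean CV accuracy: {scores.mean():.3f}")
```

Soft voting averages class probabilities, so every base model must expose predict_proba; switching to voting="hard" takes a majority vote over predicted labels instead.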
We use different feature selection techniques, including mutual information feature selection, correlation matrix filtering, and variance thresholding.
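
The three feature selection techniques can be chained, as in the minimal sketch below. The thresholds (0.0 for variance, 0.95 for absolute correlation), the number of features kept (20), the order of the steps, and the synthetic data are assumptions for illustration only and may differ from what was used for our results.

```python
from functools import partial

import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import (
    SelectKBest,
    VarianceThreshold,
    mutual_info_classif,
)

# Synthetic stand-in for the gene expression data (assumption).
X, y = make_classification(
    n_samples=200, n_features=50, n_informative=10, random_state=0
)

# 1. Variance threshold: drop features whose variance is not above 0.0,
#    i.e. constant features.
X_vt = VarianceThreshold(threshold=0.0).fit_transform(X)

# 2. Correlation matrix: for each pair of features with absolute Pearson
#    correlation above 0.95, drop the later feature of the pair.
corr = np.abs(np.corrcoef(X_vt, rowvar=False))
upper = np.triu(corr, k=1)            # keep only entries with row < column
keep = ~(upper > 0.95).any(axis=0)    # column j is dropped if it matches an earlier one
X_corr = X_vt[:, keep]

# 3. Mutual information: keep the k features sharing the most
#    information with the relapse label.
k = min(20, X_corr.shape[1])
score_func = partial(mutual_info_classif, random_state=0)
X_selected = SelectKBest(score_func=score_func, k=k).fit_transform(X_corr, y)

print(X.shape, "->", X_selected.shape)
```

When evaluating a model, fit these selection steps on the training folds only (for example inside a scikit-learn Pipeline) so that information from the test samples does not leak into the selected features.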