This repository contains the Python and C++ source code used in the paper "Revisiting Probability Distribution Assumptions for Information Theoretic Feature Selection".
--------
Compile:
--------
Required packages: 1) Python 3.5.2; 2) Boost.Python 1.66.0; 3) GCC 6.2.0; and 4) scikit-learn.
To compile the C++ code: 1) move to the fsmethods directory: 'cd fsmethods'; 2) run the Makefile: 'make'; 3) move back to the main directory: 'cd ..'
We have successfully compiled the C++ code on Linux and macOS. However, we have not tested it on Windows. The compiled code is used by the main Python file, "main.py", to perform feature selection and classification.
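The build step can also be driven from Python without changing the working directory. The following is a minimal sketch and is not part of the repository: the helper names build_plan and build_fsmethods are hypothetical, and it assumes the script is started from the main directory, so that fsmethods (with its Makefile) is a direct subdirectory.

```python
import subprocess
from pathlib import Path


def build_plan(repo_root="."):
    """Return the command and working directory for the build step.

    Hypothetical helper: assumes the Makefile lives in
    <repo_root>/fsmethods, as the compile instructions describe.
    """
    return ["make"], Path(repo_root) / "fsmethods"


def build_fsmethods(repo_root="."):
    """Run 'make' inside fsmethods.

    Raises subprocess.CalledProcessError if the build fails.
    """
    cmd, cwd = build_plan(repo_root)
    # Passing cwd= replaces the manual 'cd fsmethods' / 'cd ..' pair.
    subprocess.run(cmd, cwd=cwd, check=True)
```

Calling build_fsmethods() from the main directory should be equivalent to the manual steps; the working directory of the calling process is left unchanged.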
------
Usage:
------
usage: python3 main.py FSMethod Dataset
FSMethod: the feature selection method, one of:
1) VMIrm (\tilde{J}_{AMD}^{1})
2) VMIgm (\tilde{J}_{GMD}^{1})
3) VMIin (\tilde{J}_{FID}^{1})
4) RMRMRrm (J_{AMD}^{2,1})
5) JMIrm (J_{AMD}^{1,1})
6) MRMRrm (J_{AMD}^{1,0})
7) RelaxMRMR (J_{GMD}^{2,1})
8) JMI (J_{GMD}^{1,1})
9) MRMR (J_{GMD}^{1,0})
10) CIFE (J_{FID}^{1,1})
11) MIFS (J_{FID}^{1,0})
12) MIM
Dataset: the name of the dataset. We have included several datasets in the "datasets" folder for you to try, e.g., wine, congress, and heart.
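When scripting several runs, it can help to check the FSMethod argument before launching main.py. The following is a minimal sketch: the table FS_METHODS and the helper make_command are hypothetical names introduced here, the method names and notation are copied from the list above, and whether main.py performs its own validation is not assumed.

```python
# FSMethod names from the list above, mapped to the criterion notation
# given next to each name (MIM is listed without notation).
FS_METHODS = {
    "VMIrm": r"\tilde{J}_{AMD}^{1}",
    "VMIgm": r"\tilde{J}_{GMD}^{1}",
    "VMIin": r"\tilde{J}_{FID}^{1}",
    "RMRMRrm": r"J_{AMD}^{2,1}",
    "JMIrm": r"J_{AMD}^{1,1}",
    "MRMRrm": r"J_{AMD}^{1,0}",
    "RelaxMRMR": r"J_{GMD}^{2,1}",
    "JMI": r"J_{GMD}^{1,1}",
    "MRMR": r"J_{GMD}^{1,0}",
    "CIFE": r"J_{FID}^{1,1}",
    "MIFS": r"J_{FID}^{1,0}",
    "MIM": None,
}


def make_command(method, dataset, python="python3"):
    """Build the argument list for 'python3 main.py FSMethod Dataset'.

    Hypothetical helper: raises ValueError for a method name that is
    not in the list above. The dataset name is passed through unchecked.
    """
    if method not in FS_METHODS:
        known = ", ".join(FS_METHODS)
        raise ValueError("unknown FSMethod %r; expected one of: %s" % (method, known))
    return [python, "main.py", method, dataset]
```

The returned list can be passed directly to subprocess.run from the main directory, for example subprocess.run(make_command("JMI", "heart"), check=True).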
--------
Example:
--------
As a concrete example, "python3 main.py VMIrm wine" runs the \tilde{J}_{AMD}^{1} method on the wine dataset. The output consists of the selected features, the running time, and the classification accuracy, and can be found in the "results" directory.
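The example run can also be launched from a script, followed by a look at what appeared in the "results" directory. This is a minimal sketch under stated assumptions: run_example and list_results are hypothetical helpers, the interpreter is invoked as python3 as in the usage line, and no assumption is made about the names or format of the result files, so the sketch only lists them.

```python
import subprocess
from pathlib import Path


def run_example(method="VMIrm", dataset="wine"):
    """Invoke main.py with the given method and dataset.

    Must be called from the main directory, after the C++ code has
    been compiled. Raises CalledProcessError if main.py fails.
    """
    subprocess.run(["python3", "main.py", method, dataset], check=True)


def list_results(results_dir="results"):
    """Return the sorted names of the files in the results directory.

    Returns an empty list if the directory does not exist.
    """
    path = Path(results_dir)
    if not path.is_dir():
        return []
    return sorted(p.name for p in path.iterdir() if p.is_file())
```

Calling run_example() and then list_results() from the main directory shows which files the run left behind.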
-----------
References:
-----------
Sun, Y., Wang, W., Kirley, M., Li, X., & Chan, J. Revisiting Probability Distribution Assumptions for Information Theoretic Feature Selection. To be presented at AAAI 2020 in New York.
--------
License:
--------
This program is to be used under the terms of the GNU General Public License
(http://www.gnu.org/copyleft/gpl.html).
Author: Yuan Sun
e-mail: [email protected] OR [email protected]
Copyright notice: (c) 2019 Yuan Sun