Comments (23)

jijo7 avatar jijo7 commented on September 16, 2024

Could you please show your modifications on the script?
Thank You @limi44

from mifs.

limi44 avatar limi44 commented on September 16, 2024

Hi @jijo7,

I copied mutual_info_.py from sklearn 0.18.dev0 (https://github.com/scikit-learn/scikit-learn/blob/0f2a00f/sklearn/feature_selection/mutual_info_.py) then placed mutual_info_.py in the same folder as mifs.py. You will have to fix the imports at the top from:

from ..externals.six import moves

to:

from sklearn.externals.six import moves

and import mutual_info_ in mifs.py.

For getting the first feature, change line 207 in mifs.py from:

xy_MI [i, :] = mi.get_first_mi_vector(self, k)

to:

if self.categorical:
    xy_MI [i, :] = mutual_info_.mutual_info_classif(self.X, self.y, n_neighbors = k)
else:
    xy_MI [i, :] = mutual_info_.mutual_info_regression(self.X, self.y, n_neighbors = k)

Change line 225 in mifs.py from:

feature_mi_matrix[s, F] = mi.get_mi_vector(self, F, s)

to:

feature_mi_matrix[s, F] = mutual_info_.mutual_info_regression(self.X[:, F], self.X[:, S[-1]])

I was only using MRMR with continuous features and continuous targets, so I didn't need the branches inside the mi.get_mi_vector function that deal with JMI/JMIM and categorical values (I didn't test JMI and JMIM).

This should make MRMR work for continuous features and continuous or categorical targets. By default (discrete_features='auto'), mutual_info_regression and mutual_info_classif treat the input features as discrete if X is sparse and as continuous if it is dense; you can override this with the 'discrete_features' flag. I can't say for certain whether this works properly, as I haven't compared it against any other implementations.

Based on the documentation, the method of MI estimation implemented in sklearn appears to be similar to the one here (kNN).
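For readers who do not want to copy the private module: the same two estimators are exposed publicly as sklearn.feature_selection.mutual_info_classif and mutual_info_regression. Below is a minimal standalone sketch of the greedy MRMR idea described above (relevance to the target minus mean redundancy with the features already selected). It is not the mifs.py code: the function name mrmr_select, its parameters, and the demo data are hypothetical, and the "difference" form of the MRMR score is an assumption.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression


def mrmr_select(X, y, n_select, categorical=True, k=3, random_state=0):
    """Greedy MRMR sketch: relevance minus mean redundancy with selected features."""
    X = np.asarray(X, dtype=float)
    y = np.ravel(y)
    n_select = min(n_select, X.shape[1])

    # Relevance: MI between every feature and the target.
    mi_with_target = mutual_info_classif if categorical else mutual_info_regression
    relevance = mi_with_target(X, y, n_neighbors=k, random_state=random_state)

    selected = [int(np.argmax(relevance))]
    remaining = [f for f in range(X.shape[1]) if f != selected[0]]
    redundancy_sum = np.zeros(X.shape[1])

    while len(selected) < n_select and remaining:
        last = selected[-1]
        # Redundancy: MI between each candidate and the most recently selected
        # feature. Both are continuous, so the regression estimator is used.
        redundancy_sum[remaining] += mutual_info_regression(
            X[:, remaining], X[:, last], n_neighbors=k, random_state=random_state
        )
        score = relevance[remaining] - redundancy_sum[remaining] / len(selected)
        best = remaining[int(np.argmax(score))]
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical demo data: feature 2 determines the class, feature 4 nearly copies it.
rng = np.random.RandomState(0)
X_demo = rng.normal(size=(300, 5))
X_demo[:, 4] = X_demo[:, 2] + 0.01 * rng.normal(size=300)
y_demo = (X_demo[:, 2] > 0).astype(int)
picked = mrmr_select(X_demo, y_demo, n_select=3)
```

The first pick should be one of the two informative columns; the redundancy term is what discourages the near-duplicate from being picked right after it.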

from mifs.

jijo7 avatar jijo7 commented on September 16, 2024

@limi44
Hi
Special thanks to you for your good comments!

from mifs.

shahlaebrahimi avatar shahlaebrahimi commented on September 16, 2024

Hi @limi44
Thanks a lot for your detailed answer.
I will try it.
There is another implementation of MRMR here (https://github.com/jundongl/scikit-feature/tree/master/skfeature/function/information_theoretical_based), but the number of features to select has to be passed to it.
Best regards,

from mifs.

shahlaebrahimi avatar shahlaebrahimi commented on September 16, 2024

@limi44 Sorry, is it possible to ask a question about the data division into test and train for feature selection?

from mifs.

limi44 avatar limi44 commented on September 16, 2024

@shahlaebrahimi Sure, I'll do my best to answer your question.

Regarding passing the number of features for MRMR: at least in the original paper by Peng, all features are ranked by MRMR, and the number of features is then chosen by computing the accuracy on each successively larger subset of the ranked features. You could define a stopping condition to end the search early, as Daniel has done, but you would have to select a threshold empirically. With the smaller subset of features, you can then apply an exhaustive feature selection (or whatever method you prefer) at a much lower computational cost than on the entire feature set.

The stopping condition defined in this code appears to be for JMI/JMIM, so I don't think it applies for MRMR.

from mifs.

shahlaebrahimi avatar shahlaebrahimi commented on September 16, 2024

@limi44 Thanks for sharing your knowledge and time.
I followed these steps for wrapper-based feature selection:

1-Load and prepare data
2-Apply NSGA II to randomly select features
3-Apply Patternnet (neural network) to do classification using the features selected in the previous step. This step is repeated, e.g., 20 times, and each time the data is randomly divided into train and test.
4-Define the fitness function of NSGA II as follows: minimize the number of selected features as well as the mean error resulting from Patternnet
5-Repeat steps 2, 3 and 4.

I have now read some points about how to divide data into train and test when doing feature selection here, here and here, so I am not sure whether I am doing this right. In particular, if a filter method is used to decrease the computational burden, should the filter be applied to the whole data set or only to the training data?
Best regards,

from mifs.

limi44 avatar limi44 commented on September 16, 2024

@shahlaebrahimi Unfortunately I'm not familiar with NSGA II so I can't comment on that. For dividing your data, it would make more sense to use k-fold cross-validation, which guarantees that every data point is part of the test set exactly once. The websites you've linked are just saying that you must be careful not to include data from your test set in your training set when estimating the error, as this can bias the results. What is often done is to search for the optimal number of features by running cross-validation for a different number of features each time, and then selecting the number of features that gives the lowest cross-validation error.

I'm not sure I understand your last question - filter methods will generally have a lower computational burden since they are computing some information (e.g. some metric of separability) regarding your data set that ideally will lead to selection of features with good performance. It is assumed that the computation of this information is less expensive than actually training the model and computing the performance. You will only apply the filter method on your training data to select features, and then choose the same features from your test set when evaluating performance.
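As a rough illustration of both points (select features on the training folds only, and pick the number of features by cross-validation), here is a minimal sketch using scikit-learn's Pipeline and GridSearchCV. The synthetic data, the univariate SelectKBest filter, the logistic regression classifier, and the candidate values of k are all placeholder assumptions; substitute your own filter and model.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline

# Synthetic stand-in for a real data set.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, random_state=0
)

# Because the filter is a pipeline step, it is re-fitted on the training part
# of every fold, and the held-out part is only transformed with the features
# chosen from the training part.
pipe = Pipeline([
    ("select", SelectKBest(score_func=mutual_info_classif)),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Try several numbers of features and keep the one with the best CV accuracy.
candidate_k = [2, 5, 10, 20]
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipe, param_grid={"select__k": candidate_k}, cv=cv)
search.fit(X, y)

best_k = search.best_params_["select__k"]
best_accuracy = search.best_score_
```

Running the filter once on the full data set before splitting would let information from the test folds leak into the feature choice, which is the bias described above.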

Hope that helps.

from mifs.

shahlaebrahimi avatar shahlaebrahimi commented on September 16, 2024

@limi44 Thank you very much for all your excellent responses.
Since my targets are discrete, I changed line 225 in mifs.py from:

feature_mi_matrix[s, F] = mi.get_mi_vector(self, F, s)

to:

if self.categorical:
    feature_mi_matrix[s, F] = mutual_info_.mutual_info_classif(self.X[:, F], self.X[:, S[-1]])
else:
    feature_mi_matrix[s, F] = mutual_info_.mutual_info_regression(self.X[:, F], self.X[:, S[-1]])

However, this error is reported:

C:\Users\Markazi.co\Anaconda3\python.exe D:/mifs-master_2/mifs-master/MRMR/untitled0.py
C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\validation.py:515: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
Traceback (most recent call last):
Auto selected feature #1 : 3, JMI: 0.012069001277461022
  File "D:/mifs-master_2/mifs-master/MRMR/untitled0.py", line 21, in <module>
    feat_selector.fit(X, y)
  File "D:\mifs-master_2\mifs-master\MRMR\mifs.py", line 137, in fit
    return self._fit(X, y)
  File "D:\mifs-master_2\mifs-master\MRMR\mifs.py", line 229, in _fit
    feature_mi_matrix[s, F] = mutual_info_.mutual_info_classif(self.X[:, F], self.X[:, S[-1]])
  File "D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py", line 436, in mutual_info_classif
    check_classification_targets(y)
  File "C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\multiclass.py", line 173, in check_classification_targets
    raise ValueError("Unknown label type: %r" % y)
ValueError: Unknown label type: array([-0.042465, -0.036266, -0.10856 , ...,  0.14536 ,  0.10278 ,
        0.037508])

Process finished with exit code 1

In fact, I load my dataset as follows:

X = np.array(pd.read_csv('my_X_table3.csv').values)
y = np.array(pd.read_csv('my_y_table3.csv').values.ravel())

When I change

feat_selector.fit(X, y)

to:

feat_selector.fit(X, y.values.ravel())

I encounter another error:

AttributeError: 'numpy.ndarray' object has no attribute 'values'

Thanks in advance.

from mifs.

shahlaebrahimi avatar shahlaebrahimi commented on September 16, 2024

@limi44 Hi
To tell you the truth, I found this but I could not figure it out.
Thanks.

from mifs.

limi44 avatar limi44 commented on September 16, 2024

@shahlaebrahimi You should be changing line 225 to:

feature_mi_matrix[s, F] = mutual_info_.mutual_info_regression(self.X[:, F], self.X[:,S[-1]])

You don't need to check self.categorical because when you choose the features after the first feature, you are computing the mutual information between selected features and candidate features (both of which are continuous, hence using mutual_info_regression), not between a continuous feature and a discrete target. When you try to use mutual_info_classif with continuous targets, you're getting an error that the target type is not valid (must be int or str).

For the DataConversionWarning, use np.ravel(y). The attribute 'values' only exists for the pandas dataframe, not for numpy arrays.
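A small sketch of the shape issue, using a made-up column vector in place of the CSV data (the variable names are only for illustration):

```python
import numpy as np

# A column vector of shape (4, 1), which is what triggers the warning.
y_col = np.array([[0], [1], [1], [0]])

# np.ravel flattens it to the 1-D shape (n_samples,) that sklearn expects.
y_flat = np.ravel(y_col)

# NumPy arrays have no .values attribute; that belongs to pandas objects,
# which is why y.values.ravel() raises AttributeError on an ndarray.
has_values = hasattr(y_flat, "values")
```

Here y_col.shape is (4, 1), y_flat.shape is (4,), and has_values is False.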

from mifs.

shahlaebrahimi avatar shahlaebrahimi commented on September 16, 2024

Hi @limi44
First, Thanks a lot. Then, I am so sorry.
If mutual_info is to be used for "JMI" as well as "JMIM", what changes would be necessary?
When "JMI" or "JMIM" is selected it works well on a data set including 60 features but it fails on a data set including 14 features!

C:\Users\Markazi.co\Anaconda3\python.exe D:/mifs-master_2/mifs-master/MRMR/untitled0.py
C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\validation.py:515: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
Auto selected feature #1 : 13, JMI: 0.012375003915936711
Auto selected feature #2 : 5, JMI: 0.058816897631613685
Auto selected feature #3 : 4, JMI: 0.05125188456854968
Auto selected feature #4 : 1, JMI: 0.043800257712073076
Auto selected feature #5 : 3, JMI: 0.04149319239081439
Auto selected feature #6 : 0, JMI: 0.03890485174541958
Auto selected feature #7 : 9, JMI: 0.023689493392932803
Auto selected feature #8 : 11, JMI: 0.022042420650845163
Auto selected feature #9 : 2, JMI: 0.01772030746447406
Auto selected feature #10 : 6, JMI: 0.013648756446829502
Auto selected feature #11 : 7, JMI: 0.011994961339341081
Auto selected feature #12 : 12, JMI: 0.007047141568888193
Auto selected feature #13 : 8, JMI: 0.005358582278689239
Auto selected feature #14 : 10, JMI: 0.0024456589423706987
Traceback (most recent call last):
  File "D:/mifs-master_2/mifs-master/MRMR/untitled0.py", line 21, in <module>
    feat_selector.fit(X, y)
  File "D:\mifs-master_2\mifs-master\MRMR\mifs.py", line 137, in fit
    return self._fit(X, y)
  File "D:\mifs-master_2\mifs-master\MRMR\mifs.py", line 228, in _fit
    feature_mi_matrix[s, F] = mutual_info_.mutual_info_regression(self.X[:, F], self.X[:, S[-1]])
  File "D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py", line 362, in mutual_info_regression
    copy, random_state)
  File "D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py", line 244, in _estimate_mi
    X, y = check_X_y(X, y, accept_sparse='csc', y_numeric=not discrete_target)
  File "C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\validation.py", line 510, in check_X_y
    ensure_min_features, warn_on_dtype, estimator)
  File "C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\validation.py", line 415, in check_array
    context))
ValueError: Found array with 0 feature(s) (shape=(1206, 0)) while a minimum of 1 is required.

Process finished with exit code 1

It should be noted that in both cases "auto" is passed as the number of features. However, since I use Python 3.5, I changed line 222 from:

while len(S) < self.n_features:

to:

while self.n_features == 'auto' or len(S) < self.n_features:

Seemingly, the error occurs when feature "0" is selected (the output above shows "Auto selected feature #6 : 0").
Thanks beforehand.

from mifs.

limi44 avatar limi44 commented on September 16, 2024

@shahlaebrahimi To get JMI/JMIM to use the sklearn versions of MI, instead of changing line 225, you will have to modify mi.get_mi_vector (which calls _get_mi). If you change _mi_dc to mutual_info_classif and _mi_cc to mutual_info_regression, that should work (although you'll have to play around with the syntax to make sure it runs properly).

As for the error you've shown above, there's a problem with your loop condition. You have an infinite loop if self.n_features == 'auto'. The only way to break from the loop is if the threshold condition is met at line 250:

if np.abs(np.mean(MI_dd[-5:])) < 1e-3

What is likely happening is that when you have 60 features, the decay rate of JMI stabilizes over the last 5 selected so the condition is met, and the loop is exited before you reach 60 features selected. When you have 14 features, over the last 5 selected features, the JMI is still changing. So the loop continues, and you get an error because you've already selected all the features.

To fix this, change your loop condition so that you don't continue selecting after all features are selected. Try this:

while (self.n_features == 'auto' or len(S) < self.n_features) and len(F) > 0:
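To see why the extra len(F) > 0 check matters, here is a toy, pure-Python sketch of the loop structure. It is not the mifs.py code: greedy_select, the fixed per-feature scores, and the use of plain first differences in place of MI_dd are all simplifying assumptions.

```python
def greedy_select(scores, n_features="auto", tol=1e-3, window=5):
    """Toy greedy loop: pick the highest-scoring remaining feature each round."""
    F = list(range(len(scores)))  # candidate feature indices
    S = []                        # selected feature indices
    history = []                  # score of each selected feature, in order

    # The len(F) > 0 guard ends the loop once every feature has been selected,
    # even if the early-stopping threshold below is never met.
    while (n_features == "auto" or len(S) < n_features) and len(F) > 0:
        best = max(F, key=lambda f: scores[f])
        S.append(best)
        F.remove(best)
        history.append(scores[best])

        # Stand-in for the threshold test: stop when the mean change over the
        # last `window` selections is close to zero.
        if n_features == "auto" and len(history) > window:
            diffs = [history[i] - history[i - 1] for i in range(1, len(history))]
            if abs(sum(diffs[-window:]) / window) < tol:
                break
    return S


# 14 features whose scores keep changing: the threshold is never met, so the
# loop has to end because F runs empty.
all_14 = greedy_select([0.14 - 0.01 * i for i in range(14)])

# 60 features whose scores flatten out quickly: the threshold stops the loop early.
early = greedy_select([0.5] + [0.0001 * (60 - i) for i in range(1, 60)])

# An explicit number of features works as before.
top_3 = greedy_select([0.3, 0.9, 0.1, 0.5], n_features=3)
```

Without the guard, the first case would keep looping with an empty candidate list, which corresponds to the "0 feature(s)" error in the traceback above.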

from mifs.

shahlaebrahimi avatar shahlaebrahimi commented on September 16, 2024

@limi44 Hi.
Special thanks to you for your good comments.
It seems MRMR does not select the first and last features. Specifically, for the data set with 14 features, it selects all features except those two. In fact, if I move the first and last features to other positions in my csv file, they are selected.
Thanks.

from mifs.

shahlaebrahimi avatar shahlaebrahimi commented on September 16, 2024

For "JMI" and "JMIM", I changed line 225 to:

feature_mi_matrix[s, F] = mutual_info_._compute_mi_cd(self.X[:, F], self.X[:, S[-1]], n_neighbors=k)

Unfortunately, it displays this error:

C:\Users\Markazi.co\Anaconda3\python.exe D:/mifs-master_2/mifs-master/MRMR/untitled0.py
C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\validation.py:515: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
Auto selected feature #1 : 3, JMI: 0.011804799410976452
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 15678 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:135: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 15678 but corresponding boolean dimension is 1206
  c = c[mask]
Auto selected feature #2 : 0, JMI: 0.0
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 14472 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
Auto selected feature #3 : 1, JMI: 0.0
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:135: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 14472 but corresponding boolean dimension is 1206
  c = c[mask]
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 13266 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:135: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 13266 but corresponding boolean dimension is 1206
  c = c[mask]
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 12060 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
Auto selected feature #4 : 2, JMI: 0.0
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 12060 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
Auto selected feature #5 : 4, JMI: 0.0
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 12060 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:135: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 12060 but corresponding boolean dimension is 1206
  c = c[mask]
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 10854 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:135: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 10854 but corresponding boolean dimension is 1206
  c = c[mask]
Auto selected feature #6 : 5, JMI: 0.0
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 9648 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 9648 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 9648 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
Auto selected feature #7 : 6, JMI: 0.0
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:135: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 9648 but corresponding boolean dimension is 1206
  c = c[mask]
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 8442 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 8442 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 8442 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:135: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 8442 but corresponding boolean dimension is 1206
  c = c[mask]
Auto selected feature #8 : 7, JMI: 0.0
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 7236 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 7236 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 7236 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 7236 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 7236 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 7236 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 7236 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 7236 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 7236 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:135: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 7236 but corresponding boolean dimension is 1206
  c = c[mask]
Auto selected feature #9 : 8, JMI: 0.0
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 6030 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 6030 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 6030 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 6030 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 6030 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 6030 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 6030 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:135: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 6030 but corresponding boolean dimension is 1206
  c = c[mask]
Auto selected feature #10 : 9, JMI: 0.0
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 4824 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 4824 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 4824 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:124: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 4824 but corresponding boolean dimension is 1206
  nn.fit(c[mask])
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py:135: VisibleDeprecationWarning: boolean index did not match indexed array along dimension 0; dimension is 4824 but corresponding boolean dimension is 1206
  c = c[mask]
Auto selected feature #11 : 10, JMI: 0.0

Process finished with exit code 0

Isn't it necessary to change line 207?
Thanks a lot.
Best regards,

from mifs.

shahlaebrahimi avatar shahlaebrahimi commented on September 16, 2024

@limi44 Sorry to ask again. In Python 3.5, I changed line 206 from:

for i, k in enumerate(xrange(k_min, k_max)):

to

for i, k in enumerate(range(k_min, k_max)):

Should it be changed to this instead?

for i, k in enumerate(list(range(k_min, k_max))):

Best regards,
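
On the `range` question above: in Python 3, `range` already returns an iterable that `enumerate` can consume directly, so wrapping it in `list()` should not change how the loop behaves. A minimal check, with `k_min` and `k_max` set to illustrative values (they are assumptions, not taken from mifs.py):

```python
# Illustrative values only; mifs.py computes its own k_min and k_max.
k_min, k_max = 3, 7

# enumerate() over the lazy range object (the Python 3 replacement for xrange)
lazy_pairs = list(enumerate(range(k_min, k_max)))

# enumerate() over an explicit list built from the same range
eager_pairs = list(enumerate(list(range(k_min, k_max))))

print(lazy_pairs)                 # [(0, 3), (1, 4), (2, 5), (3, 6)]
print(lazy_pairs == eager_pairs)  # True
```

Both forms yield the same (i, k) pairs, so `enumerate(range(k_min, k_max))` appears sufficient for this loop.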

from mifs.

shahlaebrahimi avatar shahlaebrahimi commented on September 16, 2024

@limi44 Is it possible to do "MRMR", "JMI", and "JMIM" with this toolbox?

from mifs.

limi44 avatar limi44 commented on September 16, 2024

@shahlaebrahimi Line 207 needs to be changed as I previously mentioned. Line 225 should be kept as it was originally:

feature_mi_matrix[s, F] = mi.get_mi_vector(self, F, S[-1])

You need to modify mi.py and replace the calls to the helper functions _mi_dc and _mi_cc with the corresponding functions from mutual_info_ (you will have to import it in mi.py as well). Do not use mutual_info_._compute_mi_cd.

You'll have to debug the Python 3 changes yourself, as I am only using Python 2. As for the toolbox you linked, it appears to have JMI and MRMR, but you'll have to look at its documentation to figure out how to use it.

from mifs.

shahlaebrahimi avatar shahlaebrahimi commented on September 16, 2024

@limi44 Hi. Thanks a lot.
I made these changes:
In mi.py:

import mutual_info_

line 34: from

MI = _mi_dc(joint, MI_FS.y, MI_FS.k)

to

MI = mutual_info_.mutual_info_classif(joint, MI_FS.y, MI_FS.k)

line 37: from

MI = _mi_cc(vars, MI_FS.k)

to

MI = mutual_info_.mutual_info_regression(vars, MI_FS.k)

line 41: from

MI = _mi_cc(vars, MI_FS.k)

to

MI = mutual_info_.mutual_info_regression(vars, MI_FS.k)

line 65: from

MI = _mi_dc(x, MI_FS.y, k)

to

MI = mutual_info_.mutual_info_classif(x, MI_FS.y, k)

line 68: from

MI = _mi_cc(vars, k)

to

MI = mutual_info_.mutual_info_regression(vars, k)

line 77:

def _mi_dc(x, y, k): -- unchanged

line 118:

def _mi_cc(variables, k=1): -- unchanged
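
A note on the replacement calls listed above: the traceback posted in this comment reports `discrete_features=5, n_neighbors=3` inside `mutual_info_classif`, which suggests that the third positional argument is bound to `discrete_features` rather than to `n_neighbors`. The sketch below uses a hypothetical stand-in function (`mutual_info_classif_stub`, which is not part of sklearn) whose parameter order is copied from that traceback, to show the difference between passing `k` positionally and by keyword.

```python
def mutual_info_classif_stub(X, y, discrete_features='auto', n_neighbors=3,
                             copy=True, random_state=None):
    """Hypothetical stand-in, NOT the sklearn function.

    Only the parameter order mirrors the one shown in the traceback;
    it returns the values the two parameters of interest were bound to.
    """
    return {'discrete_features': discrete_features,
            'n_neighbors': n_neighbors}


# Toy inputs, for illustration only
X = [[0.1, 0.2], [0.3, 0.4]]
y = [1, 0]
k = 5

# Third positional argument lands in discrete_features, not n_neighbors
positional = mutual_info_classif_stub(X, y, k)

# Passing k by keyword binds it to n_neighbors as intended
keyword = mutual_info_classif_stub(X, y, n_neighbors=k)

print(positional)  # {'discrete_features': 5, 'n_neighbors': 3}
print(keyword)     # {'discrete_features': 'auto', 'n_neighbors': 5}
```

If this reading is right, calling `mutual_info_.mutual_info_classif(joint, MI_FS.y, n_neighbors=MI_FS.k)` would likely avoid the `IndexError: index 5 is out of bounds`, and the `mutual_info_regression` calls would need the features and the target passed as separate arguments rather than as the tuple `vars`. Keep in mind that, as far as I know, the sklearn functions return one MI estimate per column of X rather than a single joint MI, so JMI and JMIM results may still differ from what the original helpers computed.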

In mifs.py:

import mutual_info_

Line 207: from

xy_MI [i, :] = mi.get_first_mi_vector(self, k)

to

if self.categorical:
    xy_MI[i, :] = mutual_info_.mutual_info_classif(self.X, self.y, n_neighbors=k)
else:
    xy_MI[i, :] = mutual_info_.mutual_info_regression(self.X, self.y, n_neighbors=k)

Line 225:
feature_mi_matrix[s, F] = mi.get_mi_vector(self, F, s) -- unchanged

However, the result was:

C:\Users\Markazi.co\Anaconda3\python.exe D:/mifs-master_2/mifs-master/MRMR/untitled0.py
C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\validation.py:515: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\validation.py:515: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\validation.py:515: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\validation.py:515: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\validation.py:515: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\validation.py:515: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\validation.py:515: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\validation.py:515: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
Auto selected feature #1 : 13, JMI: 0.011810446483446868
C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\validation.py:515: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\validation.py:515: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\validation.py:515: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\validation.py:515: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\validation.py:515: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\validation.py:515: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\validation.py:515: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\validation.py:515: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\validation.py:515: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\validation.py:515: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\validation.py:515: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\validation.py:515: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
C:\Users\Markazi.co\Anaconda3\lib\site-packages\sklearn\utils\validation.py:515: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "C:\Users\Markazi.co\Anaconda3\lib\site-packages\joblib\parallel.py", line 130, in __call__
    return self.func(*args, **kwargs)
  File "C:\Users\Markazi.co\Anaconda3\lib\site-packages\joblib\parallel.py", line 72, in __call__
    return [func(*args, **kwargs) for func, args, kwargs in self.items]
  File "C:\Users\Markazi.co\Anaconda3\lib\site-packages\joblib\parallel.py", line 72, in <listcomp>
    return [func(*args, **kwargs) for func, args, kwargs in self.items]
  File "D:\mifs-master_2\mifs-master\MRMR\mi.py", line 35, in _get_mi
    MI = mutual_info_.mutual_info_classif(joint, MI_FS.y, MI_FS.k)
  File "D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py", line 438, in mutual_info_classif
    copy, random_state)
  File "D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py", line 257, in _estimate_mi
    discrete_mask[discrete_features] = True
IndexError: index 5 is out of bounds for axis 0 with size 2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Markazi.co\Anaconda3\lib\multiprocessing\pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "C:\Users\Markazi.co\Anaconda3\lib\site-packages\joblib\parallel.py", line 140, in __call__
    raise TransportableException(text, e_type)
joblib.my_exceptions.TransportableException: TransportableException
___________________________________________________________________________
IndexError                                         Sun Sep 11 21:06:11 2016
PID: 1908            Python 3.5.1: C:\Users\Markazi.co\Anaconda3\python.exe
...........................................................................
C:\Users\Markazi.co\Anaconda3\lib\site-packages\joblib\parallel.py in __call__(self=<joblib.parallel.BatchedCalls object>)
     67     def __init__(self, iterator_slice):
     68         self.items = list(iterator_slice)
     69         self._size = len(self.items)
     70 
     71     def __call__(self):
---> 72         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        self.items = [(<function _get_mi>, (0, 0, <mifs.MutualInformationFeatureSelector object>), {})]
     73 
     74     def __len__(self):
     75         return self._size
     76 

...........................................................................
C:\Users\Markazi.co\Anaconda3\lib\site-packages\joblib\parallel.py in <listcomp>(.0=<list_iterator object>)
     67     def __init__(self, iterator_slice):
     68         self.items = list(iterator_slice)
     69         self._size = len(self.items)
     70 
     71     def __call__(self):
---> 72         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        func = <function _get_mi>
        args = (0, 0, <mifs.MutualInformationFeatureSelector object>)
        kwargs = {}
     73 
     74     def __len__(self):
     75         return self._size
     76 

...........................................................................
D:\mifs-master_2\mifs-master\MRMR\mi.py in _get_mi(f=0, s=0, MI_FS=<mifs.MutualInformationFeatureSelector object>)
     30     n, p = MI_FS.X.shape
     31     if MI_FS.method in ['JMI','JMIM']:
     32         # JMI & JMIM
     33         joint = MI_FS.X[:,(s, f)]
     34         if MI_FS.categorical:
---> 35             MI = mutual_info_.mutual_info_classif(joint, MI_FS.y, MI_FS.k)
        MI = undefined
        joint = array([[-0.042671, -0.042671],
       [-0.030316...0573,  0.070573],
       [ 0.037396,  0.037396]])
        MI_FS.y = array([[1],
       [0],
       [0],
       ..., 
       [0],
       [0],
       [0]], dtype=int64)
        MI_FS.k = 5
     36         else:
     37             vars = (joint, MI_FS.y)
     38             MI = mutual_info_.mutual_info_regression(vars, MI_FS.k)
     39     else:

...........................................................................
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py in mutual_info_classif(X=array([[-0.042671, -0.042671],
       [-0.030316...0573,  0.070573],
       [ 0.037396,  0.037396]]), y=array([[1],
       [0],
       [0],
       ..., 
       [0],
       [0],
       [0]], dtype=int64), discrete_features=5, n_neighbors=3, copy=True, random_state=None)
    433     .. [4] L. F. Kozachenko, N. N. Leonenko, "Sample Estimate of the Entropy
    434            of a Random Vector:, Probl. Peredachi Inf., 23:2 (1987), 9-16
    435     """
    436     check_classification_targets(y)
    437     return _estimate_mi(X, y, discrete_features, True, n_neighbors,
--> 438                         copy, random_state)
        copy = True
        random_state = None
    439 
    440 
    441 
    442 

...........................................................................
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py in _estimate_mi(X=array([[-0.042671, -0.042671],
       [-0.030316...0573,  0.070573],
       [ 0.037396,  0.037396]]), y=array([1, 0, 0, ..., 0, 0, 0], dtype=int64), discrete_features=array(5), discrete_target=True, n_neighbors=3, copy=True, random_state=None)
    252         discrete_mask.fill(discrete_features)
    253     else:
    254         discrete_features = np.asarray(discrete_features)
    255         if discrete_features.dtype != 'bool':
    256             discrete_mask = np.zeros(n_features, dtype=bool)
--> 257             discrete_mask[discrete_features] = True
        discrete_mask = array([False, False], dtype=bool)
        discrete_features = array(5)
    258         else:
    259             discrete_mask = discrete_features
    260 
    261     continuous_mask = ~discrete_mask

IndexError: index 5 is out of bounds for axis 0 with size 2
___________________________________________________________________________
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\Markazi.co\Anaconda3\lib\site-packages\joblib\parallel.py", line 727, in retrieve
    self._output.extend(job.get())
  File "C:\Users\Markazi.co\Anaconda3\lib\multiprocessing\pool.py", line 608, in get
    raise self._value
joblib.my_exceptions.TransportableException: TransportableException
___________________________________________________________________________
IndexError                                         Sun Sep 11 21:06:11 2016
PID: 1908            Python 3.5.1: C:\Users\Markazi.co\Anaconda3\python.exe
...........................................................................
C:\Users\Markazi.co\Anaconda3\lib\site-packages\joblib\parallel.py in __call__(self=<joblib.parallel.BatchedCalls object>)
     67     def __init__(self, iterator_slice):
     68         self.items = list(iterator_slice)
     69         self._size = len(self.items)
     70 
     71     def __call__(self):
---> 72         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        self.items = [(<function _get_mi>, (0, 0, <mifs.MutualInformationFeatureSelector object>), {})]
     73 
     74     def __len__(self):
     75         return self._size
     76 

...........................................................................
C:\Users\Markazi.co\Anaconda3\lib\site-packages\joblib\parallel.py in <listcomp>(.0=<list_iterator object>)
     67     def __init__(self, iterator_slice):
     68         self.items = list(iterator_slice)
     69         self._size = len(self.items)
     70 
     71     def __call__(self):
---> 72         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        func = <function _get_mi>
        args = (0, 0, <mifs.MutualInformationFeatureSelector object>)
        kwargs = {}
     73 
     74     def __len__(self):
     75         return self._size
     76 

...........................................................................
D:\mifs-master_2\mifs-master\MRMR\mi.py in _get_mi(f=0, s=0, MI_FS=<mifs.MutualInformationFeatureSelector object>)
     30     n, p = MI_FS.X.shape
     31     if MI_FS.method in ['JMI','JMIM']:
     32         # JMI & JMIM
     33         joint = MI_FS.X[:,(s, f)]
     34         if MI_FS.categorical:
---> 35             MI = mutual_info_.mutual_info_classif(joint, MI_FS.y, MI_FS.k)
        MI = undefined
        joint = array([[-0.042671, -0.042671],
       [-0.030316...0573,  0.070573],
       [ 0.037396,  0.037396]])
        MI_FS.y = array([[1],
       [0],
       [0],
       ..., 
       [0],
       [0],
       [0]], dtype=int64)
        MI_FS.k = 5
     36         else:
     37             vars = (joint, MI_FS.y)
     38             MI = mutual_info_.mutual_info_regression(vars, MI_FS.k)
     39     else:

...........................................................................
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py in mutual_info_classif(X=array([[-0.042671, -0.042671],
       [-0.030316...0573,  0.070573],
       [ 0.037396,  0.037396]]), y=array([[1],
       [0],
       [0],
       ..., 
       [0],
       [0],
       [0]], dtype=int64), discrete_features=5, n_neighbors=3, copy=True, random_state=None)
    433     .. [4] L. F. Kozachenko, N. N. Leonenko, "Sample Estimate of the Entropy
    434            of a Random Vector:, Probl. Peredachi Inf., 23:2 (1987), 9-16
    435     """
    436     check_classification_targets(y)
    437     return _estimate_mi(X, y, discrete_features, True, n_neighbors,
--> 438                         copy, random_state)
        copy = True
        random_state = None
    439 
    440 
    441 
    442 

...........................................................................
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py in _estimate_mi(X=array([[-0.042671, -0.042671],
       [-0.030316...0573,  0.070573],
       [ 0.037396,  0.037396]]), y=array([1, 0, 0, ..., 0, 0, 0], dtype=int64), discrete_features=array(5), discrete_target=True, n_neighbors=3, copy=True, random_state=None)
    252         discrete_mask.fill(discrete_features)
    253     else:
    254         discrete_features = np.asarray(discrete_features)
    255         if discrete_features.dtype != 'bool':
    256             discrete_mask = np.zeros(n_features, dtype=bool)
--> 257             discrete_mask[discrete_features] = True
        discrete_mask = array([False, False], dtype=bool)
        discrete_features = array(5)
    258         else:
    259             discrete_mask = discrete_features
    260 
    261     continuous_mask = ~discrete_mask

IndexError: index 5 is out of bounds for axis 0 with size 2
___________________________________________________________________________

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:/mifs-master_2/mifs-master/MRMR/untitled0.py", line 21, in <module>
    feat_selector.fit(X, y)
  File "D:\mifs-master_2\mifs-master\MRMR\mifs.py", line 137, in fit
    return self._fit(X, y)
  File "D:\mifs-master_2\mifs-master\MRMR\mifs.py", line 228, in _fit
    feature_mi_matrix[s, F] = mi.get_mi_vector(self, F, s)
  File "D:\mifs-master_2\mifs-master\MRMR\mi.py", line 25, in get_mi_vector
    MIs = Parallel(n_jobs=num_cores)(delayed(_get_mi)(f, s, MI_FS) for f in F)
  File "C:\Users\Markazi.co\Anaconda3\lib\site-packages\joblib\parallel.py", line 810, in __call__
    self.retrieve()
  File "C:\Users\Markazi.co\Anaconda3\lib\site-packages\joblib\parallel.py", line 757, in retrieve
    raise exception
joblib.my_exceptions.JoblibIndexError: JoblibIndexError
___________________________________________________________________________
Multiprocessing exception:
...........................................................................
D:\mifs-master_2\mifs-master\MRMR\untitled0.py in <module>()
     16 
     17 # define MI_FS feature selection method
     18  feat_selector = mifs.MutualInformationFeatureSelector(method='JMI', verbose=2,n_features ='auto')
     19  
     20 # find all relevant features
---> 21  feat_selector.fit(X, y)
     22  
     23 # check selected features
     24  feat_selector.support_
     25 

...........................................................................
D:\mifs-master_2\mifs-master\MRMR\mifs.py in fit(self=<mifs.MutualInformationFeatureSelector object>, X=array([[ -0.042671,  -0.051995,  -0.063643, ...,.....,   8.2168  ,   0.36536 ,
          0.23602 ]]), y=array([1, 0, 0, ..., 0, 0, 0], dtype=int64))
    132 
    133         y : array-like, shape = [n_samples]
    134             The target values.
    135         """
    136 
--> 137         return self._fit(X, y)
        self._fit = <bound method MutualInformationFeatureSelector._...f <mifs.MutualInformationFeatureSelector object>>
        X = array([[ -0.042671,  -0.051995,  -0.063643, ...,.....,   8.2168  ,   0.36536 ,
          0.23602 ]])
        y = array([1, 0, 0, ..., 0, 0, 0], dtype=int64)
    138 
    139 
    140     def transform(self, X):
    141         """

...........................................................................
D:\mifs-master_2\mifs-master\MRMR\mifs.py in _fit(self=<mifs.MutualInformationFeatureSelector object>, X=array([[ -0.042671,  -0.051995,  -0.063643, ...,.....,   8.2168  ,   0.36536 ,
          0.23602 ]]), y=array([1, 0, 0, ..., 0, 0, 0], dtype=int64))
    223         # ----------------------------------------------------------------------
    224 
    225         while (self.n_features == 'auto' or len(S) < self.n_features) and len(F) > 0:
    226             # loop through the remaining unselected features and calculate MI
    227             s = len(S) - 1
--> 228             feature_mi_matrix[s, F] = mi.get_mi_vector(self, F, s)
        feature_mi_matrix = array([[ nan,  nan,  nan, ...,  nan,  nan,  nan]...      [ nan,  nan,  nan, ...,  nan,  nan,  nan]])
        s = 0
        F = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
        self = <mifs.MutualInformationFeatureSelector object>
    229 
    230             # make decision based on the chosen FS algorithm
    231             fmm = feature_mi_matrix[:len(S),F]
    232             if self.method == 'JMI':

...........................................................................
D:\mifs-master_2\mifs-master\MRMR\mi.py in get_mi_vector(MI_FS=<mifs.MutualInformationFeatureSelector object>, F=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], s=0)
     20 
     21     This function is for when |S| > 1. s is the previously selected feature.
     22     We exploite the fact that this step is embarrassingly parallel.
     23     """
     24 
---> 25     MIs = Parallel(n_jobs=num_cores)(delayed(_get_mi)(f, s, MI_FS) for f in F)
        MIs = undefined
        s = 0
        MI_FS = <mifs.MutualInformationFeatureSelector object>
        F = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
     26     return MIs
     27 
     28 
     29 def _get_mi(f, s, MI_FS):

...........................................................................
C:\Users\Markazi.co\Anaconda3\lib\site-packages\joblib\parallel.py in __call__(self=Parallel(n_jobs=8), iterable=<generator object get_mi_vector.<locals>.<genexpr>>)
    805             if pre_dispatch == "all" or n_jobs == 1:
    806                 # The iterable was consumed all at once by the above for loop.
    807                 # No need to wait for async callbacks to trigger to
    808                 # consumption.
    809                 self._iterating = False
--> 810             self.retrieve()
        self.retrieve = <bound method Parallel.retrieve of Parallel(n_jobs=8)>
    811             # Make sure that we get a last message telling us we are done
    812             elapsed_time = time.time() - self._start_time
    813             self._print('Done %3i out of %3i | elapsed: %s finished',
    814                         (len(self._output), len(self._output),

---------------------------------------------------------------------------
Sub-process traceback:
---------------------------------------------------------------------------
IndexError                                         Sun Sep 11 21:06:11 2016
PID: 1908            Python 3.5.1: C:\Users\Markazi.co\Anaconda3\python.exe
...........................................................................
C:\Users\Markazi.co\Anaconda3\lib\site-packages\joblib\parallel.py in __call__(self=<joblib.parallel.BatchedCalls object>)
     67     def __init__(self, iterator_slice):
     68         self.items = list(iterator_slice)
     69         self._size = len(self.items)
     70 
     71     def __call__(self):
---> 72         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        self.items = [(<function _get_mi>, (0, 0, <mifs.MutualInformationFeatureSelector object>), {})]
     73 
     74     def __len__(self):
     75         return self._size
     76 

...........................................................................
C:\Users\Markazi.co\Anaconda3\lib\site-packages\joblib\parallel.py in <listcomp>(.0=<list_iterator object>)
     67     def __init__(self, iterator_slice):
     68         self.items = list(iterator_slice)
     69         self._size = len(self.items)
     70 
     71     def __call__(self):
---> 72         return [func(*args, **kwargs) for func, args, kwargs in self.items]
        func = <function _get_mi>
        args = (0, 0, <mifs.MutualInformationFeatureSelector object>)
        kwargs = {}
     73 
     74     def __len__(self):
     75         return self._size
     76 

...........................................................................
D:\mifs-master_2\mifs-master\MRMR\mi.py in _get_mi(f=0, s=0, MI_FS=<mifs.MutualInformationFeatureSelector object>)
     30     n, p = MI_FS.X.shape
     31     if MI_FS.method in ['JMI','JMIM']:
     32         # JMI & JMIM
     33         joint = MI_FS.X[:,(s, f)]
     34         if MI_FS.categorical:
---> 35             MI = mutual_info_.mutual_info_classif(joint, MI_FS.y, MI_FS.k)
        MI = undefined
        joint = array([[-0.042671, -0.042671],
       [-0.030316...0573,  0.070573],
       [ 0.037396,  0.037396]])
        MI_FS.y = array([[1],
       [0],
       [0],
       ..., 
       [0],
       [0],
       [0]], dtype=int64)
        MI_FS.k = 5
     36         else:
     37             vars = (joint, MI_FS.y)
     38             MI = mutual_info_.mutual_info_regression(vars, MI_FS.k)
     39     else:

...........................................................................
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py in mutual_info_classif(X=array([[-0.042671, -0.042671],
       [-0.030316...0573,  0.070573],
       [ 0.037396,  0.037396]]), y=array([[1],
       [0],
       [0],
       ..., 
       [0],
       [0],
       [0]], dtype=int64), discrete_features=5, n_neighbors=3, copy=True, random_state=None)
    433     .. [4] L. F. Kozachenko, N. N. Leonenko, "Sample Estimate of the Entropy
    434            of a Random Vector:, Probl. Peredachi Inf., 23:2 (1987), 9-16
    435     """
    436     check_classification_targets(y)
    437     return _estimate_mi(X, y, discrete_features, True, n_neighbors,
--> 438                         copy, random_state)
        copy = True
        random_state = None
    439 
    440 
    441 
    442 

...........................................................................
D:\mifs-master_2\mifs-master\MRMR\mutual_info_.py in _estimate_mi(X=array([[-0.042671, -0.042671],
       [-0.030316...0573,  0.070573],
       [ 0.037396,  0.037396]]), y=array([1, 0, 0, ..., 0, 0, 0], dtype=int64), discrete_features=array(5), discrete_target=True, n_neighbors=3, copy=True, random_state=None)
    252         discrete_mask.fill(discrete_features)
    253     else:
    254         discrete_features = np.asarray(discrete_features)
    255         if discrete_features.dtype != 'bool':
    256             discrete_mask = np.zeros(n_features, dtype=bool)
--> 257             discrete_mask[discrete_features] = True
        discrete_mask = array([False, False], dtype=bool)
        discrete_features = array(5)
    258         else:
    259             discrete_mask = discrete_features
    260 
    261     continuous_mask = ~discrete_mask

IndexError: index 5 is out of bounds for axis 0 with size 2
___________________________________________________________________________

Process finished with exit code 1
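
One reading of the traceback above, offered as a sketch rather than a confirmed diagnosis: the call mutual_info_classif(joint, MI_FS.y, MI_FS.k) passes k positionally, so inside the callee it shows up as discrete_features=5 while n_neighbors keeps its default of 3. The integer 5 is then used as a column index into a mask of length 2. The helper build_discrete_mask below is a hypothetical stand-in that copies only the masking logic visible in the traceback; it is not the sklearn function itself.

```python
import numpy as np


def build_discrete_mask(discrete_features, n_features):
    """Hypothetical stand-in for the masking step shown in the traceback."""
    discrete_features = np.asarray(discrete_features)
    if discrete_features.dtype == bool:
        # Already a per-column boolean mask.
        return discrete_features
    # Anything else is interpreted as column indices of discrete features.
    mask = np.zeros(n_features, dtype=bool)
    mask[discrete_features] = True
    return mask


k = 5  # the value of MI_FS.k in the traceback
try:
    # k lands in the discrete_features slot because it was passed positionally.
    build_discrete_mask(k, n_features=2)
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(error_message)
```

If this reading is right, passing the value by keyword (n_neighbors=MI_FS.k) would bind it to the intended parameter and avoid the out-of-bounds index.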

Sorry again,
Regarding how to make MRMR work: MRMR does not seem to select the first and last features. Specifically, for a data set with 14 features, it selects every feature except those two. If I move the first and last features to different positions in my CSV file, they are selected.

Kind regards,

from mifs.

nmayorov avatar nmayorov commented on September 16, 2024

Regarding using sklearn functions for JMI/JMIM: this can't be done, because they were designed to compute MI between univariate random variables, whereas JMI/JMIM require computing MI involving a 2-variate variable (named joint in the code).

This is not a limitation of the algorithms used; it is just how it was agreed to introduce them into sklearn (to match its univariate feature selection scheme).
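
To make the univariate versus 2-variate distinction concrete, here is a minimal sketch on discrete toy data. It uses a simple plug-in estimator, not the kNN estimator used in this repo or in sklearn, and the helper discrete_mi and the XOR data are made up for illustration. With y = x1 XOR x2, each feature alone carries no information about y, while the pair treated as one 2-variate variable determines y completely.

```python
from collections import Counter
from math import log


def discrete_mi(a, b):
    """Plug-in mutual information (in nats) between two sequences of hashable symbols."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    count_ab = Counter(zip(a, b))
    # Sum of p(x, y) * log(p(x, y) / (p(x) * p(y))) over observed pairs.
    return sum(
        c / n * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in count_ab.items()
    )


# Toy data: every combination of (x1, x2) appears equally often.
x1 = [0, 0, 1, 1] * 25
x2 = [0, 1, 0, 1] * 25
y = [u ^ v for u, v in zip(x1, x2)]

# The 2-variate variable: each sample is the pair (x1, x2).
joint = list(zip(x1, x2))

mi_x1 = discrete_mi(x1, y)        # univariate: x1 vs y
mi_x2 = discrete_mi(x2, y)        # univariate: x2 vs y
mi_joint = discrete_mi(joint, y)  # 2-variate: (x1, x2) vs y

print(mi_x1, mi_x2, mi_joint)
```

A routine that only scores one column at a time would report both features as uninformative here, which is why the joint term in JMI/JMIM cannot be rebuilt from per-column scores.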


As for negative MI: the algorithms in sklearn may compute negative values, in which case the value is replaced by 0 (which is reasonable); in this code it is replaced by nan.
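
A small sketch of the two conventions for negative estimates, under the assumption that a caller later picks the best-scoring feature with a plain maximum. The numbers and helper names are hypothetical, and this is not how either code base is known to select features; it only shows why NaN needs more care downstream than 0.

```python
import math


def clip_to_zero(mi_estimate):
    """Convention described for sklearn: negative estimates become 0."""
    return max(0.0, mi_estimate)


def negative_to_nan(mi_estimate):
    """Convention described for this repo: negative estimates become NaN."""
    return mi_estimate if mi_estimate >= 0 else math.nan


# Hypothetical raw MI estimates for three features; the first is slightly negative.
raw_estimates = [-0.02, 0.15, 0.07]

clipped = [clip_to_zero(v) for v in raw_estimates]
with_nan = [negative_to_nan(v) for v in raw_estimates]

# Naive "argmax" over feature indices.
best_clipped = max(range(len(clipped)), key=lambda i: clipped[i])
best_with_nan = max(range(len(with_nan)), key=lambda i: with_nan[i])

print(best_clipped, best_with_nan)
```

With clipping, the naive maximum picks the feature with the largest estimate. With NaN in first position, every comparison against it is false, so the naive maximum stays on the NaN entry unless the caller filters NaNs explicitly.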


Generally the code in sklearn is more polished and tested, but I can't claim that it necessarily does a better job than the code in this repo (although I would trust it more).

@shahlaebrahimi I suggest you try the code from this repo, with two changes:

  1. Change here to return 0. I suspect the current code is designed to work with nans correctly, but I feel that changing to 0 is safer.
  2. As @limi44 noted, this line should be changed to feature_mi_matrix[s, F] = mi.get_mi_vector(self, F, S[-1]).
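
The second change can be illustrated with a toy sketch. It assumes, as the traceback suggests but does not prove, that s is the loop counter of the selection step while S holds the indices of the features chosen so far and F the remaining candidates. The concrete values are hypothetical.

```python
n_features = 14

# Hypothetical state after the first selection step: feature 13 was picked.
S = [13]
F = [f for f in range(n_features) if f not in S]

# Loop counter of the current selection step (assumed meaning of s).
s = 0

# Pairing candidates with the loop counter instead of the last selected feature.
pairs_with_counter = [(s, f) for f in F]

# Pairing candidates with the last selected feature, as in the suggested fix.
pairs_with_last = [(S[-1], f) for f in F]

# A pair of identical indices means a feature is compared with itself.
degenerate = [pair for pair in pairs_with_counter if pair[0] == pair[1]]

print(degenerate)
```

Under these assumptions the counter-based pairing contains the pair (0, 0), which matches the joint array with two identical columns in the traceback, while pairing with S[-1] never compares a candidate with itself.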

But if I were you, I would probably dig deeper into the algorithms and either implement a version I was certain about or thoroughly check the provided implementation.

from mifs.

limi44 avatar limi44 commented on September 16, 2024

@nmayorov Thanks for the tip regarding JMI/JMIM! And thanks for implementing the MI methods in sklearn; I'd love to see MI-based feature selection eventually integrated as well. You are also correct about the change to line 225: I made a typo in my previous comment, which I have now corrected.

From what I understand, the approximations made in computing MI can sometimes produce negative estimates, even though MI is non-negative by definition. With the MI methods in this repo, I was getting many NaN values (negative MIs), which is why I switched to the sklearn methods, which at least gave me non-negative values.

@shahlaebrahimi I agree with @nmayorov: you should try to understand how JMI/JMIM work before debugging the code; otherwise you won't be able to tell whether the problem is syntax-related or lies in the implementation. Regarding your comment that the first and last features are not selected by MRMR, keep in mind that Python data structures are zero-indexed, so the selected features are numbered 0-13, not 1-14.

from mifs.

danielhomola avatar danielhomola commented on September 16, 2024

Hi everyone,

Very sorry for my long absence; life got a bit busy. This thread exploded and there's no way I can read it all through now.

@nmayorov I tried to get the sklearn guys to incorporate JMI-based filter methods into the feature_selection module, and the author of a very comprehensive review paper on the topic also chimed in, but I'm not sure they were convinced. See here: scikit-learn/scikit-learn#6313 (comment)

I'd be happy to work with you on integrating JMI-based FS into an sklearn module, if you're interested. The methods behind these algorithms seem really well established and studied.

from mifs.

danielhomola avatar danielhomola commented on September 16, 2024

Can you please try the latest version of the code and report back if you still encounter the bug? Thanks!

from mifs.
