Grouped Permutation Importance

Understanding how a model arrives at its decisions is an essential step in most machine learning work. In this context, analyzing predefined groups of features can provide important clues for understanding and improving predictions. This repository extends univariate permutation importance to a grouped version that evaluates the influence of whole feature subsets on a machine learning model. It does so through a slight modification of scikit-learn's permutation importance.
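The core idea can be illustrated with a dependency-free sketch. It is not the repository's implementation: the function name `grouped_permutation_importance_sketch`, the toy data, and the toy `predict` rule are all hypothetical, and the sketch assumes that every column of a group is shuffled with one shared row permutation and that importance is measured as the mean drop in accuracy.

```python
import random


def grouped_permutation_importance_sketch(predict, X, y, groups, n_repeats=10, seed=0):
    """Mean accuracy drop when all columns of a group are permuted together.

    Hypothetical helper for illustration only, not the library's API.
    """
    rng = random.Random(seed)

    def accuracy(rows):
        return sum(predict(r) == t for r, t in zip(rows, y)) / len(y)

    baseline = accuracy(X)
    importances = []
    for cols in groups:
        drops = []
        for _ in range(n_repeats):
            # One shared row order for every column in the group, so the
            # group's internal structure survives while its link to y breaks.
            order = list(range(len(X)))
            rng.shuffle(order)
            permuted = [list(row) for row in X]
            for i, src in enumerate(order):
                for c in cols:
                    permuted[i][c] = X[src][c]
            drops.append(baseline - accuracy(permuted))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy data (assumption): the label depends only on columns 0 and 1.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1) for _ in range(4)] for _ in range(200)]
y = [int(row[0] + row[1] > 0) for row in X]


def predict(row):
    # Toy "model" that uses only the first group of features.
    return int(row[0] + row[1] > 0)


importances = grouped_permutation_importance_sketch(
    predict, X, y, groups=[[0, 1], [2, 3]]
)
print(importances)
```

Because the toy model ignores columns 2 and 3, permuting that group leaves every prediction unchanged, while permuting columns 0 and 1 together removes most of the model's accuracy.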

Install via pip

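pip install git+https://github.com/lucasplagwitz/grouped_permutation_importance

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

from grouped_permutation_importance import grouped_permutation_importance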

data = load_breast_cancer()
feature_names = data["feature_names"].tolist()
X, y = data["data"], data["target"]

# Build one list of column indices per feature group ("mean", "error", "worst").
idxs = []
columns = ["mean", "error", "worst"]
for key in columns:
    idxs.append([i for i, name in enumerate(feature_names) if key in name])

# Repeated stratified cross-validation with scikit-learn's defaults (5 folds, 10 repeats).
cv = RepeatedStratifiedKFold()
pipe = Pipeline([("MinMax", MinMaxScaler()), ("SVC", SVC())])

r = grouped_permutation_importance(pipe, X, y, idxs=idxs, n_repeats=50, random_state=0,
                                   scoring="balanced_accuracy", n_jobs=5, cv=cv,
                                   perm_set="test")

Simulation

The file "examples/make_class.py" contains a small simulation that verifies correctness. It uses scikit-learn's make_classification function to generate data and analyzes different informative feature subsets.
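A simulation of this kind might look like the sketch below. It is not the code from "examples/make_class.py": the group layout, the logistic regression model, the train/test split, and the hand-written permutation loop are all assumptions chosen for illustration. It relies on make_classification placing the informative columns first when shuffle=False.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# shuffle=False keeps the column order fixed: the 3 informative features
# come first, followed by 6 pure-noise features.
X, y = make_classification(n_samples=500, n_features=9, n_informative=3,
                           n_redundant=0, n_repeated=0, n_clusters_per_class=1,
                           class_sep=2.0, shuffle=False, random_state=0)

# Assumed grouping: one informative group and two noise groups.
groups = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)
model = LogisticRegression().fit(X_train, y_train)
baseline = model.score(X_test, y_test)

rng = np.random.default_rng(0)
importances = []
for cols in groups:
    drops = []
    for _ in range(20):
        # Permute all columns of the group with the same row order.
        X_perm = X_test.copy()
        X_perm[:, cols] = X_test[rng.permutation(len(X_test))][:, cols]
        drops.append(baseline - model.score(X_perm, y_test))
    importances.append(float(np.mean(drops)))

print(importances)
```

If the grouped importance behaves as intended, the first group should receive a clearly larger score than the two noise groups, whose scores should stay close to zero.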

Model interpretation

The file "examples/brain_atlas.py" demonstrates a neuroimaging example that rates brain regions by their importance for predicting different target variables (age, CDR, biological sex).

Citing

If you use the Grouped Permutation Importance in a scientific publication, we would appreciate citations to the following paper:

Lucas Plagwitz, Alexander Brenner, Michael Fujarski, and Julian Varghese. Supporting AI-Explainability by Analyzing Feature Subsets in a Machine Learning Model.
Studies in Health Technology and Informatics, Volume 294: Challenges of Trustable AI and Added-Value on Health. doi:10.3233/SHTI220406
