mihifepe

Overview

mihifepe (Model Interpretability via Hierarchical Feature Perturbation) is a library implementing a model-agnostic method that takes a learned model and a hierarchy over its features and (i) tests feature groups as well as base features, to determine the level of resolution at which important features can be identified, (ii) uses hypothesis testing to rigorously assess the effect of each feature on the model's loss, (iii) employs a hierarchical approach to control the false discovery rate when testing feature groups and individual base features for importance, and (iv) uses hypothesis testing to identify important interactions among features and feature groups. mihifepe is based on the following paper:
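As a rough illustration of points (i) and (ii), and not of mihifepe's actual API, the sketch below shuffles each feature group jointly across instances, measures the per-instance change in loss, and applies a one-sided sign-flip permutation test to those changes. The toy model, the group names, and the choice of test are all assumptions made for this example, and the hierarchical false discovery rate control of point (iii) is not shown.

```python
import random

def model_predict(row):
    # Toy "learned model" (assumption): depends on features 0 and 1 only.
    return 2.0 * row[0] + 1.0 * row[1]

def squared_losses(data, targets):
    """Per-instance squared-error loss of the toy model."""
    return [(model_predict(x) - y) ** 2 for x, y in zip(data, targets)]

def shuffle_group(data, group, rng):
    """Permute the given feature columns jointly across instances."""
    perm = list(range(len(data)))
    rng.shuffle(perm)
    perturbed = [list(row) for row in data]
    for i, src in enumerate(perm):
        for j in group:
            perturbed[i][j] = data[src][j]
    return perturbed

def sign_flip_pvalue(diffs, rng, n_resamples=2000):
    """One-sided paired test: is the mean loss increase greater than zero?"""
    observed = sum(diffs) / len(diffs)
    hits = 0
    for _ in range(n_resamples):
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs) / len(diffs)
        if flipped >= observed:
            hits += 1
    return (hits + 1) / (n_resamples + 1)

def group_pvalue(data, targets, group, seed=0):
    """p-value for the loss increase caused by shuffling one feature group."""
    rng = random.Random(seed)
    baseline = squared_losses(data, targets)
    perturbed = squared_losses(shuffle_group(data, group, rng), targets)
    diffs = [p - b for p, b in zip(perturbed, baseline)]
    return sign_flip_pvalue(diffs, rng)

# Synthetic data: four features, of which only the first two affect the target.
data_rng = random.Random(42)
data = [[data_rng.gauss(0, 1) for _ in range(4)] for _ in range(200)]
targets = [model_predict(x) for x in data]

# Hypothetical two-group hierarchy over the four base features.
hierarchy = {"relevant_group": [0, 1], "irrelevant_group": [2, 3]}
pvalues = {name: group_pvalue(data, targets, cols) for name, cols in hierarchy.items()}
for name in sorted(pvalues):
    print("{}: p = {:.4f}".format(name, pvalues[name]))
```

In this toy setup, the group containing the features the model uses gets a small p-value, while shuffling the unused group leaves the loss unchanged.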

Lee, Kyubin, Akshay Sood, and Mark Craven. 2019. “Understanding Learned Models by Identifying Important Features at the Right Resolution.” In Proceedings of the AAAI Conference on Artificial Intelligence, 33:4155–63. https://doi.org/10.1609/aaai.v33i01.33014155.

Documentation

https://mihifepe.readthedocs.io

Installation

The recommended installation method is via a virtual environment and pip. You also need graphviz installed on your system.

When making the virtual environment, specify python3 as the python executable (python3 version must be 3.5+):

mkvirtualenv -p python3 mihifepe_env

To install the latest stable release:

pip install mihifepe

Or to install the latest development version from GitHub:

pip install git+https://github.com/Craven-Biostat-Lab/mihifepe.git@master#egg=mihifepe

On Ubuntu, graphviz may be installed by:

sudo apt-get install graphviz
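To confirm that the prerequisites are in place before installing, a small check along these lines may help. It is only a sketch: it assumes that finding graphviz's `dot` executable on the PATH is a sufficient test for a graphviz installation, and the function name is made up for this example.

```python
import shutil
import sys

def check_prerequisites(min_python=(3, 5)):
    """Return a dict describing whether each prerequisite appears to be met."""
    return {
        "python": sys.version_info[:2] >= min_python,
        # `dot` is the layout program shipped with graphviz.
        "graphviz": shutil.which("dot") is not None,
    }

status = check_prerequisites()
for name, ok in sorted(status.items()):
    print("{}: {}".format(name, "ok" if ok else "missing"))
```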

Development

https://mihifepe.readthedocs.io/en/latest/contributing.html

Usage

https://mihifepe.readthedocs.io/en/latest/usage.html

License

mihifepe is free, open source software, released under the MIT license. See LICENSE for details.

Contact

Akshay Sood

mihifepe's Issues

Fix shuffling perturbations for interactions

Currently, shuffling perturbations permute instances independently for each feature being perturbed. This is fine for feature importance analysis, but for interaction analysis, the joint shuffling perturbation for an interaction pair should use the same permutations as those used for its component features.
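One possible way to get consistent permutations is to derive each node's permutation deterministically from its name, so that a joint perturbation reuses exactly the permutations applied to the components on their own. The sketch below assumes this seeding scheme; the function names and data layout are hypothetical and are not taken from mihifepe.

```python
import random

def node_permutation(node_name, num_instances, base_seed=0):
    """Deterministic permutation of instance indices for one hierarchy node."""
    rng = random.Random("{}:{}".format(base_seed, node_name))
    perm = list(range(num_instances))
    rng.shuffle(perm)
    return perm

def shuffle_nodes(data, nodes, base_seed=0):
    """Shuffle each node's columns with that node's own permutation.

    `nodes` maps a node name to the list of column indices it covers.
    """
    perturbed = [list(row) for row in data]
    for name, columns in nodes.items():
        perm = node_permutation(name, len(data), base_seed)
        for i, src in enumerate(perm):
            for j in columns:
                perturbed[i][j] = data[src][j]
    return perturbed

# Six instances, four features; node A covers columns 0-1, node B covers column 2.
data = [[10 * i + j for j in range(4)] for i in range(6)]
only_a = shuffle_nodes(data, {"A": [0, 1]})
only_b = shuffle_nodes(data, {"B": [2]})
joint = shuffle_nodes(data, {"A": [0, 1], "B": [2]})
```

Because each permutation depends only on the node name and the base seed, the columns of `joint` match those of `only_a` and `only_b` for the respective nodes.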

Make temporal features optional and/or throw and catch an exception rather than throwing an error when perturbation=shuffling

Currently, temporal features are required in the input data file, and if the perturbation mode is set to shuffling, perturb_temporal_data throws a ValueError that halts execution. This is a problem for use cases where temporal features don't apply, or where one wants to shuffle instead of zeroing out. It would be great if temporal features were optional in the input file. Until shuffling of temporal data is implemented, perturb_temporal_data() could throw an exception at line 210 instead of an error, to be caught in perturb_features_for_record() near line 170, at the call tdata = self.perturb_temporal_data(feature, temporal_data).
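The suggestion could look roughly like the sketch below. The two functions are simplified stand-ins that only borrow the names mentioned in this issue; the exception class, the signatures, and the choice to fall back to unperturbed temporal data when the exception is caught are all assumptions made for illustration.

```python
class UnsupportedPerturbationError(Exception):
    """Hypothetical exception: the requested perturbation is not implemented."""

def perturb_temporal_data(temporal_data, perturbation):
    # Stand-in for the library method; only zeroing is supported in this sketch.
    if perturbation == "zeroing":
        return [[0.0] * len(sequence) for sequence in temporal_data]
    raise UnsupportedPerturbationError(
        "perturbation '{}' is not supported for temporal data".format(perturbation))

def perturb_features_for_record(temporal_data, perturbation):
    """Perturb a record's temporal data, tolerating absent or unsupported cases."""
    if temporal_data is None:
        # Temporal features are optional: nothing to perturb.
        return None
    try:
        return perturb_temporal_data(temporal_data, perturbation)
    except UnsupportedPerturbationError:
        # Assumed fallback: keep the temporal data unperturbed instead of halting.
        return temporal_data

zeroed = perturb_features_for_record([[1.0, 2.0]], "zeroing")
skipped = perturb_features_for_record([[1.0, 2.0]], "shuffling")
absent = perturb_features_for_record(None, "shuffling")
```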

Error while running -analyze_interactions and -condor together

The code may error out when -analyze_interactions and -condor are enabled together.
To reproduce: disable the -cleanup flag in simulation.py, then run 'pytest --basetemp=condor_test_outputs tests/condor_tests'
Cause:

  • pipelines.py->SerialPipeline->features_filename assignment ignored by condor helper that writes features to file
  • Condor helper compile_results picks up previously computed non-interaction results due to wildcard match

Fix:

  • Change condor helper writer to accept input filename arg given serial mode
  • Change condor helper result compiler to only process correct result file given serial mode
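The wildcard problem and the second fix can be illustrated with a small sketch. The file names and the pattern below are hypothetical and do not reflect the names the condor helper actually uses; the point is only that a broad pattern also matches results left over from an earlier, non-interaction run, while a pattern restricted to the current run does not.

```python
import fnmatch

def matching_results(filenames, prefix=None):
    """Return result files, restricted to one run's prefix when it is given."""
    if prefix is None:
        pattern = "results_*.csv"
    else:
        pattern = "results_{}_*.csv".format(prefix)
    return sorted(fnmatch.filter(filenames, pattern))

# Hypothetical contents of an output directory after two runs.
output_dir_listing = [
    "results_features_0.csv",
    "results_features_1.csv",
    "results_interactions_0.csv",
]

broad = matching_results(output_dir_listing)
narrow = matching_results(output_dir_listing, prefix="interactions")
print(broad)
print(narrow)
```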

Return model instance rather than the model class

Method load_model() in worker.py imports the user-supplied model generation module and returns the model class found in the module. This causes an issue if the user-implemented model.predict() and model.loss() are instance methods (for example, because they rely on instance variables) rather than static or class methods. I would suggest instantiating the model after line 107, e.g. model_inst = model(), and returning model_inst to cover both scenarios, since model instances can still call static and class methods.
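The difference can be seen in a minimal, self-contained example. The Model class and the two loader functions below are hypothetical stand-ins, not the code in worker.py.

```python
class Model:
    """Hypothetical user-supplied model that keeps state on the instance."""

    def __init__(self):
        self.weight = 2.0

    def predict(self, x):
        # Instance method: needs `self` to read the instance variable.
        return self.weight * x

    @staticmethod
    def loss(y_true, y_pred):
        return (y_true - y_pred) ** 2

def load_model_class():
    """Mimics the current behavior: return the class itself."""
    return Model

def load_model_instance():
    """Mimics the suggested behavior: return an instance."""
    return Model()

model_cls = load_model_class()
try:
    model_cls.predict(3.0)
    class_call_failed = False
except TypeError:
    # 3.0 is bound to `self`, so the required argument `x` is missing.
    class_call_failed = True

model_inst = load_model_instance()
prediction = model_inst.predict(3.0)
# Static methods remain callable through the instance.
error = model_inst.loss(6.0, prediction)
```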

Populate effect size for regression output

Currently, the effect size column in the pvalues.csv output file is empty for regression models. We discussed populating it with |MSE(baseline) - MSE(perturbation)|. It would be great if a similar field could also be provided for interaction tests.
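The proposed quantity is straightforward to compute. The sketch below uses made-up function names and toy numbers, and does not show how the value would be written to pvalues.csv.

```python
def mean_squared_error(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def effect_size(y_true, baseline_pred, perturbed_pred):
    """|MSE(baseline) - MSE(perturbation)| for a regression model."""
    baseline_mse = mean_squared_error(y_true, baseline_pred)
    perturbed_mse = mean_squared_error(y_true, perturbed_pred)
    return abs(baseline_mse - perturbed_mse)

y_true = [1.0, 2.0, 3.0]
baseline_pred = [1.0, 2.0, 3.0]   # perfect predictions, MSE = 0
perturbed_pred = [2.0, 2.0, 2.0]  # errors of -1, 0, 1, MSE = 2/3
size = effect_size(y_true, baseline_pred, perturbed_pred)
```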
