mlbugdetection's Introduction

MLBugDetection

Machine learning explainability and unexpected behavior detection

Overview

Most machine learning explainability packages require both the trained model and the training data to create Explainer objects that explain the model's behavior. This package allows ceteris paribus analysis of features (varying one feature while holding all others fixed) using only the trained model and one or more input samples.
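
To make the idea of ceteris paribus analysis concrete, here is a minimal standard-library sketch. It is not the package's implementation: `toy_model`, `ceteris_paribus_curve`, the `smoothness` feature, and the coefficients are all hypothetical stand-ins chosen for illustration. Only the sample and a prediction function are needed, with no training data.

```python
import math


def toy_model(sample):
    """Hypothetical stand-in for a trained model: returns P(class = 1)."""
    z = 0.004 * sample["area_worst"] - 0.5 * sample["smoothness"] - 2.0
    return 1.0 / (1.0 + math.exp(-z))


def ceteris_paribus_curve(predict, sample, feature, start, stop, step=1):
    """Vary one feature over [start, stop) while holding the others fixed."""
    curve = []
    value = start
    while value < stop:
        modified = dict(sample)  # copy so the original sample is untouched
        modified[feature] = value
        curve.append((value, predict(modified)))
        value += step
    return curve


# One input sample; only "area_worst" is varied, "smoothness" stays fixed.
sample = {"area_worst": 700.0, "smoothness": 0.1}
curve = ceteris_paribus_curve(toy_model, sample, "area_worst", 0, 2000, step=500)

for value, prob in curve:
    print(f"area_worst={value:>6}: p={prob:.3f}")
```

The resulting list of (feature value, prediction) pairs is the kind of curve that checks such as monotonicity can then be applied to.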

Documentation

Installation:

pip install mlbugdetection

How to use

Functions

  1. Monotonic:

    from mlbugdetection.monotonic import check_monotonicity_single_sample, check_monotonicity_multiple_samples

    Usage: For 1 sample

    check_monotonicity_single_sample(model, sample, feature, start, stop, step=1)

    For more than 1 sample:

    check_monotonicity_multiple_samples(model, samples, feature, start, stop, step=1)
  2. Critical Values:

    from mlbugdetection.critical_values import find_critical_values, find_several_critical_values

    Usage: For 1 sample

    find_critical_values(model, sample, feature, start, stop, step=1)

    For more than 1 sample:

    find_several_critical_values(model, samples, feature, start, stop, step=1, bins=15, keep_n=5, log=False)
  3. Calibration:

    from mlbugdetection.calibration import calibration_check

    Usage:

    calibration_check(model, samples, target, pos_label=1)
  4. Sanity:

    from mlbugdetection.sanity import sanity_check, sanity_check_with_indexes

    Usage:

    sanity_check(model, samples, target)

    Same as sanity_check, but returns the indexes of the failed samples:

    sanity_check_with_indexes(model, samples, target)
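
As a rough illustration of what a monotonicity check over a range of feature values might look at, the sketch below works on a plain list of predictions. It is an assumption-laden stand-in, not the package's code: `is_monotonic` and `largest_jump` are hypothetical helpers, and treating the largest change between consecutive predictions as a "critical value" is an assumption made here, since the README does not define the term.

```python
def is_monotonic(values, tolerance=0.0):
    """True if the sequence never decreases or never increases."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    non_decreasing = all(d >= -tolerance for d in diffs)
    non_increasing = all(d <= tolerance for d in diffs)
    return non_decreasing or non_increasing


def largest_jump(xs, ys):
    """Return (x_before, x_after, delta) for the largest absolute change."""
    best = max(range(len(ys) - 1), key=lambda i: abs(ys[i + 1] - ys[i]))
    return xs[best], xs[best + 1], ys[best + 1] - ys[best]


# Hypothetical predictions obtained by sweeping one feature over xs.
xs = [0, 10, 20, 30, 40]
smooth = [0.10, 0.20, 0.35, 0.60, 0.90]
bumpy = [0.10, 0.40, 0.30, 0.70, 0.90]

print(is_monotonic(smooth))  # True
print(is_monotonic(bumpy))   # False
print(largest_jump(xs, bumpy)[:2])  # (20, 30)
```

A sequence like `bumpy`, where the prediction rises, dips, and rises again as a single feature grows, is the kind of behavior worth inspecting when the feature is expected to have a one-directional effect.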

Virtual Environment with Jupyter Notebook

python3 -m virtualenv venv
source venv/bin/activate
python -m pip install --upgrade pip
pip install -r requirements.txt

mlbugdetection's People

Contributors

fbarth, gleviosa, jpgianfaldoni, williamars

Forkers

pfeinsper

mlbugdetection's Issues

Does check_monotonicity accept a DataFrame sample with more than one row?

I tested r = check_monotonicity(model, X_train, 'area_worst', 0, 5000, steps=10000), where X_train is a DataFrame with 5 rows. The function ran without raising any exception.

The generated plot did not look correct to me. It did not seem to summarize the attribute's behavior well across the different examples.

When running:

for i in range(0,5):
    row = X_train.loc[i:i]
    r = check_monotonicity(model, row, 'area_worst', 0, 5000, steps=10000)
    print(r.metrics)

we get plots with different behaviors and quite different values on the Y axis.

When running:

r = check_monotonicity(model, X_train, 'area_worst', 0, 5000, steps=10000)

a plot is created that may not summarize the analysis well.

Would it be possible to suggest start and stop values for running check_monotonicity?

When running check_monotonicity(model, row, attr, start, stop, steps=1000), we need to know the minimum and maximum values of attr.

Would it be possible to suggest start and stop values for running check_monotonicity? That way, the user would not need to worry about defining these values.

Note: keep in mind that we do not have access to the training data. Does a model instance loaded from a pkl file provide this kind of information?

Datasets are not in the repository

The datasets

full_df = pd.read_csv('Datasets/fraud_new.csv')
df = pd.read_csv('Datasets/fraud.csv')

are not in the repository. The existing notebooks cannot be run, and new ones cannot be created either.

calibration_check cannot analyze a DataFrame with a categorical y

When running calibration_check(model, df_full, 'diagnosis'), where df_full['diagnosis'] takes the values 'B' or 'M', the call raises the following error:

ValueError: y_true takes value in {'B', 'M'} and pos_label is not specified: either make y_true take value in {0, 1} or {-1, 1} or pass pos_label explicitly.

I suggest propagating the pos_label parameter of brier_score_loss to calibration_check.
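
A minimal sketch of why a pos_label argument resolves this: the Brier score needs each true label mapped to 0 or 1, and with labels such as 'B' and 'M' the caller has to say which one counts as positive. The `brier_score` function below is a hypothetical standard-library stand-in written for illustration. It is neither scikit-learn's brier_score_loss nor this package's code, and the labels and probabilities are made up.

```python
def brier_score(y_true, y_prob, pos_label=1):
    """Mean squared difference between predicted P(positive) and the 0/1 outcome."""
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")
    # Map arbitrary labels to 1.0 (positive) or 0.0 (everything else).
    targets = [1.0 if label == pos_label else 0.0 for label in y_true]
    return sum((p - t) ** 2 for p, t in zip(y_prob, targets)) / len(targets)


# Hypothetical categorical labels and predicted probabilities of class 'M'.
y_true = ["M", "B", "B", "M"]
y_prob = [0.9, 0.2, 0.4, 0.6]

score = brier_score(y_true, y_prob, pos_label="M")
print(round(score, 4))  # 0.0925
```

The calibration_check signature listed under Functions already shows a pos_label=1 parameter, so passing pos_label='M' there would be the analogous call, assuming the parameter is forwarded as this issue suggests.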
