audit-AI

Open Sourced Bias Testing for Generalized Machine Learning Applications

audit-AI is a Python library built on top of pandas and sklearn that implements fairness-aware machine learning algorithms. audit-AI was developed by the Data Science team at pymetrics.

Bias Testing for Generalized Machine Learning Applications

audit-AI is a tool to measure and mitigate the effects of discriminatory patterns in training data and in the predictions made by machine learning algorithms trained for socially sensitive decision processes.

The overall goal of this research is to come up with a reasonable way to think about how to make machine learning algorithms more fair. Identifying potential bias in training datasets, and consequently in the machine learning algorithms trained on them, is not sufficient to solve the problem of discrimination. Still, in a world where more and more decisions are being automated by Artificial Intelligence, our ability to understand and identify the degree to which an algorithm is fair or biased is a step in the right direction.

Features

Here are a few of the bias testing and algorithm auditing techniques that this library implements.

Classification tasks

  • 4/5th, fisher, z-test, bayes factor, chi squared
  • sim_beta_ratio, classifier_posterior_probabilities
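To make two of the listed checks concrete, here is a minimal standard-library sketch of the 4/5ths rule (practical significance) and a pooled two-proportion z-test (statistical significance). The function names and the data are hypothetical and written for illustration only; this is not audit-AI's own implementation, which may differ in detail.

```python
import math


def pass_rates(labels, passed):
    """Return {group: fraction of members whose outcome is True}."""
    totals, hits = {}, {}
    for group, ok in zip(labels, passed):
        totals[group] = totals.get(group, 0) + 1
        hits[group] = hits.get(group, 0) + (1 if ok else 0)
    return {g: hits[g] / totals[g] for g in totals}


def four_fifths_ratio(rates):
    """Lowest group pass rate divided by the highest; 0.8 or more passes the rule."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest > 0 else float("nan")


def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Two-sided pooled z-test for a difference between two pass rates."""
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (hits_a / n_a - hits_b / n_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p_value


# Hypothetical outcomes: 100 people per group, 50 of F and 45 of M pass.
labels = ["F"] * 100 + ["M"] * 100
passed = [True] * 50 + [False] * 50 + [True] * 45 + [False] * 55

rates = pass_rates(labels, passed)
ratio = four_fifths_ratio(rates)
z, p = two_proportion_z_test(50, 100, 45, 100)
print(rates, round(ratio, 2), round(z, 3), round(p, 3))
```

In this made-up sample the ratio is above 0.8 and the z-test p-value is well above 0.05, so the two groups would not be flagged by either check.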

Regression tasks

  • anova
  • 4/5th, fisher, z-test, bayes factor, chi squared
  • group proportions at different thresholds
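For continuous scores, "group proportions at different thresholds" can be pictured as a sweep: at each cutoff, compute the share of each group scoring at or above it, then compare the groups. The sketch below is illustrative only. The helper name, the `>=` pass convention, and the data are assumptions, not part of the audit-AI API.

```python
def pass_rate_by_threshold(labels, scores, thresholds):
    """For each threshold, return (threshold, {group: pass rate}, min/max ratio)."""
    groups = sorted(set(labels))
    rows = []
    for t in thresholds:
        rates = {}
        for g in groups:
            member_scores = [s for lab, s in zip(labels, scores) if lab == g]
            # Assumed convention: a score at or above the threshold counts as a pass.
            rates[g] = sum(s >= t for s in member_scores) / len(member_scores)
        highest = max(rates.values())
        ratio = min(rates.values()) / highest if highest > 0 else float("nan")
        rows.append((t, rates, ratio))
    return rows


# Hypothetical scores for two groups of four people each.
labels = ["A"] * 4 + ["B"] * 4
scores = [0.2, 0.4, 0.6, 0.8, 0.1, 0.3, 0.45, 0.9]

table = pass_rate_by_threshold(labels, scores, [0.25, 0.5, 0.75])
for t, rates, ratio in table:
    print(t, rates, round(ratio, 2))
```

With these numbers the groups are at parity at 0.25 and 0.75 but not at 0.5, which shows why a single cutoff can hide or exaggerate a disparity.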

Installation

The source code is currently hosted on GitHub: https://github.com/pymetrics/audit-ai

You can install the latest released version with pip.

# pip
pip install audit-AI

If you install with pip, you'll need to install scikit-learn, numpy, and pandas with either pip or conda. Version requirements:

  • numpy
  • scipy
  • pandas

For visualization:

  • matplotlib
  • seaborn

How to use this package:

from auditai.misc import bias_test_check

# df is a pandas DataFrame, features a list of column names, clf a fitted classifier
X = df.loc[:, features]
y_pred = clf.predict_proba(X)

# test for bias
bias_test_check(labels=df['gender'], results=y_pred, category='Gender')

>>> *Gender passes 4/5 test, Fisher p-value, Chi-Squared p-value, z-test p-value and Bayes Factor at 50.00*
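The output above mentions a Fisher p-value. As a rough illustration of what such a value measures, the following standard-library sketch computes a two-sided Fisher exact test on a 2x2 table of pass and fail counts for two groups. The function name and the counts are hypothetical, and this is not the code audit-AI runs internally.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]].

    Rows are the two groups; columns are pass and fail counts.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x passes in the first group,
        # holding the row and column totals fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is at most as likely as the observed one
    # (small tolerance guards against floating-point ties).
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= observed * (1 + 1e-9)
    )


# Hypothetical counts: group 1 has 3 passes and 1 fail, group 2 has 1 pass and 3 fails.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))
```

A large p-value here means the difference in pass rates is compatible with chance given such small groups.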

To get a plot of the different tests at different thresholds:

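from auditai.viz import plot_threshold_tests

# df is a pandas DataFrame, features a list of column names, clf a fitted classifier
X = df.loc[:, features]
y_pred = clf.predict_proba(X)

# plot the bias tests across thresholds
plot_threshold_tests(labels=df['gender'], results=y_pred, category='Gender')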

Sample audit-AI Plot

Example Datasets

Contributors

markaward, vishalmurali
