
equalityml's Introduction

EqualityAI Logo

Continuous Integration License Contributor Covenant

Equality AI EqualityML

Let's end algorithmic bias together!

Equality AI (EAI) is a public-benefit corporation dedicated to providing developers with evidence-based tools to end algorithmic bias. Our tools are built by developers, for developers. We know that developers want their models to be fair, but we also understand that bias is difficult and intimidating to address.

The EAI EqualityML repository provides tools and guidance on how to incorporate fairness metrics and bias mitigation methods into model fitting, so as to safeguard the people on the receiving end of our models from bias.

If you like what we're doing, give us a ⭐ and join our EAI Manifesto!!


We have extended EqualityML to include other aspects of Responsible AI (see the full framework in Figure 1) and collaboration features to create our Beta MLOps Developer Studio. Become a Beta user by visiting our website!

Figure 1: Full Responsible AI Framework.

Introduction

Incorporating bias mitigation methods and fairness metrics into the traditional end-to-end MLOps pipeline is called fairness-based machine learning (ML) or fair machine learning. However, fair ML comes with its own challenges. We assembled a diverse team of statisticians and ML experts to provide evidence-based guidance on fairness metric selection and use, and validated code to properly run bias mitigation methods.

Click to read our findings:

Fairness Metric

  • A statistical measure of the output of a machine learning model, based on a mathematical definition of fairness.

Fairness Metric Guide: We have combined fairness metrics and bias mitigation into a unified syntax.
Statistical Parity | Conditional Statistical Parity | Negative Predictive Parity | Equal Opportunity | Balance for Positive Class | Predictive Parity | Well Calibration | Calibration | Conditional Use Accuracy | Predictive Equality | Balance for Negative Class | Equalized Odds | Overall Balance
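
For intuition, the sketch below shows how one such metric, the statistical parity ratio, is commonly computed: the positive-prediction rate of the unprivileged group divided by that of the privileged group, with 1.0 indicating parity. The column names and helper function are hypothetical; this is an illustration, not the EqualityML implementation.

# Minimal sketch (not the EqualityML implementation) of the statistical parity
# ratio. Column names and the privileged class value are hypothetical.
import pandas as pd

def statistical_parity_ratio(df: pd.DataFrame,
                             prediction_col: str = "y_pred",
                             protected_col: str = "sex",
                             privileged_class=1) -> float:
    privileged = df[df[protected_col] == privileged_class]
    unprivileged = df[df[protected_col] != privileged_class]
    # Positive-prediction rate per group; a ratio of 1.0 indicates parity
    return unprivileged[prediction_col].mean() / privileged[prediction_col].mean()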

Bias Mitigation

  • Methods or algorithms applied to a machine learning dataset or model to improve the fairness of the model output. Many mitigation methods have been proposed in the literature; they can be broadly classified by where they are applied: to the dataset (pre-processing), during model fitting (in-processing), or to the model predictions (post-processing).

Bias Mitigation Guide:
Resampling | Reweighting | Disparate Impact Remover | Correlation Remover

Figure 2: Bias mitigation can be performed in the pre-processing, in-processing, and post-processing of a model.
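
As a generic illustration of the pre-processing idea behind reweighting (in the spirit of Kamiran and Calders' reweighing), the sketch below assigns each (protected group, label) combination the weight expected frequency / observed frequency, so that the protected attribute and the label become independent under the weighted distribution. This is a simplified sketch with hypothetical column arguments, not the EqualityML implementation.

# Generic reweighing sketch (not the EqualityML implementation): weight each
# (protected group, label) cell by expected / observed frequency.
import pandas as pd

def reweighing_weights(df: pd.DataFrame, protected_col: str, target_col: str) -> pd.Series:
    p_group = df[protected_col].value_counts(normalize=True)
    p_label = df[target_col].value_counts(normalize=True)
    p_joint = df.groupby([protected_col, target_col]).size() / len(df)

    def weight(row):
        a, y = row[protected_col], row[target_col]
        return (p_group[a] * p_label[y]) / p_joint[(a, y)]

    # One weight per row; pass these as sample weights when refitting the model
    return df.apply(weight, axis=1)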

Need a specific metric or method? Just let us know!

Potential Uses

  • Bias mitigation methods are employed to address bias in data and/or machine learning models, and fairness metrics are needed to mathematically represent the fairness or bias levels of an ML model.
Use | Description
As a metric | Quantify a measure of fairness (a.k.a. a fairness metric) targeting a specific source of bias
Evaluate fairness | Fairness metrics can be used to mathematically represent the fairness level of an ML model. They can also be used to monitor a model.
Create parity on fairness | Unlike model performance metrics (e.g., loss, accuracy), fairness metrics affect your final model selection by requiring parity (i.e., equality) on appropriate fairness metrics before model deployment.
Select most fair model | Balance fairness with performance metrics when selecting the final model.
Apply methods to improve the fairness & performance tradeoff | Improve the fairness of a model by applying bias mitigation methods.

Table 1: The potential uses for fairness metrics and bias mitigation methods.

Note: Parity is achieved when a fairness metric (such as the percent of positive predictions) has the same value across all levels of a sensitive attribute. Sensitive attributes are attributes such as race, gender, age, and other patient attributes that are of primary concern when it comes to fairness, and are typically protected by law.

Through these steps we safeguard against bias by:

  1. Creating metrics targeting sources of bias to balance alongside our performance metrics in evaluation, model selection, and monitoring.
  2. Applying bias mitigation methods to improve fairness without compromising performance.

EAI EqualityML Workflow

We have conducted an extensive literature review and theoretical analysis of dozens of fairness metrics and mitigation methods. The theoretical properties of those mitigation methods were analyzed to determine their suitability under various conditions, and the results were used to create our framework for a pre-processing workflow.

Pre-processing Workflow | Tool or Guidance provided
1. Select Fairness Metric | Use our Fairness Metric Selection Questionnaire & Tree to determine appropriate fairness metric(s)
2. Data Preparation
3. Fit Prediction Model
4. Compute Model Results and Evaluate Fairness Metric | Use the EqualityML method fairness_metric to evaluate the fairness of a model
5. Run Bias Mitigation | Use the EqualityML method bias_mitigation to run various bias mitigation methods on your dataset
6. Compute Model Results and Fairness Metric After Mitigation | Refit the model on the data returned by bias_mitigation and re-run fairness_metric
7. Compare Model Results and Fairness Metric Before and After Mitigation | Compare the fairness_metric results obtained before and after bias_mitigation

Table 2: The Equality AI recommended pre-processing workflow with tools and guidance made available per step.

We recommend assessing the fairness of the same ML model after bias mitigation is applied. By comparing the predictions before and after mitigation, you can assess whether, and to what extent, fairness has improved. The trade-off between the accuracy and fairness of the machine learning model can also be examined.

In-processing and Post-processing are still under development. Do you need this now? Let us know!

Guidance on selecting Fairness Metrics

To make fairness metric selection easy, we have provided a few essential questions you must answer to identify the appropriate fairness metric for your use case. Click here for the questionnaire. Complete the questionnaire, then refer to the scoring guide to map your answers to the desired metrics.


Figure 3: Tree representation of questionnaire.

After identifying the important fairness criteria, we recommend trying multiple bias mitigation strategies to optimize the fairness-performance tradeoff (see the comparison sketch after the Quick Tour below).

EqualityML Installation

Python

The EqualityML python package can be installed from PyPI.

pip install equalityml

Manual Installation

Clone the latest version of this repository:

git clone https://github.com/EqualityAI/EqualityML.git

In the root directory of the project run the command:

poetry install

Package Testing

To run the test suite for the EqualityML package, first install the test dependencies, then call pytest:

poetry install --with test
pytest tests

Quick Tour

Check out the example below to see how EqualityML can be used to assess fairness metrics and mitigate unwanted bias in the dataset.

from sklearn.linear_model import LogisticRegression
from equalityml import FAIR
import numpy as np
import pandas as pd

# Sample unfair dataset
random_col = np.random.normal(size=30)
sex_col = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
           0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
weight_col = [80, 75, 70, 65, 60, 85, 70, 75, 70, 70, 70, 80, 70, 70, 70, 80, 75, 70, 65, 70,
              70, 75, 80, 75, 75, 70, 65, 70, 75, 65]
target_col = [1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1,
              0, 1, 0, 1, 1, 0, 0, 1, 1, 0]
training_data = pd.DataFrame({"random": random_col, "sex": sex_col, "weight": weight_col, 
                              "Y": target_col})
    
# Train a machine learning model (for example LogisticRegression)
ml_model = LogisticRegression()
ml_model.fit(training_data.drop(columns="Y"), training_data["Y"])

# Instantiate a FAIR object
fair_obj = FAIR(ml_model=ml_model, 
                training_data=training_data,
                target_variable="Y",
                protected_variable="sex", 
                privileged_class=1)

# Evaluate a fairness metric (for example statistical parity ratio)
metric_name = 'statistical_parity_ratio'
fairness_metric = fair_obj.fairness_metric(metric_name)

# In case the model is unfair in terms of checked fairness metric (value is not close to 1), 
# EqualityML provides a range of methods to try to mitigate bias in Machine Learning models. 
# For example, we can use 'resampling' to perform mitigation on training dataset.

mitigation_method = "resampling"
mitigation_result = fair_obj.bias_mitigation(mitigation_method)

# Now we can re-train the machine learning model on the mitigated data and
# evaluate the fairness metric again
mitigated_data = mitigation_result['training_data']
ml_model.fit(mitigated_data.drop(columns="Y"), mitigated_data["Y"])

fair_obj.update_classifier(ml_model)
new_fairness_metric = fair_obj.fairness_metric(metric_name)

# print the unmitigated fairness metric
print(f"Unmitigated fairness metric = {fairness_metric}")

# print the mitigated fairness metric
print(f"Mitigated fairness metric = {new_fairness_metric}")

# All available fairness metrics and bias mitigation methods can be printed by calling:
fair_obj.print_fairness_metrics()
fair_obj.print_bias_mitigation_methods()
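
Following the recommendation above to try several mitigation strategies, a minimal sketch of comparing methods is shown below. It reuses fair_obj, ml_model, training_data and metric_name from the example, and assumes each listed method returns mitigated training data under the 'training_data' key, as 'resampling' does; call fair_obj.print_bias_mitigation_methods() to see what is actually available.

# Sketch: compare the fairness metric across several bias mitigation methods.
# The method names below are illustrative and assumed to behave like
# 'resampling' in the example above.
from sklearn.base import clone

results = {"no mitigation": fairness_metric}  # unmitigated value from above
for method in ["resampling", "resampling-preferential", "correlation-remover"]:
    mitigated = fair_obj.bias_mitigation(method)["training_data"]

    candidate_model = clone(ml_model)  # fresh, unfitted copy with the same parameters
    candidate_model.fit(mitigated.drop(columns="Y"), mitigated["Y"])

    fair_obj.update_classifier(candidate_model)
    results[method] = fair_obj.fairness_metric(metric_name)

for method, value in results.items():
    print(f"{method}: {value}")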

R

The EqualityML R package can be installed from CRAN:

install.packages("equalityml")

or developer version from GitHub:

devtools::install_github("EqualityAI/equalityml/equalityml-r")

For more details regarding the R package, please check here.

Responsible AI Takes a Community

The connections and trade-offs between fairness, explainability, and privacy require a holistic approach to Responsible AI development in the machine learning community. We are starting with the principle of fairness and working towards a solution that incorporates multiple aspects of Responsible AI for data scientists and healthcare professionals. We have much more in the works, and we want to know—what do you need? Do you have a Responsible AI challenge you need to solve? Drop us a line and let’s see how we can help!

EqualityAI Logo

Contributing to the project

Equality AI uses both GitHub and Slack to manage our open source community. To participate:

  1. Join the Slack community (https://equalityai.com/slack)
    • Introduce yourself in the #Introductions channel. We're all friendly people!
  2. Check out the CONTRIBUTING file to learn how to contribute to our project, report bugs, or make feature requests.
  3. Try out EqualityML
    • Hit the top right "star" button on GitHub to show your love!
    • Follow the recipe above to use the code.
  4. Provide feedback on your experience using the GitHub discussions or the Slack #support channel
    • For any questions or problems, send a message on Slack, or send an email to [email protected].

equalityml's People

Contributors

bjb2088, jamesng-dev, joaogranja, jzdavis66, nyujwc331, proinsights


equalityml's Issues

Support for phenotype assessment tools?

Hi! We love your work @onefact and are happy to help if we can.

Work I helped develop during my postdoc is here: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10148336/

We have assessed several large language models for compliance with the Affordable Care Act non-discrimination clause (https://www.hhs.gov/about/leadership/melanie-fontes-rainer.html).

Specifically, the demographic parity metric is one I haven't found in your repository, and such an assessment is necessary prior to training machine learning/artificial intelligence algorithms using labels derived from clinical phenotypes. For example, the presence or absence of a disease could be computed as a SQL query executed against a clinical data repository such as the one we work with from the NIH, researchallofus.org (@all-of-us).

Are such algorithmic fairness criteria for clinical phenotype assessment out of scope for @EqualityAI?

Please let us know as we will be releasing open source tools around this over the summer and don't want to duplicate your excellent work here!

Dependency issue on Tkinter while using on databricks jupyter notebooks

Hello,
I'm trying to install equalityml on Databricks, which is an environment that doesn't allow us to install system libraries using something like apt install python-tk.

Here's a screenshot of the error I get:
[Screenshot of the installation error]

I started investigating the issue and found this ticket: Trusted-AI/AIF360#415

From the above issue, I realized that tkinter is no longer needed as a dependency. However, AIF360 hasn't had a new release since September, and so far I've gotten no response when I asked for a newer release.

I've tried installing AIF360 from source, but the latest commit has other, newer dependencies that created issues.

That's why I've created a fork of AIF360 to have a quick fix for this issue until they make a new release including the fix.
https://github.com/lanterno/aif360

So I'm opening this issue to see if anyone else is having a similar problem, and also to propose a solution.

I've tried modifying the EqualityML dependencies to use the fork I mentioned above, but it still didn't work because of complex, interconnected dependencies that caused conflicts.

The solution that I finally managed to get to work came after switching the dependency management to poetry instead of pip.

Poetry has better dependency resolution, and I was lucky that it solved the dependency issue without problems.

I'm not sure if the plan is for EqualityML to switch to poetry, but it does seem like a good alternative, since it gives more power than a classic requirements.txt and setup.py.

I will have a PR ready soon, and we can discuss it.

Correlation remover doesn't show up in plots when using compare_mitigation_methods()

Suggest we add it to the list here.

def map_bias_mitigation(self):
    return {'treatment_equality_ratio': [''],
            'treatment_equality_difference': [''],
            'balance_positive_class': [''],
            'balance_negative_class': [''],
            'equal_opportunity_ratio': [''],
            'accuracy_equality_ratio': [''],
            'predictive_parity_ratio': [''],
            'predictive_equality_ratio': [''],
            'statistical_parity_ratio': ['disparate-impact-remover', 'resampling',
                                         'resampling-preferential', 'reweighing']}

self.threshold may mean self._threshold?

I didn't notice self.threshold being declared. Possibly I missed it. I did see self._threshold. Is it declared someplace I missed? If not, possible typo.

score = binary_threshold_score(self.orig_ml_model,
                               testing_data[self.features],
                               testing_data[self.target_variable],
                               scoring=scoring,
                               threshold=self.threshold,
                               utility_costs=utility_costs)
fairness_metric = self.fairness_metric(self._metric_name)
comparison_df.loc['reference'] = [score, fairness_metric]
# Iterate over mitigation methods list and re-evaluate score and fairness metric
for mitigation_method in mitigation_methods:
    ml_model = self.model_mitigation(mitigation_method=mitigation_method, **kwargs)
    if self.mitigated_testing_data is not None:
        testing_data = self.mitigated_testing_data
    else:
        testing_data = self.testing_data if self.testing_data is not None else self.training_data
    score = binary_threshold_score(ml_model,
                                   testing_data[self.features],
                                   testing_data[self.target_variable],
                                   scoring=scoring,
                                   threshold=self.threshold,
                                   utility_costs=utility_costs)

Independent thresholds for each method when using compare_mitigation_methods

Currently the compare_mitigation_methods() function seems to rely on a pre-defined threshold.

for mitigation_method in mitigation_methods:
    ml_model = self.model_mitigation(mitigation_method=mitigation_method, **kwargs)
    if self.mitigated_testing_data is not None:
        testing_data = self.mitigated_testing_data
    else:
        testing_data = self.testing_data if self.testing_data is not None else self.training_data
    score = binary_threshold_score(ml_model,
                                   testing_data[self.features],
                                   testing_data[self.target_variable],
                                   scoring=scoring,
                                   threshold=self.threshold,
                                   utility_costs=utility_costs)
    fairness_metric = self.fairness_metric(self._metric_name)
    comparison_df.loc[mitigation_method] = [score, fairness_metric]

It is statistically more correct to select a new threshold for each model. This will require taking the decision_maker as input from the user when calling compare_mitigation_methods(). I don't know if this is urgent, as the current approach seems to give a good approximate result.

Also, recall that the threshold function uses a random seed.

Correlation remover uses fit_transform() instead of fit() and transform()

The training and testing sets should use the same correlation remover object. It should be fit() on only the training data, and should transform() both the training data and the testing data.

These lines of code show an example:

cr = CorrelationRemover(sensitive_feature_ids=['sex'], alpha=1)
cr.fit(train_data.drop(['two_year_recid'], axis=1))
train2 = cr.transform(train_data.drop(['two_year_recid'], axis=1))
test2 = cr.transform(testing_data.drop(['two_year_recid'], axis=1))

These are the relevant GitHub references for review.

def _cr_removing_data(self,
                      data,
                      alpha=1.0):
    """
    Filters out sensitive correlations in a dataset using 'CorrelationRemover' function from fairlearn package.
    """
    # Getting correlation coefficient for mitigation_method 'correlation_remover'. The input alpha parameter is
    # used to control the level of filtering between the sensitive and non-sensitive features
    # remove the outcome variable and sensitive variable
    data_rm_columns = data.columns.drop([self.protected_variable, self.target_variable])
    cr = CorrelationRemover(sensitive_feature_ids=[self.protected_variable], alpha=alpha)
    data_std = cr.fit_transform(data.drop(columns=[self.target_variable]))
    train_data_cr = pd.DataFrame(data_std, columns=data_rm_columns, index=data.index)
    # Concatenate data after correlation remover
    mitigated_data = pd.concat(
        [pd.DataFrame(data[self.target_variable]),
         pd.DataFrame(data[self.protected_variable]),
         train_data_cr], axis=1)
    # Keep the same columns order
    mitigated_data = mitigated_data[data.columns]
    return mitigated_data

elif mitigation_method == "correlation-remover":
    mitigated_training_data = self._cr_removing_data(self.training_data, alpha)
    mitigated_dataset['training_data'] = mitigated_training_data
    self.mitigated_training_data = mitigated_training_data
    if self.testing_data is not None:
        mitigated_testing_data = self._cr_removing_data(self.testing_data, alpha)
        mitigated_dataset['testing_data'] = mitigated_testing_data
        self.mitigated_testing_data = mitigated_testing_data

Comment:

Disparate impact remover is coded to use only fit_transform(). AIF360 did not provide a transform() function (https://github.com/Trusted-AI/AIF360/blob/master/aif360/algorithms/preprocessing/disparate_impact_remover.py). They say "In order to transform test data in the same manner as training data, the distributions of attributes conditioned on the protected attribute must be the same." We could technically make the same assumption and always use fit_transform() for correlation remover, because it seems to perform well, but I think it is bad practice: in deployment we sometimes make only one prediction at a time, we can't estimate correlation when only one observation is present, and fit_transform() will error out. This is a weakness of the disparate impact remover.

Add random_seed for resampling

Resampling is random so we need to make it reproducible.

I think dalex's resample uses numpy; possibly we can just put np.random.seed(random_seed) in the code someplace.

Let me know if we need to talk about this one.

def _resampling_data(self,
                     data,
                     mitigation_method):
    """
    Resample the input data using 'resample' function from dalex package.
    """
    # Uniform resampling
    idx_resample = 0
    if (mitigation_method == "resampling-uniform") or (mitigation_method == "resampling"):
        idx_resample = resample(data[self.protected_variable],
                                data[self.target_variable],
                                type='uniform',
                                verbose=False)
    # Preferential resampling
    elif mitigation_method == "resampling-preferential":
        _pred_prob = self._predict_binary_prob(self.orig_ml_model, data)
        idx_resample = resample(data[self.protected_variable],
                                data[self.target_variable],
                                type='preferential', verbose=False,
                                probs=_pred_prob)
    mitigated_data = data.iloc[idx_resample, :]
    return mitigated_data
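
A minimal sketch of the proposed change is below. It assumes (to be verified) that dalex's resample draws its randomness from numpy's global RNG; the random_seed parameter is hypothetical.

# Hypothetical sketch: seed numpy's global RNG before resampling, assuming
# dalex's resample relies on it (an assumption to verify).
import numpy as np

def _resampling_data(self, data, mitigation_method, random_seed=None):
    if random_seed is not None:
        np.random.seed(random_seed)
    # ... existing uniform / preferential resampling logic as above ...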

Inform user of 1/metric calculation

Suggestion for documentation.
We should inform the user here that we return the fairness parity metric or 1/metric as they are equivalent.

Returns the fairness metric score for the input fairness metric name.

Suggested text: "Returns the fairness metric score for the input fairness metric name. Note that in cases where the fairness metric is > 1 we return 1/fairness metric score to allow for easy comparison. "
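
For illustration, the suggested convention amounts to something like the following hypothetical snippet (not the package's code): ratios greater than 1 are inverted so every reported score lies in (0, 1] and "closer to 1 is fairer" reads the same way for all metrics.

# Hypothetical illustration of the documented convention.
def normalized_score(metric_value: float) -> float:
    return 1 / metric_value if metric_value > 1 else metric_value

print(normalized_score(1.25))  # 0.8
print(normalized_score(0.80))  # 0.8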
