The cascadia-24 from simonpcouch

This repository contains source code and slides for the talk "Fair machine learning" at Cascadia R Conf in June 2024. The slides for the talk are available here.

To learn more about machine learning with R:

- Machine learning with tidymodels: tmwr.org
- More example notebooks with tidymodels are at tidymodels.org. Two fairness-oriented ones:
  - Are GPT detectors fair?, focused on how different fairness metrics encode different interpretations of fairness across linguistic proficiency.
  - Fair prediction of hospital readmission, focused on training models that are near-fair with respect to a set of fairness metrics across racial groups.
- An overview of our reading group's conclusions and the kinds of fairness-oriented workflows we decided to support.

I mentioned these works in the talk:

- ccao-data/public from folks at the Cook County Assessor's Office: source code for a talk that inspired the property assessment example.
- An Unfair Burden by Jason Grotto in 2017, an article from the Chicago Tribune investigating unfairness in Cook County's property tax system.
- How Lower-Income Americans Get Cheated on Property Taxes from the NYTimes Editorial Board in 2021, which performs a similar analysis across counties in the US and additionally engages with the racial disparities in property taxation.
- Algorithmic Fairness: Choices, Assumptions, and Definitions from Mitchell et al. in 2021, a great reference for the various choices and assumptions that underlie fairness-oriented analysis of ML models.

In this repository:

- index.qmd contains the source code for the slides. The slides use images in the /figures directory.
- /docs is auto-generated from index.qmd. Content in that folder is likely unhelpful for a human reader, and is better viewed at the links above. :)
