Measuring and Mitigating Bias in AI


DLAI8

Slides: hal.cse.msu.edu/talks
Vishnu Boddeti

Progress In Artificial Intelligence

Speech Processing
Image Analysis
Natural Language Processing
Physical Sciences



Key Drivers
Data, Compute, Algorithms

State of Affairs

(reports from the real world)
"Tay, Microsoft's AI chatbot, gets a crash course in racism from Twitter"




"Machine Bias"

"FaceApp's creator apologizes for the app's skin-lightening 'hot' filter"

"Facial recognition is accurate, if you're a white guy"

  • Buolamwini and Gebru, "Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification," FAT* 2018
"Black Artists Say A.I. Shows Bias"

"How AI reduces the world to stereotypes"
"LLMs propagate race-based medicine"
Real-world machine learning systems are effective, but they

are biased,

violate users' privacy, and

are not trustworthy.

Research Questions

    • How do we measure bias in AI models?

    • How do we mitigate bias in AI models?

Measuring Bias in AI

Measuring Bias in Datasets

How about Data?


  • DataComp: In search of the next generation of multimodal datasets, NeurIPS D&B 2023

Measuring Hate Content in Text

  • Birhane, Prabhu, Han and Boddeti, "On Hate Scaling Laws For Data-Swamps," arXiv:2306.13141
  • Birhane, Prabhu, Han, Boddeti, Luccioni, "Into the LAION's Den: Investigating Hate in Multimodal Datasets," NeurIPS D&B Track 2023

Troubling Trends in Dataset Scaling

Scale exacerbates hate content.
  • Birhane, Prabhu, Han and Boddeti, "On Hate Scaling Laws For Data-Swamps," arXiv:2306.13141
  • Birhane, Prabhu, Han, Boddeti, Luccioni, "Into the LAION's Den: Investigating Hate in Multimodal Datasets," NeurIPS D&B Track 2023

Measuring Bias in Models

Narrative of AI Training: "Moar data! Much wow!"

  • Birhane, Prabhu, Han and Boddeti, "On Hate Scaling Laws For Data-Swamps," arXiv:2306.13141
  • Birhane*, Dehdashtian*, Prabhu and Boddeti, "The Dark Side of Dataset Scaling: Evaluating Racial Classification in Multimodal Models," FAccT 2024

Evaluation on 14 CLIP Models

  • Birhane, Prabhu, Han and Boddeti, "On Hate Scaling Laws For Data-Swamps," arXiv:2306.13141
  • Birhane*, Dehdashtian*, Prabhu and Boddeti, "The Dark Side of Dataset Scaling: Evaluating Racial Classification in Multimodal Models," FAccT 2024

Chicago Face Dataset

Troubling Trends in Dataset Scaling

Scale exacerbates stereotypes.
  • Birhane, Prabhu, Han and Boddeti, "On Hate Scaling Laws For Data-Swamps," arXiv:2306.13141
  • Birhane*, Dehdashtian*, Prabhu and Boddeti, "The Dark Side of Dataset Scaling: Evaluating Racial Classification in Multimodal Models," FAccT 2024

Fairness: The Multi-Headed Hydra

Fairness Definitions: Statistical Parity

Fairness Definitions: Equalized Odds

Fairness Definitions: Equality of Opportunity
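
For reference, the standard formal statements of these three criteria, for prediction $\hat{Y}$, ground truth $Y$, and sensitive attribute $S$ (using the independence notation from later slides):

$$\begin{aligned}
\textbf{Statistical Parity:}\;\; & \hat{Y} \perp \!\!\! \perp S, \;\text{i.e.,}\; P(\hat{Y}=1 \mid S=s) = P(\hat{Y}=1) \;\text{for all } s\\
\textbf{Equalized Odds:}\;\; & \hat{Y} \perp \!\!\! \perp S \mid Y, \;\text{i.e.,}\; P(\hat{Y}=1 \mid Y=y, S=s) = P(\hat{Y}=1 \mid Y=y) \;\text{for all } y, s\\
\textbf{Equality of Opportunity:}\;\; & P(\hat{Y}=1 \mid Y=1, S=s) = P(\hat{Y}=1 \mid Y=1) \;\text{for all } s
\end{aligned}$$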

How Fair is Your ML Model?

  • Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
  • Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024

How to Estimate These Trade-Offs?

U-FaTE (Utility-Fairness Trade-Off Estimator)
  • Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024

Face Image Dataset

  • Liu, Luo, Wang and Tang "Deep Learning Face Attributes in the Wild," ICCV 2015

CelebA Faces

  • $Y$: high cheekbones (binary) and $S$: age and sex (continuous + binary)
Evaluation of over 1000 supervised image feature extractors.
  • Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
  • Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024

CelebA Faces

  • $Y$: high cheekbones (binary) and $S$: age and sex (continuous + binary)
Evaluation of over 100 zero-shot multimodal (CLIP) models.
  • Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
  • Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024

Face Image Dataset

  • Karkkainen and Joo "FairFace: Face Attribute Dataset for Balanced Race, Gender, and Age for Bias Measurement and Mitigation," WACV 2021

FairFace Dataset

  • $Y$: sex (binary) and $S$: race (7 classes)
Evaluation of over 1000 supervised image feature extractors.
  • Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
  • Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024

FairFace Dataset

  • $Y$: sex (binary) and $S$: race (7 classes)
Evaluation of over 100 zero-shot multimodal (CLIP) models.
  • Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
  • Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024

Mitigating Bias in AI Systems

From Fair Learning to Fair Representation Learning

$Z \perp \!\!\! \perp S \Rightarrow \hat{Y} \perp \!\!\! \perp S$
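
In words: if the representation $Z$ is independent of the sensitive attribute $S$, then any prediction $\hat{Y} = f(Z)$ is too, since for any events $A$ and $B$:

$$P(\hat{Y} \in A,\, S \in B) = P(Z \in f^{-1}(A),\, S \in B) = P(Z \in f^{-1}(A))\,P(S \in B) = P(\hat{Y} \in A)\,P(S \in B).$$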

Learning Fair Representations

  • Target Attribute: Smile & Demographic Attribute: Gender
  • Problem Definition:
    • Learn a representation $\mathbf{z} \in \mathbb{R}^d$ from data $\mathbf{x}$
    • Retain information necessary to predict target attribute $\mathbf{t}\in\mathcal{T}$
    • Remove information related to a given demographic attribute $\mathbf{s}\in\mathcal{S}$

A Fork in the Road



  • Design a metric to measure sensitive demographic attribute information
    • non-parametric statistical dependence measures (see the HSIC sketch below)


  • Learn a metric to measure semantic attribute information
    • probably feasible; many prior attempts
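
One widely used non-parametric dependence measure is the Hilbert-Schmidt Independence Criterion (HSIC). A minimal numpy sketch of the standard biased estimator, assuming RBF kernels and arrays `Z` (representations) and `S` (sensitive attributes); the function names are mine, not the cited papers':

```python
import numpy as np

def rbf_kernel(X, sigma=1.0):
    # Pairwise squared Euclidean distances, mapped through a Gaussian kernel.
    sq = np.sum(X**2, 1)[:, None] + np.sum(X**2, 1)[None, :] - 2.0 * X @ X.T
    return np.exp(-sq / (2.0 * sigma**2))

def hsic(Z, S, sigma=1.0):
    """Biased empirical HSIC: zero (in the infinite-sample limit, with a
    characteristic kernel) if and only if Z and S are independent."""
    n = Z.shape[0]
    K, L = rbf_kernel(Z, sigma), rbf_kernel(S, sigma)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(K @ H @ L @ H) / (n - 1) ** 2)
```

Because this estimate is differentiable in $Z$, it can be used directly as a training penalty.
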
Adversarial Representation Learning

Game Theoretic Formulation

  • Three player game between:
    • Encoder extracts features $\mathbf{z}$
    • Target Predictor for desired task from features $\mathbf{z}$
    • Adversary extracts sensitive information from features $\mathbf{z}$
    $$\min_{\mathbf{\Theta}_E,\mathbf{\Theta}_T} \underbrace{\color{cyan}{J_t(\mathbf{\Theta}_E,\mathbf{\Theta}_T)}}_{\color{cyan}{\text{error of target}}} \quad \text{s.t.} \quad \min_{\mathbf{\Theta}_A} \underbrace{\color{orange}{J_s(\mathbf{\Theta}_E,\mathbf{\Theta}_A)}}_{\color{orange}{\text{error of adversary}}} \geq \alpha$$
  • Adversary: learned measure of semantic attribute information

How do we learn model parameters?

  • Simultaneous/Alternating Stochastic Gradient Descent
    • Update target while keeping encoder and adversary frozen.
    • Update adversary while keeping encoder and target frozen.
    • Update encoder while keeping target and adversary frozen.
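
A minimal PyTorch sketch of this alternating scheme, using a penalty-form relaxation of the constraint above (the architectures, loss weight `lam`, and learning rates are illustrative placeholders, not the cited papers' settings):

```python
import torch
import torch.nn as nn

# Hypothetical modules: encoder E, target predictor T, adversary A.
E = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 16))
T = nn.Linear(16, 2)  # predicts the target attribute from z
A = nn.Linear(16, 2)  # tries to recover the sensitive attribute from z

opt_E = torch.optim.SGD(E.parameters(), lr=1e-2)
opt_T = torch.optim.SGD(T.parameters(), lr=1e-2)
opt_A = torch.optim.SGD(A.parameters(), lr=1e-2)
ce = nn.CrossEntropyLoss()
lam = 1.0  # trade-off weight standing in for the constraint J_s >= alpha

def alternating_step(x, y, s):
    # 1) Update target predictor; encoder and adversary frozen.
    z = E(x).detach()
    opt_T.zero_grad(); ce(T(z), y).backward(); opt_T.step()
    # 2) Update adversary; encoder and target frozen.
    opt_A.zero_grad(); ce(A(z), s).backward(); opt_A.step()
    # 3) Update encoder; target and adversary frozen (only opt_E steps).
    z = E(x)
    loss = ce(T(z), y) - lam * ce(A(z), s)  # help target, hurt adversary
    opt_E.zero_grad(); loss.backward(); opt_E.step()
```

The linear-case analysis on the next slide contrasts this game's global solution with what such alternating updates actually find.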

Three Player Game: Linear Case

  • Global solution is $(w_1, w_2, w_3)=(0, 0, 0)$
What we get in practice differs from this global solution.
  • P. Roy and V.N. Boddeti, "Mitigating Information Leakage in Image Representations: A Maximum Entropy Approach", CVPR 2019

Many Solutions for Bias Mitigation

    • Standard Adversarial Representation Learning
    • Linear Adversarial Measure: linear dependency between $Z$ and $S$ [ICCV 2019, CVPRW 2020] (see the sketch after this list)
    • Non-Linear Adversarial Measure: Beyond linear dependency between $Z$ and $S$, but not all types [ECML 2021]
    • Universal Dependence Measure: All types of dependency between $Z$ and $S$ [TMLR 2022]
    • End-to-End Universal Dependence Measure: All types of dependency between $Z$ and $S$ [CVPR 2024]
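
To make the first of these concrete, a linear dependence measure can be as simple as the empirical cross-covariance between $Z$ and $S$; a hypothetical numpy sketch, not the cited papers' exact estimator:

```python
import numpy as np

def linear_dependence(Z, S):
    """Frobenius norm of the empirical cross-covariance between Z and S.
    Zero iff Z and S are linearly uncorrelated; blind to non-linear
    dependence, which is what motivates the universal measures above."""
    Zc = Z - Z.mean(axis=0)
    Sc = S - S.mean(axis=0)
    return float(np.linalg.norm(Zc.T @ Sc / Z.shape[0]))
```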

Face Image Dataset

  • Liu, Luo, Wang and Tang "Deep Learning Face Attributes in the Wild," ICCV 2015

CelebA Faces

  • $Y$: high cheekbones (binary) and $S$: age and sex (continuous + binary)
  • Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
  • Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024

Folktables

  • $Y$: employment status (binary) and $S$: age (continuous)
  • Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
  • Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024

How about zero-shot models?

Bias in CLIP's Zero-Shot Prediction

  • Dehdashtian*, Wang* and Boddeti, "FairerCLIP: Debiasing CLIP's Zero-Shot Predictions using Functions in RKHSs," (ICLR 2024)
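
As background for these results, CLIP zero-shot prediction simply picks the class whose text-prompt embedding is most similar to the image embedding, so biases in either encoder propagate directly into the prediction. A minimal sketch, assuming precomputed, hypothetical embeddings:

```python
import numpy as np

def zero_shot_predict(image_emb, text_embs):
    """CLIP-style zero-shot rule: return the index of the class prompt
    whose embedding is most cosine-similar to the image embedding.
    image_emb: (d,), text_embs: (num_classes, d) -- hypothetical inputs."""
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    return int(np.argmax(txt @ img))
```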

Debiasing CLIP Models

  • Dehdashtian*, Wang* and Boddeti, "FairerCLIP: Debiasing CLIP's Zero-Shot Predictions using Functions in RKHSs," (ICLR 2024)

FairerCLIP: CelebA Dataset


  • $Y$: high cheekbones (binary)
  • $S$: sex (binary)

FairerCLIP: FairFace Dataset

FairerCLIP: Chicago Face Dataset

  • $Y$: attractiveness (binary)
  • $S$: gender (binary)

Concluding Remarks

Summary

    • AI systems are progressing at a rapid pace.
      • But they exhibit biases.

    • We need to develop methods for automated auditing of artificial intelligence systems for bias.

    • The next generation of artificial intelligence systems has to be designed with bias mitigation built in.

    • An appreciable gap remains between current solutions and ideal unbiased AI systems.

Thank You