Measuring and Mitigating Bias in AI


Ethical AI and High-Performance Computing Seminar

Slides: hal.cse.msu.edu/talks
Vishnu Boddeti

Progress in Artificial Intelligence

Speech Processing
Image Analysis
Natural Language Processing
Physical Sciences



Key Drivers
Data, Compute, Algorithms

State of Affairs

(a report from the real world)
"Facial recognition is accurate, if you're a white guy"

  • Buolamwini and Gebru, "Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification," FAT* 2018
"Facial recognition bias frustrates Black asylum applicants to US, advocates say"


"Detroit changes rules for police use of facial recognition after wrongful arrest of Black man"


"Black Artists Say A.I. Shows Bias"

"How AI reduces the world to stereotypes"
"Google chief admits ‘biased’ AI tool’s photo diversity offended users"


"ChatGPT leans liberal, research shows"


"LLMs propagate race-based medicine"
Real-world machine learning systems are effective, but they


are biased,


violate users' privacy, and


are not trustworthy.

Research Questions



    • How do we measure bias in AI models?


    • How do we mitigate bias in AI models?

Measuring Bias in AI

Measuring Bias in Datasets

How about Data?


  • DataComp: In search of the next generation of multimodal datasets, NeurIPS D&B 2023

Measuring Hate Content in Text

  • Birhane, Prabhu, Han and Boddeti, "On Hate Scaling Laws For Data-Swamps," arXiv:2306.13141
  • Birhane, Prabhu, Han, Boddeti, Luccioni, "Into the LAION's Den: Investigating Hate in Multimodal Datasets," NeurIPS D&B Track 2023
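
As a rough illustration of this kind of audit, here is a minimal sketch of scoring dataset captions with an off-the-shelf hate-speech classifier. The model name, labels, and threshold-free rate are assumptions for illustration, not necessarily the pipeline used in these papers.

```python
# Minimal sketch: flag hateful alt-text/captions in a web-scraped dataset.
# The specific Hugging Face model and its label names are assumptions.
from transformers import pipeline

captions = [
    "a group of friends at the beach",
    "an example caption pulled from a web-scraped dataset",
]

classifier = pipeline(
    "text-classification",
    model="facebook/roberta-hate-speech-dynabench-r4-target",  # assumed checkpoint
)

results = classifier(captions)
# Fraction of captions the classifier labels as hateful.
hate_rate = sum(r["label"].lower() == "hate" for r in results) / len(results)
print(f"fraction of captions flagged as hateful: {hate_rate:.3f}")
```

Running such a scorer on datasets of increasing scale is what allows the hate-content rate to be compared across dataset sizes.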


Troubling Trends in Dataset Scaling

Scale exacerbates hate content.
  • Birhane, Prabhu, Han and Boddeti, "On Hate Scaling Laws For Data-Swamps," arXiv:2306.13141
  • Birhane, Prabhu, Han, Boddeti, Luccioni, "Into the LAION's Den: Investigating Hate in Multimodal Datasets," NeurIPS D&B Track 2023

Measuring Bias in Discriminative Models

Narrative of AI Training: "Moar data! Much wow!"

  • Birhane, Prabhu, Han and Boddeti, "On Hate Scaling Laws For Data-Swamps," arXiv:2306.13141
  • Birhane*, Dehdashtian*, Prabhu and Boddeti, "The Dark Side of Dataset Scaling: Evaluating Racial Classification in Multimodal Models," FAccT 2024

Evaluation on 14 CLIP Models

  • Birhane, Prabhu, Han and Boddeti, "On Hate Scaling Laws For Data-Swamps," arXiv:2306.13141
  • Birhane*, Dehdashtian*, Prabhu and Boddeti, "The Dark Side of Dataset Scaling: Evaluating Racial Classification in Multimodal Models," FAccT 2024

Chicago Face Dataset

Troubling Trends in Dataset Scaling

Scale exacerbates stereotypes.
  • Birhane, Prabhu, Han and Boddeti, "On Hate Scaling Laws For Data-Swamps," arXiv:2306.13141
  • Birhane*, Dehdashtian*, Prabhu and Boddeti, "The Dark Side of Dataset Scaling: Evaluating Racial Classification in Multimodal Models," FAccT 2024

Fairness: The Multi-Headed Hydra

How Fair is Your ML Model?

  • Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
  • Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024

How to Estimate these Trade-Offs?

U-FaTE (Utility-Fairness Trade-Off Estimator)
  • Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024
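
As a rough illustration of what one point on such a utility-fairness plane could look like, here is a minimal sketch: a linear probe on frozen features gives utility (accuracy), and a demographic parity gap gives unfairness. The probe, metric choices, and synthetic data are illustrative assumptions, not the U-FaTE procedure itself.

```python
# Minimal sketch: one (utility, unfairness) point per pretrained feature
# extractor; sweeping many extractors traces an empirical trade-off curve.
import numpy as np
from sklearn.linear_model import LogisticRegression

def utility_fairness_point(z_tr, y_tr, s_tr, z_te, y_te, s_te):
    """Utility = probe accuracy; unfairness = demographic parity gap (binary S)."""
    probe = LogisticRegression(max_iter=1000).fit(z_tr, y_tr)
    y_hat = probe.predict(z_te)
    utility = (y_hat == y_te).mean()
    # Difference in positive prediction rates between the two sensitive groups.
    dp_gap = abs(y_hat[s_te == 0].mean() - y_hat[s_te == 1].mean())
    return utility, dp_gap

# Synthetic demo: target correlated with S, so utility and fairness conflict.
rng = np.random.default_rng(0)
z = rng.normal(size=(400, 8))
s = rng.integers(0, 2, 400)
y = ((z[:, 0] + 0.8 * s + 0.3 * rng.normal(size=400)) > 0.5).astype(int)
print(utility_fairness_point(z[:200], y[:200], s[:200], z[200:], y[200:], s[200:]))
```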

Face Image Dataset

  • Liu, Luo, Wang and Tang "Deep Learning Face Attributes in the Wild," ICCV 2015

CelebA Faces

  • $Y$: high cheekbones (binary) and $S$: age and sex (continuous + binary)
Evaluation of over 1000 supervised image feature extractors.
  • Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
  • Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024

CelebA Faces

  • $Y$: high cheekbones (binary) and $S$: age and sex (continuous + binary)
Evaluation of over 100 zero-shot multimodal (CLIP) models.
  • Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
  • Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024

Measuring Bias in Generative Models

High-Quality T2I Models, Same Old Stereotypes

OASIS: Toolbox for Measuring and Understanding Stereotypes

Lower, Yet Significant Stereotypes in Newer T2I Models

Nationality Worsens Existing Gender Stereotypes about Professions

Stereotypes worsen with compositional concepts.

T2I Models Have Stereotypical Predispositions about Nationalities

Indian

Mitigating Bias in AI Systems

Mitigating Bias in Discriminative Models

From Fair Learning to Fair Representation Learning

$Z \perp \!\!\! \perp S \Rightarrow \hat{Y} \perp \!\!\! \perp S$
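
A one-line justification of this implication (a standard argument): the prediction $\hat{Y} = f(Z)$ is a measurable function of the representation alone, so independence of $Z$ from $S$ carries over to $\hat{Y}$. For any events $A$ and $B$,

$$ P(\hat{Y} \in A,\, S \in B) = P\big(Z \in f^{-1}(A),\, S \in B\big) = P\big(Z \in f^{-1}(A)\big)\, P(S \in B) = P(\hat{Y} \in A)\, P(S \in B). $$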

Learning Fair Representations

  • Target Attribute: Smile & Demographic Attribute: Gender
  • Problem Definition:
    • Learn a representation $\mathbf{z} \in \mathbb{R}^d$ from data $\mathbf{x}$
    • Retain information necessary to predict target attribute $\mathbf{t}\in\mathcal{T}$
    • Remove information related to a sensitive demographic attribute $\mathbf{s}\in\mathcal{S}$

A Fork in the Road



  • Design metric to measure sensitive demographic attribute information
    • non-parametric statistical dependence measures (see the sketch after this list)


  • Learn metric to measure semantic attribute information
    • probably feasible, many prior attempts
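
As promised above, here is a minimal sketch of one such non-parametric dependence measure: a biased empirical Hilbert-Schmidt Independence Criterion (HSIC) between the representation $Z$ and the sensitive attribute $S$, with Gaussian kernels. The bandwidths and estimator variant are illustrative assumptions.

```python
# Minimal sketch of a kernel dependence measure between Z and S.
import numpy as np

def rbf_kernel(x, sigma=1.0):
    sq = np.sum(x**2, axis=1, keepdims=True)
    d2 = sq + sq.T - 2.0 * x @ x.T            # pairwise squared distances
    return np.exp(-d2 / (2.0 * sigma**2))

def hsic(z, s, sigma_z=1.0, sigma_s=1.0):
    """Biased empirical HSIC; larger values mean stronger Z-S dependence."""
    n = z.shape[0]
    k = rbf_kernel(z, sigma_z)
    l = rbf_kernel(s, sigma_s)
    h = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    return np.trace(k @ h @ l @ h) / (n - 1) ** 2

# Example: a 2-D representation that leaks a scalar attribute.
rng = np.random.default_rng(0)
z = rng.normal(size=(200, 2))
s = z[:, :1] + 0.1 * rng.normal(size=(200, 1))   # S correlated with Z
print(hsic(z, s))                                 # noticeably above zero
```
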
Adversarial Representation Learning

Game Theoretic Formulation

  • Three player game between:
    • Encoder extracts features $\mathbf{z}$
    • Target Predictor for desired task from features $\mathbf{z}$
    • Adversary extracts sensitive information from features $\mathbf{z}$
    $$ \min_{\mathbf{\Theta}_E,\mathbf{\Theta}_T} \underbrace{J_t(\mathbf{\Theta}_E,\mathbf{\Theta}_T)}_{\text{error of target}} \quad \text{s.t.} \quad \min_{\mathbf{\Theta}_A} \underbrace{J_s(\mathbf{\Theta}_E,\mathbf{\Theta}_A)}_{\text{error of adversary}} \geq \alpha $$
  • Adversary: learned measure of semantic attribute information

How do we learn model parameters?

  • Simultaneous/Alternating Stochastic Gradient Descent (see the sketch after this list)
    • Update target while keeping encoder and adversary frozen.
    • Update adversary while keeping encoder and target frozen.
    • Update encoder while keeping target and adversary frozen.
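
A minimal PyTorch sketch of this alternating scheme follows. The architectures, learning rates, and the trade-off weight `alpha` are illustrative assumptions, not the exact objective from any specific paper.

```python
# Minimal sketch: alternating updates for the three-player game.
import torch
import torch.nn as nn

d_x, d_z = 32, 16
encoder = nn.Sequential(nn.Linear(d_x, d_z), nn.ReLU())
target_head = nn.Linear(d_z, 2)   # predicts the target attribute Y
adversary = nn.Linear(d_z, 2)     # tries to recover the sensitive attribute S

opt_e = torch.optim.SGD(encoder.parameters(), lr=1e-2)
opt_t = torch.optim.SGD(target_head.parameters(), lr=1e-2)
opt_a = torch.optim.SGD(adversary.parameters(), lr=1e-2)
ce = nn.CrossEntropyLoss()
alpha = 1.0                       # assumed weight on the adversarial term

def train_step(x, y, s):
    # 1) Update the target predictor (encoder and adversary frozen).
    z = encoder(x).detach()
    opt_t.zero_grad(); ce(target_head(z), y).backward(); opt_t.step()

    # 2) Update the adversary (encoder and target frozen).
    z = encoder(x).detach()
    opt_a.zero_grad(); ce(adversary(z), s).backward(); opt_a.step()

    # 3) Update the encoder: help the target, hurt the adversary.
    z = encoder(x)
    loss = ce(target_head(z), y) - alpha * ce(adversary(z), s)
    opt_e.zero_grad(); loss.backward(); opt_e.step()

# One illustrative step on random data.
x = torch.randn(8, d_x)
y = torch.randint(0, 2, (8,))
s = torch.randint(0, 2, (8,))
train_step(x, y, s)
```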

Three Player Game: Linear Case

  • Global solution is $(w_1, w_2, w_3)=(0, 0, 0)$
[Figure: what we get in practice]
  • P. Roy and V.N. Boddeti, "Mitigating Information Leakage in Image Representations: A Maximum Entropy Approach", CVPR 2019

Many Solutions for Bias Mitigation

    • Standard Adversarial Representation Learning
    • Linear Adversarial Measure: linear dependency between $Z$ and $S$ [ICCV 2019, CVPRW 2020]
    • Non-Linear Adversarial Measure: Beyond linear dependency between $Z$ and $S$, but not all types [ECML 2021]
    • Universal Dependence Measure: All types of dependency between $Z$ and $S$ [TMLR 2022]
    • End-to-End Universal Dependence Measure: All types of dependency between $Z$ and $S$ [CVPR 2024]

Face Image Dataset

  • Liu, Luo, Wang and Tang "Deep Learning Face Attributes in the Wild," ICCV 2015

CelebA Faces

  • $Y$: high cheekbones (binary) and $S$: age and sex (continuous + binary)
  • Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
  • Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024

Folktables

  • $Y$: employment status (binary) and $S$: age (continuous)
  • Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
  • Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024

How about zero-shot models?

Bias in CLIP's Zero-Shot Prediction

  • Dehdashtian*, Wang* and Boddeti, "FairerCLIP: Debiasing CLIP's Zero-Shot Predictions using Functions in RKHSs," ICLR 2024
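
For context, here is a minimal sketch of how CLIP's zero-shot prediction is formed, since this is the pathway that FairerCLIP debiases. It uses the openai/CLIP package; the checkpoint, prompts, and image path are illustrative assumptions.

```python
# Minimal sketch: CLIP zero-shot classification. Group-dependent errors in
# this prediction are the bias being measured and mitigated.
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

image = preprocess(Image.open("face.jpg")).unsqueeze(0).to(device)  # placeholder path
prompts = ["a photo of a person with high cheekbones",
           "a photo of a person without high cheekbones"]
text = clip.tokenize(prompts).to(device)

with torch.no_grad():
    img_feat = model.encode_image(image)
    txt_feat = model.encode_text(text)
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
    txt_feat = txt_feat / txt_feat.norm(dim=-1, keepdim=True)
    # Zero-shot prediction: closest text embedding in cosine similarity.
    probs = (100.0 * img_feat @ txt_feat.T).softmax(dim=-1)

print(probs)  # per-prompt probabilities for this image
```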

Debiasing CLIP Models

  • Dehdashtian*, Wang* and Boddeti, "FairerCLIP: Debiasing CLIP's Zero-Shot Predictions using Functions in RKHSs," ICLR 2024

FairerCLIP: CelebA Dataset


  • $Y$: high cheekbones (binary)
  • $S$: sex (binary)

FairerCLIP: FairFace Dataset

FairerCLIP: Chicago Face Dataset

  • $Y$: attractiveness (binary)
  • $S$: gender (binary)

(Partially) Mitigating Bias in Generative Models

Sampling diverse modes from Generative Models

Mitigating Stereotypes in T2I Models

Sampling Diverse Data from Generative Models

Diverse Mode Coverage

Concluding Remarks

Summary

    • AI systems are progressing at a rapid pace.
      • But they exhibit biases.

    • We need methods for automated auditing of AI systems for bias.

    • The next generation of AI systems has to be designed with bias mitigation in mind.

    • An appreciable gap remains between current solutions and ideal, unbiased AI systems.

Thank You