Measuring and Mitigating Bias in AI
Ethical AI and High-Performance Computing Seminar
Slides: hal.cse.msu.edu/talks
Vishnu Boddeti
Progress In Artificial Intelligence
Speech Processing
Image Analysis
Natural Language Processing
Physical Sciences
Key Drivers
Data, Compute, Algorithms
State of Affairs
(a report from the real world)
Real-world machine learning systems are effective, but they are biased, violate users' privacy, and are not trustworthy.
Research Questions
- Measure bias in AI models.
- Mitigate bias in AI models.
Measuring Bias in Datasets
How about Data?
- DataComp: In search of the next generation of multimodal datasets, NeurIPS D&B 2023
Measuring Hate Content in Text
- Birhane, Prabhu, Han and Boddeti, "On Hate Scaling Laws For Data-Swamps," arXiv:2306.13141
- Birhane, Prabhu, Han, Boddeti and Luccioni, "Into the LAION's Den: Investigating Hate in Multimodal Datasets," NeurIPS D&B Track 2023
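A minimal sketch of how one might flag hateful or toxic captions in a web-scale image-text dump, assuming the off-the-shelf Detoxify classifier; the papers' actual tooling, models, and thresholds may differ.

```python
# Sketch: flag potentially hateful captions with an off-the-shelf classifier.
# Assumes the `detoxify` package; this is illustrative, not the papers' pipeline.
from detoxify import Detoxify

captions = [
    "a photo of a person walking a dog",
    "an example caption scraped from the web",
]

model = Detoxify("original")       # pretrained toxicity/hate classifier
scores = model.predict(captions)   # dict of per-caption score lists

# Count captions whose toxicity score exceeds a chosen threshold.
threshold = 0.5
flagged = [c for c, s in zip(captions, scores["toxicity"]) if s > threshold]
print(f"{len(flagged)} / {len(captions)} captions flagged at threshold {threshold}")
```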
Troubling Trends in Dataset Scaling
Scale exacerbates hate content.
- Birhane, Prabhu, Han and Boddeti, "On Hate Scaling Laws For Data-Swamps," arXiv:2306.13141
- Birhane, Prabhu, Han, Boddeti and Luccioni, "Into the LAION's Den: Investigating Hate in Multimodal Datasets," NeurIPS D&B Track 2023
Measuring Bias in Discriminative Models
Narrative of AI Training: "Moar data! Much wow!"
- Birhane, Prabhu, Han and Boddeti, "On Hate Scaling Laws For Data-Swamps," arXiv:2306.13141
- Birhane*, Dehdashtian*, Prabhu and Boddeti, "The Dark Side of Dataset Scaling: Evaluating Racial Classification in Multimodal Models," FAccT 2024
Evaluation on 14 CLIP Models
- Birhane, Prabhu, Han and Boddeti, "On Hate Scaling Laws For Data-Swamps," arXiv:2306.13141
- Birhane*, Dehdashtian*, Prabhu and Boddeti, "The Dark Side of Dataset Scaling: Evaluating Racial Classification in Multimodal Models," FAccT 2024
Chicago Face Dataset
- human being
- animal
- gorilla
- chimpanzee
- orangutan
- thief
- criminal
- suspicious person
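A minimal sketch of the kind of zero-shot classification being audited here: a face image scored against the label set above. It assumes the Hugging Face `transformers` CLIP API and one specific checkpoint; the study itself evaluates 14 different CLIP models, and `face.jpg` is a hypothetical path.

```python
# Sketch: CLIP zero-shot classification of a face image into the labels above.
# The checkpoint and file path are illustrative assumptions.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

labels = ["human being", "animal", "gorilla", "chimpanzee",
          "orangutan", "thief", "criminal", "suspicious person"]
prompts = [f"a photo of a {label}" for label in labels]

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("face.jpg")  # hypothetical path to one face image
inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
probs = model(**inputs).logits_per_image.softmax(dim=-1)  # shape: (1, 8)

for label, p in zip(labels, probs[0].tolist()):
    print(f"{label:>18s}: {p:.3f}")
```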
Troubling Trends in Dataset Scaling
Scale exacerbates stereotypes.
- Birhane, Prabhu, Han and Boddeti, "On Hate Scaling Laws For Data-Swamps," arXiv:2306.13141
- Birhane*, Dehdashtian*, Prabhu and Boddeti, "The Dark Side of Dataset Scaling: Evaluating Racial Classification in Multimodal Models," FAccT 2024
Fairness: The Multi-Headed Hydra
- Verma and Rubin, "Fairness Definitions Explained," International Workshop on Software Fairness, 2018
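Fairness admits many, often incompatible, definitions. As one concrete instance, here is a sketch of two common group-fairness gaps (demographic parity and equalized odds) computed from binary predictions; the data and names are illustrative.

```python
# Sketch: demographic parity and equalized-odds gaps on illustrative data.
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])   # ground-truth labels
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])   # model predictions
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])   # sensitive attribute (binary)

# Demographic parity gap: difference in positive-prediction rates across groups.
dp_gap = abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

def rate(g, label):
    """Mean prediction within group g, restricted to examples with true label `label`."""
    mask = (group == g) & (y_true == label)
    return y_pred[mask].mean() if mask.any() else 0.0

# Equalized odds gap: largest difference in TPR or FPR across groups.
tpr_gap = abs(rate(0, 1) - rate(1, 1))
fpr_gap = abs(rate(0, 0) - rate(1, 0))

print(f"DP gap = {dp_gap:.2f}, EO gap = {max(tpr_gap, fpr_gap):.2f}")
```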
How Fair is Your ML Model?
- Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
- Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024
How to Estimate these Trade-Offs?
U-FaTE (Utility-Fairness Trade-Off Estimator)
- Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024
Face Image Dataset
- Liu, Luo, Wang and Tang, "Deep Learning Face Attributes in the Wild," ICCV 2015
CelebA Faces
- $Y$: high cheekbones (binary) and $S$: age and sex (continuous + binary)
Evaluation of over 1000 supervised image feature extractors.
- Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
- Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024
CelebA Faces
- $Y$: high cheekbones (binary) and $S$: age and sex (continuous + binary)
Evaluation of over 100 zero-shot multimodal (CLIP) models.
- Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
- Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024
Measuring Bias in Generative Models
High-Quality T2I Models, Same Old Stereotypes
OASIS: Toolbox for Measuring and Understanding Stereotypes
Lower, Yet Significant Stereotypes in Newer T2I Models
Nationality Worsens Existing Gender Stereotypes about Professions
Stereotypes worsen with compositional concepts.
T2I Models Have Stereotypical Predispositions about Nationalities
(Examples: Indian, Mexican)
Mitigating Bias in AI Systems
Mitigating Bias in Discriminative Models
From Fair Learning to Fair Representation Learning
$Z \perp \!\!\! \perp S \Rightarrow \hat{Y} \perp \!\!\! \perp S$
Learning Fair Representations
- Target Attribute: Smile & Demographic Attribute: Gender
- Problem Definition (a minimal setup sketch follows below):
  - Learn a representation $\mathbf{z} \in \mathbb{R}^d$ from data $\mathbf{x}$
  - Retain information necessary to predict the target attribute $\mathbf{t}\in\mathcal{T}$
  - Remove information related to a specified demographic attribute $\mathbf{s}\in\mathcal{S}$
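A minimal sketch of this setup in PyTorch, assuming simple placeholder MLP architectures (an assumption, not the papers' models); the piece that actually removes $\mathbf{s}$ is added via the measures discussed on the next slides.

```python
# Sketch: the fair representation-learning setup described above,
# with placeholder MLPs. Architectures here are illustrative assumptions.
import torch.nn as nn

class Encoder(nn.Module):
    """Maps input x to a representation z in R^d."""
    def __init__(self, in_dim, d):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, d))
    def forward(self, x):
        return self.net(x)

class TargetPredictor(nn.Module):
    """Predicts the target attribute t from z (retain useful information)."""
    def __init__(self, d, num_targets):
        super().__init__()
        self.head = nn.Linear(d, num_targets)
    def forward(self, z):
        return self.head(z)

# Still missing: a penalty that drives z to carry no information about the
# sensitive attribute s; the next slides contrast two ways to obtain one.
```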
A Fork in the Road
- Design a metric to measure sensitive demographic attribute information
  - non-parametric statistical dependence measures (see the HSIC sketch below)
- Learn a metric to measure semantic attribute information
  - probably feasible; many prior attempts
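As one concrete non-parametric dependence measure, here is a sketch of a (biased) empirical HSIC estimate between representations $Z$ and sensitive attributes $S$ with Gaussian kernels; the kernel choice and bandwidth are illustrative assumptions.

```python
# Sketch: biased empirical HSIC between Z and S with Gaussian kernels.
# For characteristic kernels, HSIC(Z, S) = 0 iff Z and S are independent.
import torch

def gaussian_gram(x, sigma=1.0):
    """Gram matrix K_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq_dists = torch.cdist(x, x) ** 2
    return torch.exp(-sq_dists / (2 * sigma ** 2))

def hsic(z, s, sigma=1.0):
    n = z.shape[0]
    K = gaussian_gram(z, sigma)
    L = gaussian_gram(s, sigma)
    H = torch.eye(n) - torch.ones(n, n) / n     # centering matrix
    return torch.trace(K @ H @ L @ H) / (n - 1) ** 2

# Example: z contains s as one coordinate, so HSIC is clearly above zero.
s = torch.randn(128, 1)
z = torch.cat([s, torch.randn(128, 3)], dim=1)
print(float(hsic(z, s)))
```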
Adversarial Representation Learning
Game Theoretic Formulation
- Three player game between:
  - Encoder extracts features $\mathbf{z}$
  - Target Predictor for the desired task from features $\mathbf{z}$
  - Adversary extracts sensitive information from features $\mathbf{z}$

$$
\min_{\mathbf{\Theta}_E,\mathbf{\Theta}_T} \underbrace{\color{cyan}{J_t(\mathbf{\Theta}_E,\mathbf{\Theta}_T)}}_{\color{cyan}{\text{error of target}}} \quad \text{s.t.} \quad \min_{\mathbf{\Theta}_A} \underbrace{\color{orange}{J_s(\mathbf{\Theta}_E,\mathbf{\Theta}_A)}}_{\color{orange}{\text{error of adversary}}} \geq \alpha
$$

- Adversary: learned measure of semantic attribute information
How do we learn model parameters?
- Simultaneous/Alternating Stochastic Gradient Descent (see the training-step sketch below):
  - Update target while keeping encoder and adversary frozen.
  - Update adversary while keeping encoder and target frozen.
  - Update encoder while keeping target and adversary frozen.
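A minimal sketch of one round of these alternating updates in PyTorch, assuming the `Encoder`/`TargetPredictor` modules from the earlier sketch plus an `Adversary` head; the loss weight `lam` and the optimizer setup are illustrative assumptions.

```python
# Sketch: one round of the three alternating updates (target, adversary, encoder).
# t and s are integer class labels; module/optimizer names and `lam` are assumptions.
import torch.nn.functional as F

def training_step(x, t, s, encoder, target_head, adversary,
                  opt_target, opt_adv, opt_enc, lam=1.0):
    # 1) Update target predictor; encoder and adversary stay frozen.
    z = encoder(x).detach()
    opt_target.zero_grad()
    F.cross_entropy(target_head(z), t).backward()
    opt_target.step()

    # 2) Update adversary; encoder and target stay frozen.
    z = encoder(x).detach()
    opt_adv.zero_grad()
    F.cross_entropy(adversary(z), s).backward()
    opt_adv.step()

    # 3) Update encoder; target and adversary stay frozen. The encoder lowers
    #    the target loss while *raising* the adversary's loss (minus sign).
    z = encoder(x)
    enc_loss = F.cross_entropy(target_head(z), t) - lam * F.cross_entropy(adversary(z), s)
    opt_enc.zero_grad()
    enc_loss.backward()
    opt_enc.step()
```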
Three Player Game: Linear Case
- Global solution is $(w_1, w_2, w_3)=(0, 0, 0)$
What we get
What we want
- P. Roy and V.N. Boddeti, "Mitigating Information Leakage in Image Representations: A Maximum Entropy Approach," CVPR 2019
Many Solutions for Bias Mitigation
- Standard Adversarial Representation Learning
- Linear Adversarial Measure: linear dependency between $Z$ and $S$ [ICCV 2019, CVPRW 2020]
- Non-Linear Adversarial Measure: beyond linear dependency between $Z$ and $S$, but not all types [ECML 2021]
- Universal Dependence Measure: all types of dependency between $Z$ and $S$ [TMLR 2022]
- End-to-End Universal Dependence Measure: all types of dependency between $Z$ and $S$ [CVPR 2024]
Face Image Dataset
- Liu, Luo, Wang and Tang, "Deep Learning Face Attributes in the Wild," ICCV 2015
CelebA Faces
- $Y$: high cheekbones (binary) and $S$: age and sex (continuous + binary)
- Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
- Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024
Folktables
- $Y$: employment status (binary) and $S$: age (continuous)
- Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
- Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024
How about zero-shot models?
Bias in CLIP's Zero-Shot Prediction
- Dehdashtian*, Wang* and Boddeti, "FairerCLIP: Debiasing CLIP's Zero-Shot Predictions using Functions in RKHSs," ICLR 2024
Debiasing CLIP Models
- Dehdashtian*, Wang* and Boddeti, "FairerCLIP: Debiasing CLIP's Zero-Shot Predictions using Functions in RKHSs," ICLR 2024
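To make the general idea of debiasing frozen features concrete, here is a simple linear-projection baseline that removes sensitive-attribute directions from CLIP image embeddings. Note this is not FairerCLIP itself, which learns debiasing functions in reproducing kernel Hilbert spaces; all shapes and names below are illustrative.

```python
# Sketch: a linear-projection baseline for debiasing frozen CLIP image features.
# NOT FairerCLIP (which operates with functions in RKHSs); illustration only.
import numpy as np

def remove_sensitive_directions(Z, S, k=1):
    """Project out the top-k directions of Z most covarying with S.

    Z: (n, d) frozen CLIP features; S: (n, m) sensitive-attribute labels.
    """
    Zc = Z - Z.mean(axis=0)
    Sc = S - S.mean(axis=0)
    U, _, _ = np.linalg.svd(Zc.T @ Sc, full_matrices=False)  # (d, m)
    B = U[:, :k]                                              # top-k directions
    return Z - (Z @ B) @ B.T                                  # orthogonal projection

# Usage with illustrative shapes: debiased features can then be scored against
# CLIP text prompts for zero-shot prediction as usual.
Z = np.random.randn(512, 768)                       # frozen image embeddings
S = np.random.randint(0, 2, size=(512, 1)).astype(float)
Z_debiased = remove_sensitive_directions(Z, S, k=1)
```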
FairerCLIP: CelebA Dataset
- $Y$: high cheekbones (binary)
- $S$: sex (binary)
FairerCLIP: FairFace Dataset
FairerCLIP: Chicago Face Dataset
- $Y$: attractiveness (binary)
- $S$: gender (binary)
(Partially) Mitigating Bias in Generative Models
Sampling Diverse Modes from Generative Models
Mitigating Stereotypes in T2I Models
Sampling Diverse Data from Generative Models
Diverse Mode Coverage
Summary
- AI systems are progressing at a rapid pace.
- But they exhibit biases.
- We need methods for automated auditing of AI systems for bias.
- The next generation of AI systems has to be designed with bias mitigation in mind.
- An appreciable gap exists between current solutions and ideal unbiased AI systems.