Measuring and Mitigating Bias in AI
Progress In Artificial Intelligence
Speech Processing
Image Analysis
Natural Language Processing
Physical Sciences
Key Drivers
Data, Compute, Algorithms
(reports from the real world)
"Tay, Microsoft's AI chatbot, gets a crash course in racism from Twitter"
March 24, 2016
"FaceApp's creator apologizes for the app's skin-lightening 'hot' filter"
April 25, 2017
Real-world machine learning systems are effective, but they
are biased,
violate users' privacy, and
are not trustworthy.
Research Questions
- Measure bias in AI models.
- Mitigate bias in AI models.
Measuring Bias in Datasets
How about Data?
- DataComp: In search of the next generation of multimodal datasets, NeurIPS D&B 2023
Measuring Hate Content in Text
- Birhane, Prabhu, Han and Boddeti, "On Hate Scaling Laws For Data-Swamps," arXiv:2306.13141
- Birhane, Prabhu, Han, Boddeti, Luccioni, "Into the LAION's Den: Investigating Hate in Multimodal Datasets," NeurIPS D&B Track 2023
Troubling Trends in Dataset Scaling
Scale exacerbates hate content.
- Birhane, Prabhu, Han and Boddeti, "On Hate Scaling Laws For Data-Swamps," arXiv:2306.13141
- Birhane, Prabhu, Han, Boddeti, Luccioni, "Into the LAION's Den: Investigating Hate in Multimodal Datasets," NeurIPS D&B Track 2023
Narrative of AI Training: "Moar data! Much wow!"
- Birhane, Prabhu, Han and Boddeti, "On Hate Scaling Laws For Data-Swamps," arXiv:2306.13141
- Birhane*, Dehdashtian*, Prabhu and Boddeti, "The Dark Side of Dataset Scaling: Evaluating Racial Classification in Multimodal Models," FAccT 2024
Evaluation on 14 CLIP Models
- Birhane, Prabhu, Han and Boddeti, "On Hate Scaling Laws For Data-Swamps," arXiv:2306.13141
- Birhane*, Dehdashtian*, Prabhu and Boddeti, "The Dark Side of Dataset Scaling: Evaluating Racial Classification in Multimodal Models," FAccT 2024
Chicago Face Dataset
- human being
- animal
- gorilla
- chimpanzee
- orangutan
- thief
- criminal
- suspicious person
Troubling Trends in Dataset Scaling
Scale exacerbates stereotypes.
- Birhane, Prabhu, Han and Boddeti, "On Hate Scaling Laws For Data-Swamps," arXiv:2306.13141
- Birhane*, Dehdashtian*, Prabhu and Boddeti, "The Dark Side of Dataset Scaling: Evaluating Racial Classification in Multimodal Models," FAccT 2024
Fairness: The Multi-Headed Hydra
- Verma and Rubin, "Fairness Definitions Explained," International Workshop on Software Fairness, 2018
Fairness Definitions: Statistical Parity
- $P(\hat{Y}=1|S=1) = P(\hat{Y}=1|S=0)$
Probability of a positive prediction is the same across demographic groups.
- $\hat{Y} \perp \!\!\! \perp S$
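As a rough illustration of the definition above, the gap between the two sides of the statistical parity equation can be estimated from binary predictions. The function name `statistical_parity_difference` and the toy arrays are hypothetical, chosen only for this sketch; a gap of zero would indicate parity on the sample.

```python
import numpy as np

def statistical_parity_difference(y_pred, s):
    """Estimate P(Y_hat=1 | S=1) - P(Y_hat=1 | S=0) from binary arrays."""
    y_pred = np.asarray(y_pred)
    s = np.asarray(s)
    rate_1 = y_pred[s == 1].mean()  # positive-prediction rate in group S=1
    rate_0 = y_pred[s == 0].mean()  # positive-prediction rate in group S=0
    return float(rate_1 - rate_0)

# Toy data (hypothetical): four samples per group.
y_pred = np.array([1, 0, 1, 1, 0, 0, 1, 0])
s = np.array([1, 1, 1, 1, 0, 0, 0, 0])
gap = statistical_parity_difference(y_pred, s)
print(gap)  # 0.75 - 0.25 = 0.5
```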
Fairness Definitions: Equalized Odds
- $P(\hat{Y}=y|Y=y, S=1) = P(\hat{Y}=y|Y=y, S=0)$ for $y \in \{0, 1\}$
True positive and true negative rates (equivalently, true and false positive rates) are the same across demographic groups.
- $\hat{Y} \perp \!\!\! \perp S | Y$
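A minimal sketch of how the equalized odds condition might be checked on a sample, assuming binary labels, predictions, and a binary group attribute. The helper `equalized_odds_gaps` and the toy arrays are hypothetical; the function reports one gap per value of the true label $y$.

```python
import numpy as np

def equalized_odds_gaps(y_true, y_pred, s):
    """For each y in {0, 1}, return |P(Y_hat=y | Y=y, S=1) - P(Y_hat=y | Y=y, S=0)|."""
    y_true, y_pred, s = map(np.asarray, (y_true, y_pred, s))
    gaps = {}
    for y in (0, 1):
        acc_1 = np.mean(y_pred[(y_true == y) & (s == 1)] == y)
        acc_0 = np.mean(y_pred[(y_true == y) & (s == 0)] == y)
        gaps[y] = float(abs(acc_1 - acc_0))
    return gaps

# Toy data (hypothetical): first four samples have S=1, last four have S=0.
y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
y_pred = np.array([1, 1, 0, 1, 1, 0, 0, 0])
s = np.array([1, 1, 1, 1, 0, 0, 0, 0])
gaps = equalized_odds_gaps(y_true, y_pred, s)
print(gaps)  # {0: 0.5, 1: 0.5}
```

Equalized odds holds on the sample only when both gaps are zero.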
Fairness Definitions: Equality of Opportunity
- $P(\hat{Y}=1|Y=1, S=1) = P(\hat{Y}=1|Y=1, S=0)$
Among eligible candidates ($Y=1$), the probability of a positive prediction (the true positive rate) is the same across demographic groups.
- $\hat{Y} \perp \!\!\! \perp S | Y=1$
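Because equality of opportunity conditions only on $Y=1$, a classifier can satisfy it while false positive rates still differ across groups. The sketch below illustrates this on hypothetical toy data; the helper `positive_rate` is an assumed name, not part of any library.

```python
import numpy as np

def positive_rate(y_true, y_pred, s, group, label):
    """Estimate P(Y_hat=1 | Y=label, S=group)."""
    mask = (s == group) & (y_true == label)
    return float(np.mean(y_pred[mask] == 1))

# Toy data (hypothetical): first four samples have S=1, last four have S=0.
y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
y_pred = np.array([1, 0, 1, 1, 1, 0, 0, 0])
s = np.array([1, 1, 1, 1, 0, 0, 0, 0])

# Gap in true positive rates: the quantity equality of opportunity constrains.
tpr_gap = abs(positive_rate(y_true, y_pred, s, 1, 1)
              - positive_rate(y_true, y_pred, s, 0, 1))
# Gap in false positive rates: left unconstrained by equality of opportunity.
fpr_gap = abs(positive_rate(y_true, y_pred, s, 1, 0)
              - positive_rate(y_true, y_pred, s, 0, 0))
print(tpr_gap, fpr_gap)  # 0.0 1.0
```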
How Fair is Your ML Model?
- Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
- Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024
How to Estimate these Trade-Offs?
U-FaTE (Utility-Fairness Trade-Off Estimator)
- Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024
Face Image Dataset
- Liu, Luo, Wang and Tang "Deep Learning Face Attributes in the Wild," ICCV 2015
CelebA Faces
- $Y$: high cheekbones (binary) and $S$: age and sex (continuous + binary)
Evaluation of over 1000 supervised image feature extractors.
- Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
- Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024
CelebA Faces
- $Y$: high cheekbones (binary) and $S$: age and sex (continuous + binary)
Evaluation of over 100 zero-shot multimodal (CLIP) models.
- Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
- Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024
Face Image Dataset
- Karkkainen and Joo "FairFace: Face Attribute Dataset for Balanced Race, Gender, and Age for Bias Measurement and Mitigation," WACV 2021
FairFace Dataset
- $Y$: sex (binary) and $S$: race (7 classes)
Evaluation of over 1000 supervised image feature extractors.
- Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
- Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024
FairFace Dataset
- $Y$: sex (binary) and $S$: race (7 classes)
Evaluation of over 100 zero-shot multimodal (CLIP) models.
- Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
- Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024
Mitigating Bias in AI Systems
From Fair Learning to Fair Representation Learning
$Z \perp \!\!\! \perp S \Rightarrow \hat{Y} \perp \!\!\! \perp S$
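One way to see why the implication holds, assuming the prediction is a deterministic measurable function of the representation, $\hat{Y} = g(Z)$ (the symbol $g$ is introduced here only for illustration): for any measurable sets $A$ and $B$,

```latex
\begin{aligned}
P(\hat{Y} \in A,\, S \in B)
  &= P\big(Z \in g^{-1}(A),\, S \in B\big) \\
  &= P\big(Z \in g^{-1}(A)\big)\, P(S \in B)
     && \text{since } Z \perp\!\!\!\perp S \\
  &= P(\hat{Y} \in A)\, P(S \in B).
\end{aligned}
```

So any predictor built on top of an independent representation inherits statistical parity, which is the motivation for moving the fairness constraint from the predictor to the representation.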
Learning Fair Representations
- Target Attribute: Smile & Demographic Attribute: Gender
- Problem Definition:
- Learn a representation $\mathbf{z} \in \mathbb{R}^d$ from data $\mathbf{x}$
- Retain information necessary to predict target attribute $\mathbf{t}\in\mathcal{T}$
- Remove information related to a desired demographic attribute $\mathbf{s}\in\mathcal{S}$
A Fork in the Road
- Design metric to measure sensitive demographic attribute information
- non-parametric statistical dependence measures
- Learn metric to measure semantic attribute information
- probably feasible, many prior attempts
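To make the "design a metric" branch concrete, here is a sketch of one well-known non-parametric dependence measure, the biased empirical Hilbert-Schmidt Independence Criterion (HSIC) with Gaussian kernels. The slides do not name a specific measure, so the choice of HSIC, the kernel bandwidth `sigma`, and the function names are assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, sigma=1.0):
    """Gaussian kernel matrix for samples stored in the rows of `a`."""
    a = np.asarray(a, dtype=float).reshape(len(a), -1)
    sq = np.sum(a ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * a @ a.T  # squared distances
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

def hsic(z, s, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2."""
    n = len(z)
    K = rbf_kernel(z, sigma)
    L = rbf_kernel(s, sigma)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(K @ H @ L @ H) / (n - 1) ** 2)

rng = np.random.default_rng(0)
s = rng.normal(size=300)
z_dependent = s ** 2                # non-linear dependence on s
z_independent = rng.normal(size=300)
hsic_dep = hsic(z_dependent, s)
hsic_ind = hsic(z_independent, s)
print(hsic_dep, hsic_ind)
```

In this toy setup, `z_dependent` is fully determined by `s` but has essentially no linear correlation with it, which is the kind of dependence a purely linear measure would miss.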
Adversarial Representation Learning
Game Theoretic Formulation
- Three player game between:
- Encoder extracts features $\mathbf{z}$
- Target Predictor for desired task from features $\mathbf{z}$
- Adversary extracts sensitive information from features $\mathbf{z}$
$$\min_{\mathbf{\Theta}_E,\mathbf{\Theta}_T} \underbrace{\color{cyan}{J_t(\mathbf{\Theta}_E,\mathbf{\Theta}_T)}}_{\color{cyan}{\text{error of target}}} \quad \text{s.t.} \quad \min_{\mathbf{\Theta}_A} \underbrace{\color{orange}{J_s(\mathbf{\Theta}_E,\mathbf{\Theta}_A)}}_{\color{orange}{\text{error of adversary}}} \geq \alpha$$
- Adversary: learned measure of semantic attribute information
How do we learn model parameters?
- Simultaneous/Alternating Stochastic Gradient Descent
- Update target while keeping encoder and adversary frozen.
- Update adversary while keeping encoder and target frozen.
- Update encoder while keeping target and adversary frozen.
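The three alternating updates can be sketched on a toy linear model. Everything below is an illustrative assumption rather than the method from the cited work: the encoder, target predictor, and adversary are all linear, both losses are mean squared errors, the constraint is replaced by a penalty with weight `lam`, gradients are full-batch rather than stochastic, and the encoder is renormalized after each step to keep the toy game bounded.

```python
import numpy as np

def mse_grads(x, W_e, w, y):
    """MSE of the linear head w on features z = x @ W_e, with gradients."""
    n = x.shape[0]
    z = x @ W_e                                # features, shape (n, d_z)
    r = z @ w - y                              # residuals, shape (n,)
    loss = float(np.mean(r ** 2))
    grad_w = (2.0 / n) * (z.T @ r)             # gradient w.r.t. the head
    grad_We = (2.0 / n) * (x.T @ np.outer(r, w))  # gradient w.r.t. the encoder
    return loss, grad_w, grad_We

def alternating_updates(x, t, s, d_z=2, steps=200, lr=0.05, lam=0.5, seed=0):
    rng = np.random.default_rng(seed)
    W_e = rng.normal(size=(x.shape[1], d_z))
    W_e /= np.linalg.norm(W_e)
    w_t = np.zeros(d_z)                        # target predictor
    w_a = np.zeros(d_z)                        # adversary
    for _ in range(steps):
        # 1) Update target; encoder and adversary frozen.
        _, g_t, _ = mse_grads(x, W_e, w_t, t)
        w_t -= lr * g_t
        # 2) Update adversary; encoder and target frozen.
        _, g_a, _ = mse_grads(x, W_e, w_a, s)
        w_a -= lr * g_a
        # 3) Update encoder; target and adversary frozen.
        #    Lower the target error while raising the adversary error.
        _, _, gE_t = mse_grads(x, W_e, w_t, t)
        _, _, gE_s = mse_grads(x, W_e, w_a, s)
        W_e -= lr * (gE_t - lam * gE_s)
        W_e /= np.linalg.norm(W_e)             # assumption: keep encoder bounded
    return W_e, w_t, w_a

# Toy data (hypothetical): feature 0 leaks s, feature 1 carries the target.
rng = np.random.default_rng(1)
n = 200
s = rng.normal(size=n)
x = np.column_stack([s + 0.1 * rng.normal(size=n),
                     rng.normal(size=n),
                     rng.normal(size=n)])
t = x[:, 1]
W_e, w_t, w_a = alternating_updates(x, t, s)
```

Each player takes a gradient step on its own objective while the other two are held fixed, so nothing guarantees that the iterates converge to the solution of the constrained problem.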
Three Player Game: Linear Case
- Global solution is $(w_1, w_2, w_3)=(0, 0, 0)$
What we get
What we want
- P. Roy and V.N. Boddeti, "Mitigating Information Leakage in Image Representations: A Maximum Entropy Approach", CVPR 2019
Many Solutions for Bias Mitigation
- Standard Adversarial Representation Learning
- Linear Adversarial Measure: linear dependency between $Z$ and $S$ [ICCV 2019, CVPRW 2020]
- Non-Linear Adversarial Measure: Beyond linear dependency between $Z$ and $S$, but not all types [ECML 2021]
- Universal Dependence Measure: All types of dependency between $Z$ and $S$ [TMLR 2022]
- End-to-End Universal Dependence Measure: All types of dependency between $Z$ and $S$ [CVPR 2024]
Face Image Dataset
- Liu, Luo, Wang and Tang "Deep Learning Face Attributes in the Wild," ICCV 2015
CelebA Faces
- $Y$: high cheekbones (binary) and $S$: age and sex (continuous + binary)
- Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
- Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024
- $Y$: employment status (binary) and $S$: age (continuous)
- Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
- Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024
How about zero-shot models?
Bias in CLIP's Zero-Shot Prediction
- Dehdashtian*, Wang* and Boddeti, "FairerCLIP: Debiasing CLIP's Zero-Shot Predictions using Functions in RKHSs," (ICLR 2024)
Debiasing CLIP Models
- Dehdashtian*, Wang* and Boddeti, "FairerCLIP: Debiasing CLIP's Zero-Shot Predictions using Functions in RKHSs," (ICLR 2024)
FairerCLIP: CelebA Dataset
- $Y$: high cheekbones (binary)
- $S$: sex (binary)
FairerCLIP: FairFace Dataset
FairerCLIP: Chicago Face Dataset
- $Y$: attractiveness (binary)
- $S$: gender (binary)
FairerCLIP: Mitigating Spurious Correlation
W/O Labels
W/ Labels
- AI systems are progressing at a rapid pace.
- But they exhibit biases.
- We need methods for automated auditing of artificial intelligence systems for bias.
- The next generation of artificial intelligence systems has to be designed with bias mitigation built in.
- An appreciable gap exists between current solutions and ideal unbiased AI systems.