The Utility-Fairness Trade-Offs in Learning Fair Representations


Wayne State University (CSE Graduate Seminar)

Slides: hal.cse.msu.edu/talks
Vishnu Boddeti

Progress In Machine Learning

Speech Processing
Image Analysis
Natural Language Processing
Physical Sciences



Key Drivers
Data, Compute, Algorithms

State-of-Affairs

(reports from the real world)
"Machine Bias"

"Facial recognition is accurate, if you're a white guy"

  • Buolamwini and Gebru, "Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification," FAT* 2018
"Black Artists Say A.I. Shows Bias"

"How AI reduces the world to stereotypes"
"LLMs propagate race-based medicine"
Real-world machine learning systems are effective, but they


are biased,


violate users' privacy, and


are not trustworthy.

Today's Agenda



Build ML systems that are fair while retaining utility.

Trade-Offs in Fair Machine Learning

Questions of Interest



    • What are the trade-offs between fairness and utility?


    • When do the trade-offs between fairness and utility exist?


    • How can we explicitly characterize the utility-fairness trade-offs?
The What



Short Answer: Data and Label Space Trade-Offs

What are the Trade-Offs in Fair Machine Learning?

  • Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
  • Dehdashtian, Sadeghi, Boddeti, "The Utility-Fairness Trade-offs in Learning Fair Representations," (Under Review)

Fairness: The Multi-Headed Hydra

Fairness Definitions: Statistical Parity

Fairness Definitions: Equalized Odds

Fairness Definitions: Equality of Opportunity
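The three definitions above differ only in what is conditioned on: statistical parity requires $\hat{Y} \perp S$, equalized odds requires $\hat{Y} \perp S \mid Y$, and equality of opportunity requires it only for $Y=1$. As a concrete reference, for a binary classifier and a binary group they reduce to gap statistics; a minimal NumPy sketch (the function names are my own):

```python
import numpy as np

def statistical_parity_gap(y_hat, s):
    """|P(Yhat=1 | S=0) - P(Yhat=1 | S=1)|: Yhat independent of S."""
    return abs(y_hat[s == 0].mean() - y_hat[s == 1].mean())

def equalized_odds_gap(y_hat, y, s):
    """Max group gap in both TPR and FPR: Yhat independent of S given Y."""
    gaps = []
    for y_val in (0, 1):
        m = y == y_val
        gaps.append(abs(y_hat[m & (s == 0)].mean() - y_hat[m & (s == 1)].mean()))
    return max(gaps)

def equal_opportunity_gap(y_hat, y, s):
    """Group gap in TPR only: Yhat independent of S given Y=1."""
    m = y == 1
    return abs(y_hat[m & (s == 0)].mean() - y_hat[m & (s == 1)].mean())
```

Equality of opportunity is the weakest of the three: its gap is always bounded above by the equalized-odds gap on the same predictions.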

The When



Short Answer: Target and demographic attributes are related.

A Causal Perspective

$Y$ is related to $S \Rightarrow$ trade-off exists.

A Subspace Geometry Perspective

  • Case 1: when $\mathcal{S} \perp \!\!\! \perp \mathcal{T}$ (e.g., Gender, Age)
  • Case 2: when $\mathcal{S} \not\perp \!\!\! \perp \mathcal{T}$ (e.g., High Cheekbones, Gender)
  • Case 3: when $\mathcal{S} \sim \mathcal{T}$ ($\mathcal{T}\subseteq\mathcal{S}$)
  • B. Sadeghi, L. Wang, V.N. Boddeti, ‘‘Adversarial Representation Learning with Closed-Form Solutions," CVPRW 2020
The How



Short Answer: It depends

From Fair Learning to Fair Representation Learning

$Z \perp \!\!\! \perp S \Rightarrow \hat{Y} \perp \!\!\! \perp S$

Learning Fair Representations

  • Target Attribute: Smile & Demographic Attribute: Gender
  • Problem Definition:
    • Learn a representation $\mathbf{z} \in \mathbb{R}^d$ from data $\mathbf{x}$
    • Retain information necessary to predict target attribute $\mathbf{t}\in\mathcal{T}$
    • Remove information related to a desired demographic attribute $\mathbf{s}\in\mathcal{S}$
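As a toy instance of this problem definition, one can project $\mathbf{x}$ onto the orthogonal complement of the single direction linearly correlated with $\mathbf{s}$; this removes only linear dependence and is purely illustrative (the data-generating setup is invented for the demo):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400
s = rng.normal(size=(n, 1))                        # sensitive attribute
t = rng.normal(size=(n, 1))                        # target attribute
x = np.hstack([t + 0.1 * rng.normal(size=(n, 1)),  # data mixes both
               s + 0.1 * rng.normal(size=(n, 1))])

# Cross-covariance between x and s picks out the s-correlated direction
Cxs = (x - x.mean(0)).T @ (s - s.mean(0)) / n      # shape (2, 1)
u = Cxs / np.linalg.norm(Cxs)                      # unit s-direction in x-space
z = x - (x @ u) @ u.T                              # representation: project it out

corr_before = abs(np.corrcoef(x[:, 1], s[:, 0])[0, 1])
corr_after = max(abs(np.corrcoef(z[:, j], s[:, 0])[0, 1]) for j in range(2))
```

In-sample, the covariance of every coordinate of `z` with `s` is exactly zero by construction, while the target-carrying coordinate is nearly untouched; nonlinear dependence on `s`, however, survives this projection, which is what motivates the kernel measures later in the talk.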

A Fork in the Road



  • Design a metric to measure sensitive demographic attribute information
    • non-parametric statistical dependence measures (this talk)


  • Learn a metric to measure semantic attribute information
    • possibly feasible; many prior attempts
Prior Work: Adversarial Representation Learning

Game Theoretic Formulation

  • Three player game between:
    • Encoder extracts features $\mathbf{z}$
    • Target Predictor for desired task from features $\mathbf{z}$
    • Adversary extracts sensitive information from features $\mathbf{z}$
    $$ \begin{equation} \begin{aligned} \min_{\mathbf{\Theta}_E,\mathbf{\Theta}_T} & \underbrace{\color{cyan}{J_t(\mathbf{\Theta}_E,\mathbf{\Theta}_T)}}_{\color{cyan}{\text{error of target}}} \quad s.t. \text{ } \min_{\mathbf{\Theta}_A} \underbrace{\color{orange}{J_s(\mathbf{\Theta}_E,\mathbf{\Theta}_A)}}_{\color{orange}{\text{error of adversary}}} \geq \alpha \nonumber \end{aligned} \end{equation} $$
  • Adversary: learned measure of semantic attribute information

How do we learn model parameters?

  • Simultaneous/Alternating Stochastic Gradient Descent
    • Update target while keeping encoder and adversary frozen.
    • Update adversary while keeping encoder and target frozen.
    • Update encoder while keeping target and adversary frozen.
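The alternating scheme above can be sketched end-to-end with linear players and squared losses. This is a toy illustration of the update order only, not the architecture of the cited work; all dimensions, rates, and the weight decay are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n, dx, dz = 512, 8, 4
X = rng.normal(size=(n, dx))
y = X[:, 0] + 0.1 * rng.normal(size=n)   # target depends on feature 0
s = X[:, 1] + 0.1 * rng.normal(size=n)   # sensitive attribute depends on feature 1

E = 0.1 * rng.normal(size=(dx, dz))      # encoder
wt = np.zeros(dz)                        # target predictor
wa = np.zeros(dz)                        # adversary
lr, lr_e, lam = 0.05, 0.02, 0.5

for step in range(400):
    Z = X @ E
    rt = Z @ wt - y                      # target residual
    ra = Z @ wa - s                      # adversary residual
    wt -= lr * (2 / n) * Z.T @ rt        # 1) update target (encoder, adversary frozen)
    wa -= lr * (2 / n) * Z.T @ ra        # 2) update adversary (encoder, target frozen)
    # 3) update encoder: minimize Jt - lam * Js (help target, hurt adversary),
    #    with small weight decay to keep the min-max dynamics bounded
    gE = (2 / n) * X.T @ (np.outer(rt, wt) - lam * np.outer(ra, wa)) + 0.01 * E
    E -= lr_e * gE

Jt = np.mean((X @ E @ wt - y) ** 2)      # final target error
Js = np.mean((X @ E @ wa - s) ** 2)      # final adversary error
```

Even in this linear toy, the coupled updates oscillate rather than converge cleanly, which previews the instability and lack of invariance guarantees discussed next.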

ARL is Suboptimal




Unstable Optimization




Lack of Invariance Guarantees

Overview of our Solutions

    • Standard Adversarial Representation Learning
    • Linear Adversarial Measure: linear dependency between $Z$ and $S$ [ICCV 2019, CVPRW 2020]
    • Non-Linear Adversarial Measure: Beyond linear dependency between $Z$ and $S$, but not all types [ECML 2021]
    • Universal Dependence Measure: All types of dependency between $Z$ and $S$ [TMLR 2022]
    • End-to-End Universal Dependence Measure: All types of dependency between $Z$ and $S$ [Under Review]

Covariance Operator and Dependence Measure

Linear Dependence: $\displaystyle C_{SZ}\approx\displaystyle \frac{1}{n}{\color{Maroon}\tilde{\bm S}} \tilde{\bm Z}^T$

Universal Dependence: $\displaystyle \Sigma_{SZ}\approx\frac{1}{n}{\color{Maroon}\tilde{\bm K}_S} \tilde{\bm K}_Z$

The covariance functional $\mathcal H_Z\times\mathcal H_S\rightarrow \mathbb R$: $Cov\left(\alpha(Z),\ \beta({\color{Maroon}S})\right) = \big\langle \beta, \Sigma_{SZ}\, \alpha \big\rangle_{\mathcal H_S}$, where $\Sigma_{SZ}:\mathcal H_Z\rightarrow \mathcal H_S$

$Z \perp \!\!\! \perp {\color{Maroon}S} \Leftrightarrow \Sigma_{SZ} =0 \Leftrightarrow \left\|\Sigma_{SZ}\right\| =0$, where $\|\cdot\|$ can be any operator norm
HSIC$(Z,S)=\left\|\Sigma_{SZ}\right\|^2_{\text{HS}}=\displaystyle\sum_{\alpha\in\mathcal U_Z}\sum_{\beta\in \mathcal U_S}Cov^2(\alpha(Z),\beta(S))$, where $\mathcal U_Z$, $\mathcal U_S$ are orthonormal bases of $\mathcal H_Z$, $\mathcal H_S$
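Empirically, HSIC reduces to a trace of products of centered kernel Gram matrices. A sketch of the standard (biased) estimator with RBF kernels; the fixed bandwidth is an arbitrary choice for the demo:

```python
import numpy as np

def rbf_kernel(x, sigma=1.0):
    """Gram matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    d2 = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def hsic(z, s, sigma=1.0):
    """Biased HSIC estimate: tr(HKzH HKsH) / (n-1)^2, H the centering matrix."""
    n = z.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    Kz = H @ rbf_kernel(z, sigma) @ H
    Ks = H @ rbf_kernel(s, sigma) @ H
    return np.trace(Kz @ Ks) / (n - 1) ** 2
```

With a universal kernel, HSIC is zero iff $Z \perp \!\!\! \perp S$, so it detects dependence that any linear covariance measure misses; on finite samples the estimate is nonnegative and strictly larger for dependent pairs.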

Universal Dependence Measure

$\mathcal A_r:=\Big\{(f_1,\cdots,f_r)\,|\,Cov(f_i(X), f_j(X))+\gamma \langle f_i, f_j\rangle_{\mathcal H_X}=\delta_{i,j} \Big\}$
$\displaystyle\sup_{\bm f\in\mathcal A_r} {\Big\{J(\bm f):=\color{ForestGreen}(1-\lambda)\,\text{Dep}(Z, Y)}{\color{Maroon}-\lambda\,\text{Dep}(Z, S) }\Big\}$
Solution: eigenfunctions corresponding to the $r$ largest eigenvalues of the generalized eigenvalue problem $\big({\color{ForestGreen}(1-\lambda)\, \Sigma_{YX}^*\,\Sigma_{YX}}{\color{Maroon}-\lambda\,\Sigma_{SX}^*\Sigma_{SX} }\big)\bm f = \tau (\Sigma_{XX}+\gamma I)\bm f$
  • Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
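In the finite-dimensional linear case, the generalized eigenvalue problem above can be solved by whitening with $(\Sigma_{XX}+\gamma I)^{-1/2}$ and then diagonalizing an ordinary symmetric matrix. A sketch with empirical covariances; the function name and defaults are mine, not from the paper:

```python
import numpy as np

def fair_subspace(X, Y, S, r=2, lam=0.5, gamma=1e-3):
    """Top-r solutions of ((1-lam) Cyx' Cyx - lam Csx' Csx) w = tau (Cxx + gamma I) w."""
    n = X.shape[0]
    Xc, Yc, Sc = X - X.mean(0), Y - Y.mean(0), S - S.mean(0)
    Cxx = Xc.T @ Xc / n
    Cyx = Yc.T @ Xc / n
    Csx = Sc.T @ Xc / n
    A = (1 - lam) * Cyx.T @ Cyx - lam * Csx.T @ Csx
    B = Cxx + gamma * np.eye(X.shape[1])
    # Whitening: B^{-1/2} A B^{-1/2} is symmetric, so eigh applies
    e, U = np.linalg.eigh(B)
    Bih = U @ np.diag(1.0 / np.sqrt(e)) @ U.T
    tau, Q = np.linalg.eigh(Bih @ A @ Bih)
    W = Bih @ Q[:, ::-1][:, :r]   # top-r generalized eigenvectors, B-orthonormal
    return W
```

Sweeping $\lambda$ from 0 to 1 traces the utility-fairness trade-off: at $\lambda=0$ the objective only retains $Y$-relevant directions, while at $\lambda=1$ it only suppresses $S$-dependent ones.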

End-to-End Universal Dependence Measure

  • Learning through alternating optimization
  • Dehdashtian, Sadeghi, Boddeti, "The Utility-Fairness Trade-offs in Learning Fair Representations," (Under Review)

Estimating Trade-Offs on Real Datasets

Face Image Dataset

  • Liu, Luo, Wang and Tang "Deep Learning Face Attributes in the Wild," ICCV 2015

CelebA Faces

  • $Y$: high cheekbones (binary) and $S$: age and sex (continuous + binary)
  • Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
  • Dehdashtian, Sadeghi, Boddeti, "The Utility-Fairness Trade-offs in Learning Fair Representations," (Under Review)

Folktables

  • $Y$: employment status (binary) and $S$: age (continuous)
  • Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
  • Dehdashtian, Sadeghi, Boddeti, "The Utility-Fairness Trade-offs in Learning Fair Representations," (Under Review)

Other Aspects of Fairness in AI

How about zero-shot models?

Fairness of Zero-Shot Predictions


  • $Y$: high cheekbones (binary)
  • $S$: sex (binary)
  • Dehdashtian, Wang, Boddeti, "FairVLM: Mitigating Bias in Pre-Trained Vision-Language Models," Under Review

FairFace

  • Dehdashtian, Wang, Boddeti, "FairVLM: Mitigating Bias in Pre-Trained Vision-Language Models," Under Review

Chicago Face Dataset

  • $Y$: attractiveness (binary)
  • $S$: gender (binary)
  • Dehdashtian, Wang, Boddeti, "FairVLM: Mitigating Bias in Pre-Trained Vision-Language Models," Under Review

How about Data?


  • DataComp: In search of the next generation of multimodal datasets, NeurIPS D&B 2023

Troubling Trends in Dataset Scaling

Scale exacerbates hate content.
  • Birhane, Prabhu, Han and Boddeti, "On Hate Scaling Laws For Data-Swamps," arXiv:2306.13141
  • Birhane, Prabhu, Han, Boddeti, Luccioni, "Into the LAION's Den: Investigating Hate in Multimodal Datasets," NeurIPS D&B Track 2023

Troubling Trends in Dataset Scaling

Scale exacerbates stereotypes.
  • Birhane, Prabhu, Han and Boddeti, "On Hate Scaling Laws For Data-Swamps," arXiv:2306.13141
  • Birhane, Prabhu, Han, Boddeti, Luccioni, "Into the LAION's Den: Investigating Hate in Multimodal Datasets," NeurIPS D&B Track 2023

Concluding Remarks

Summary

    • The next generation of machine learning systems has to be designed with fairness constraints.

    • Identified two trade-offs between utility and fairness.

    • Proposed algorithms to characterize the utility-fairness trade-off.

    • An appreciable gap exists between current solutions and the ideal trade-off.

Thank You