Purdue University

Vishnu Boddeti

September 14, 2020

Face

Fingerprint

Iris/Periocular

Gait

March 24, 2016

April 25, 2017

Feb. 09, 2018

- Buolamwini and Gebru, "Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification," FAT 2018

Jan. 18, 2020

- Training:
- Inference: Microsoft Gender classification

- Training:
- Inference: Microsoft Smile classification
- Target Task
- Smile: 93.1%
- Privacy Leakage
- Gender: 82.9%

- B. Sadeghi, L. Wang, V.N. Boddeti, "Adversarial Representation Learning With Closed-Form Solvers," CVPRW 2020

- Learned Embeddings:
- Attacks on Embeddings:

- Mai et al., "On the reconstruction of face images from deep face templates," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018

Recklessly absorb all statistical correlations in data

- Target Concept: Smile & Private Concept: Gender
- Problem Definition:
- Learn a representation $\mathbf{z} \in \mathbb{R}^d$ from data $\mathbf{x}$
- Retain information necessary to predict target attribute $\mathbf{t}\in\mathcal{T}$
- Remove information related to a desired sensitive attribute $\mathbf{s}\in\mathcal{S}$

- How to explicitly control semantic information in learned representations?
- Can we explicitly control semantic information in learned representations?

- Case 1: when $\mathcal{S} \perp \!\!\! \perp \mathcal{T}$ (Gender, Age)

- Case 2: when $\mathcal{S} \not\perp \!\!\! \perp \mathcal{T}$ (Car, Wheels)

- Case 3: when $\mathcal{S} \sim \mathcal{T}$ ($\mathcal{T}\subseteq\mathcal{S}$)

- B. Sadeghi, L. Wang, V.N. Boddeti, "Adversarial Representation Learning With Closed-Form Solvers," CVPRW 2020

- *Design* a metric to measure semantic attribute information: not obvious how
- *Learn* a metric to measure semantic attribute information: probably feasible

- Three player game between:
- Encoder extracts features $\mathbf{z}$
- Target Predictor for desired task from features $\mathbf{z}$
- Adversary extracts sensitive information from features $\mathbf{z}$
- Adversary: learned measure of semantic attribute information

- Simultaneous/Alternating Stochastic Gradient Descent
- Update __target__ while keeping encoder and adversary frozen.
- Update __adversary__ while keeping encoder and target frozen.
- Update __encoder__ while keeping target and adversary frozen.
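
The alternating schedule above can be sketched on a toy problem. This is a minimal illustration, not the method from the cited papers: all three players are assumed to be linear maps trained with squared error, the encoder loss is assumed to be the zero-sum form (target error minus `alpha` times adversary error), and the data, sizes, and names (`update_head`, `update_encoder`) are hypothetical.

```python
import numpy as np

def mse(pred, y):
    return float(np.mean((pred - y) ** 2))

def update_head(z, w, y, lr):
    """One gradient step for a linear head on frozen features z."""
    grad = (2.0 / len(z)) * z.T @ (z @ w - y)
    return w - lr * grad

def update_encoder(x, W_e, w_t, w_a, t, s, lr, alpha):
    """One gradient step on the encoder with both heads frozen.

    Assumed zero-sum encoder loss: mse(target) - alpha * mse(adversary).
    """
    z = x @ W_e
    g_t = (2.0 / len(x)) * x.T @ (z @ w_t - t) @ w_t.T
    g_a = (2.0 / len(x)) * x.T @ (z @ w_a - s) @ w_a.T
    return W_e - lr * (g_t - alpha * g_a)

# Hypothetical toy data: target depends on feature 0, sensitive on feature 1.
rng = np.random.default_rng(0)
n, d, r = 256, 4, 2
x = rng.normal(size=(n, d))
t, s = x[:, [0]], x[:, [1]]

W_e = rng.normal(scale=0.1, size=(d, r))  # encoder: z = x @ W_e
w_t = np.zeros((r, 1))                    # target predictor
w_a = np.zeros((r, 1))                    # adversary
lr, alpha = 0.05, 1.0

for _ in range(20):  # a few rounds only; convergence is not claimed
    z = x @ W_e
    w_t = update_head(z, w_t, t, lr)      # encoder and adversary frozen
    w_a = update_head(z, w_a, s, lr)      # encoder and target frozen
    W_e = update_encoder(x, W_e, w_t, w_a, t, s, lr, alpha)  # heads frozen
```

Each step touches only one player's parameters, which is the point of the schedule. The loop is kept short on purpose: with an unbounded adversary error term, longer runs of this zero-sum sketch are not guaranteed to settle.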

- Global solution is $(w_1, w_2, w_3)=(0, 0, 0)$

- Non-Zero Sum Formulation for Iterative Methods (CVPR'19)
- Standard setting, each player is a deep neural network.
- Local optima
- Global Optima for Kernel Methods (ICCV'19)
- Simplified setting, each player is linear.
- closed form solution + stable + performance bounds
- Hybrid Model with CNNs and Closed-Form Solvers (CVPRW'20)
- Standard setting, encoder is a deep neural network, other players are closed-form solvers.
- Local optima

- Limitations:
- Encoder target distribution leaks information!
- Practice: simultaneous SGD does not reach equilibrium
- Class Imbalance: likelihood biases solution to majority class

Encoder optimizes entropy of adversary instead of likelihood.

Converges to Local Optima

- Three player game between:
- Encoder extracts features $\mathbf{z}$
- Target Predictor for desired task from features $\mathbf{z}$
- Adversary extracts sensitive information from features $\mathbf{z}$
- Three Player Non-Zero Sum Game:

\begin{equation}
\begin{aligned}
\min_{\mathbf{\theta}_A} & \mbox{ } \underbrace{\color{orange}{J_1(\mathbf{\theta}_E,\mathbf{\theta}_A)}}_{\color{orange}{\mbox{error of adversary}}} \\
\min_{\mathbf{\theta}_E,\mathbf{\theta}_T} & \mbox{ } \underbrace{\color{cyan}{J_2(\mathbf{\theta}_E,\mathbf{\theta}_T)}}_{\color{cyan}{\mbox{error of target}}} - \alpha \underbrace{\color{orange}{J_3(\mathbf{\theta}_E,\mathbf{\theta}_A)}}_{\color{orange}{\mbox{entropy of adversary}}} \nonumber
\end{aligned}
\end{equation}
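
To see why the encoder objective uses the adversary's entropy rather than its likelihood, the two quantities can be compared on a few hand-picked adversary outputs. This is a small sketch under assumptions: a binary sensitive attribute, a true label of 0, and made-up probability vectors.

```python
import math

def nll(probs, label):
    """Negative log-likelihood of the true label under the adversary."""
    return -math.log(probs[label])

def entropy(probs):
    """Shannon entropy (in nats) of the adversary's predicted distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Hypothetical adversary outputs for a sample whose true sensitive label is 0.
confident_right = [0.98, 0.02]
confident_wrong = [0.02, 0.98]  # flipping the prediction recovers the label
uniform = [0.5, 0.5]

for name, p in [("right", confident_right),
                ("wrong", confident_wrong),
                ("uniform", uniform)]:
    print(f"{name:8s} nll={nll(p, 0):.3f} entropy={entropy(p):.3f}")
```

An encoder that maximizes the adversary's negative log-likelihood prefers the confidently wrong output, which still reveals the label. An encoder that maximizes entropy prefers the uniform output, which carries no information about it.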

\begin{equation} \begin{aligned} \min_{\mathbf{\Theta}_E} & \ \ {\color{cyan}{J_t(\mathbf{\Theta}_E)}} \\ \mathrm {s.t. \ \ } & {\color{orange}{J_s (\mathbf{\Theta}_E) \ge \alpha}} \nonumber \end{aligned} \end{equation}

- Non-convexity: feasible set is non-convex
- Non-differentiability: solution is either a plane or a line

- B. Sadeghi, R. Yu, V.N. Boddeti, "On the Global Optima of Kernelized Adversarial Representation Learning," ICCV 2019

__Lagrangian formulation:__
\begin{equation}
\min_{\mathbf{\Theta}_E} \Big\{(1-\lambda){\color{cyan}{J_t(\mathbf{\Theta}_E)}}- (\lambda) {\color{orange}{J_s (\mathbf{\Theta}_E)} }\Big\} \nonumber
\end{equation}

Non-Convex + Non-Differentiable

__Solution:__
\begin{equation}
\mathbf{\Theta}_E, r^*=\mbox{Negative Eig} \Big\{\mathbf{X}\left(\lambda \color{orange}{\mathbf{S}^T \mathbf{S}} - (1-\lambda)\color{cyan}{\mathbf{Y}^T \mathbf{Y}} \right)\mathbf{X}^T \Big\}\nonumber
\end{equation}

Global Optima + Optimal Dimensionality + Performance Bounds
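
The "Negative Eig" solution above can be read literally as: form the symmetric matrix, take its eigendecomposition, and keep the eigenvectors whose eigenvalues are negative. The sketch below does exactly that and nothing more. The shapes ($\mathbf{X}$ is $d \times n$, $\mathbf{S}$ is $q \times n$, $\mathbf{Y}$ is $p \times n$, one column per sample), the pre-centered inputs, the tolerance, and the function name `negative_eig_encoder` are assumptions; the cited paper may apply further normalization that the slide does not show.

```python
import numpy as np

def negative_eig_encoder(X, S, Y, lam, tol=1e-9):
    """Eigenvectors of X (lam S^T S - (1 - lam) Y^T Y) X^T with negative eigenvalues.

    Assumed shapes: X (d, n), S (q, n), Y (p, n); inputs assumed mean-centered.
    Returns (Theta, r) with Theta of shape (r, d).
    """
    M = X @ (lam * S.T @ S - (1.0 - lam) * Y.T @ Y) @ X.T
    M = 0.5 * (M + M.T)                   # symmetrize against round-off
    evals, evecs = np.linalg.eigh(M)      # eigenvalues in ascending order
    scale = max(1.0, float(np.abs(evals).max()))
    keep = evals < -tol * scale           # strictly negative, up to tolerance
    return evecs[:, keep].T, int(keep.sum())

# Hypothetical toy data: target is feature 0, sensitive attribute is feature 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 500))
X -= X.mean(axis=1, keepdims=True)
Y = X[[0], :]
S = X[[1], :]

Theta, r_star = negative_eig_encoder(X, S, Y, lam=0.5)
print(r_star, np.round(Theta, 2))
```

On this toy data the encoder comes out one-dimensional and points almost entirely along feature 0, so the embedding keeps the target direction and drops the sensitive one. The sign of an eigenvector is arbitrary, so only the magnitude of each entry is meaningful.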

- B. Sadeghi, R. Yu, V.N. Boddeti, "On the Global Optima of Kernelized Adversarial Representation Learning," ICCV 2019

- Encoder extracts features $\mathbf{z}$
- Target Predictor: kernel ridge regressor to predict target from $\mathbf{z}$
- Adversary: kernel ridge regressor to extract sensitive information from $\mathbf{z}$

- B. Sadeghi, L. Wang, V.N. Boddeti, "Adversarial Representation Learning With Closed-Form Solvers," CVPRW 2020
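
The target predictor and adversary in this hybrid setup are kernel ridge regressors, which have a closed-form fit. The sketch below shows only that closed-form step on fixed embeddings; it does not train an encoder. The Gaussian kernel, the regularization value, the toy embeddings, and the names `rbf_kernel`, `krr_fit`, and `krr_predict` are assumptions for illustration.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian kernel matrix between rows of A and rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def krr_fit(Z, Y, reg=1e-2, gamma=1.0):
    """Closed-form kernel ridge regression: coef = (K + reg * I)^{-1} Y."""
    K = rbf_kernel(Z, Z, gamma)
    return np.linalg.solve(K + reg * np.eye(len(Z)), Y)

def krr_predict(Z_train, coef, Z_new, gamma=1.0):
    return rbf_kernel(Z_new, Z_train, gamma) @ coef

# Hypothetical embeddings: the target depends on z, the sensitive label does not.
rng = np.random.default_rng(0)
Z = rng.uniform(-2.0, 2.0, size=(120, 2))
t = np.sin(Z[:, :1])
s = rng.normal(size=(120, 1))

coef_t = krr_fit(Z, t)   # target predictor
coef_a = krr_fit(Z, s)   # adversary
target_mse = float(np.mean((krr_predict(Z, coef_t, Z) - t) ** 2))
adversary_mse = float(np.mean((krr_predict(Z, coef_a, Z) - s) ** 2))
print(target_mse, adversary_mse)
```

Because both fits are a single linear solve, there is no inner training loop for the target predictor or the adversary; only the encoder would need gradient-based updates.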

- Embedding Dimensionality
- Number of negative eigenvalues of
\begin{equation} \mathbf{B} = \lambda \tilde{\mathbf{S}}^T \tilde{\mathbf{S}} -(1-\lambda)\tilde{\mathbf{Y}}^T \tilde{\mathbf{Y}} \end{equation}
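
The dimensionality rule above amounts to counting negative eigenvalues of $\mathbf{B}$. A minimal sketch follows, assuming $\mathbf{S}$ is $q \times n$ and $\mathbf{Y}$ is $p \times n$ with one column per sample, and reading the tilde as mean-centering; the function name `embedding_dim`, the tolerance, and the random labels are hypothetical.

```python
import numpy as np

def embedding_dim(S, Y, lam, tol=1e-9):
    """Count negative eigenvalues of B = lam * S~^T S~ - (1 - lam) * Y~^T Y~.

    Assumed shapes: S (q, n), Y (p, n); the tilde is read as mean-centering.
    """
    Sc = S - S.mean(axis=1, keepdims=True)
    Yc = Y - Y.mean(axis=1, keepdims=True)
    B = lam * Sc.T @ Sc - (1.0 - lam) * Yc.T @ Yc
    evals = np.linalg.eigvalsh(B)
    scale = max(1.0, float(np.abs(evals).max()))
    return int(np.sum(evals < -tol * scale))

# Hypothetical labels: a 3-dimensional target and a 1-dimensional sensitive label.
rng = np.random.default_rng(0)
Y = rng.normal(size=(3, 50))
S = rng.normal(size=(1, 50))

for lam in (0.0, 0.5, 1.0):
    print(lam, embedding_dim(S, Y, lam))

# Fully overlapping case: sensitive labels equal target labels.
print(embedding_dim(Y, Y, 0.25), embedding_dim(Y, Y, 0.75))
```

With $\lambda = 1$ the matrix is positive semidefinite, so the count is zero. When the sensitive labels equal the target labels, $\mathbf{B}$ reduces to $(2\lambda - 1)\tilde{\mathbf{Y}}^T\tilde{\mathbf{Y}}$, and the count drops from the rank of $\tilde{\mathbf{Y}}$ to zero as $\lambda$ crosses $0.5$.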

- UCI Adult Dataset (creditworthiness, gender)

Method | Income | Gender | $\Delta^*$ |
---|---|---|---|
Raw Data | 84.3 | 98.2 | 22.8 |
Remove Gender | 84.2 | 83.6 | 16.1 |
Zero-Sum game | 84.4 | 67.7 | 0.3 |
Non-Zero-Sum Game | 84.6 | 67.3 | 0.1 |
Global-Optima | 84.1 | 67.4 | 0.0 |
Hybrid | 83.8 | 67.4 | 0.0 |

- CelebA Dataset (smile, gender)

Method | Smile | Gender | $\Delta^*$ |
---|---|---|---|
Raw Data | 93.1 | 82.9 | 21.5 |
Zero-Sum game | 91.8 | 72.5 | 11.1 |
Non-Zero-Sum Game | 91.6 | 62.1 | 0.7 |
Global-Optima | 92.0 | 61.4 | 0.0 |
Hybrid | 92.5 | 61.4 | 0.0 |

- 38 identities and 5 illumination directions
- Target: Identity Label
- Sensitive: Illumination Label

Method | $s$ (lighting) | $t$ (identity) |
---|---|---|
Raw Data | 96 | 78 |
NN + MMD (NeurIPS 2014) | - | 82 |
VFAE (ICLR 2016) | 57 | 85 |
Zero-Sum Game (NeurIPS 2017) | 57 | 89 |
Non-Zero-Sum Game | 40 | 89 |
Global-Optima | 20 | 86 |

- Understand fundamental trade-off between utility and fairness.
- Understand achievable trade-off between utility and fairness.
- Optimization of adversarial training, especially three player games under general settings.
- $\dots$

- A step towards explicit control of:
- semantic information in learned representations
- access to information in learned representations
- Many unanswered open questions and practical challenges.