Fairness and Privacy in Artificial Intelligence


RAISE

Vishnu Boddeti

March 24, 2021


Progress in Artificial Intelligence

Speech Processing
Image Analysis
Natural Language Processing
Robotics



Key Driver
Data, Compute, Algorithms

State-of-Affairs

(reports from the real world)
"Tay, Microsoft's AI chatbot, gets a crash course in racism from Twitter"




"FaceApp's creator apologizes for the app's skin-lightening 'hot' filter"

"Facial recognition is accurate, if you're a white guy"

  • Buolamwini and Gebru, "Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification," FAT* 2018
"The Secretive Company That Might End Privacy as We Know It"

Real-world artificial intelligence systems are effective but,


are biased,


violate users' privacy, and


are not trustworthy.

Today's Agenda



Build effective AI systems that are fair and trustworthy.
Fair and Trustworthy AI


Mechanism: control semantic information in data representations

100 Years of Data Representations


Control Mathematical Concepts
variance, sparsity, translation, rotation, scale, etc.
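As a concrete instance, PCA controls variance: it keeps the directions along which the data varies most. A minimal NumPy sketch on toy data (illustration only, not tied to any system discussed here):

```python
import numpy as np

# Toy data: 500 samples in 10 dimensions
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
X -= X.mean(axis=0)  # center the data

# Eigen-decomposition of the sample covariance matrix
cov = X.T @ X / (len(X) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order

# PCA representation: project onto the top-3 variance directions
W = eigvecs[:, -3:]
Z = X @ W
print("fraction of variance retained:", eigvals[-3:].sum() / eigvals.sum())
```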

Bias in Learning

    • Training:
    • Inference: Microsoft Gender classification
  • Buolamwini and Gebru, "Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification," FAT* 2018

Privacy Leakage

    • Training:
    • Inference: Microsoft Smile classification
  • B. Sadeghi, L. Wang, V.N. Boddeti, "Adversarial Representation Learning With Closed-Form Solvers," CVPRW 2020
What is going on?
Dark Secret of Deep Learning

Deep networks recklessly absorb all statistical correlations in the training data.

So What?

Demographic Bias
0.7% error (lighter skin) vs. 12.9% error (darker skin)
Overfitting to Domain
representations encode the classification/regression task as well as the domain

Next Era of Data Representations

Control Semantic Concepts
age, gender, domain, etc.

Controlling Semantic Information

  • Target Concept: Smile & Private Concept: Gender
  • Problem Definition:
    • Learn a representation $\mathbf{z} \in \mathbb{R}^d$ from data $\mathbf{x}$
    • Retain information necessary to predict target attribute $\mathbf{t}\in\mathcal{T}$
    • Remove information related to a given sensitive attribute $\mathbf{s}\in\mathcal{S}$ (see the sketch below)
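One common mechanism for this problem definition is adversarial representation learning: an encoder and target predictor are trained against an adversary that tries to recover the sensitive attribute from $\mathbf{z}$. A minimal PyTorch sketch; the dimensions, architectures, and trade-off weight `alpha` are all hypothetical, and this min-max recipe is an illustration rather than the closed-form method of the cited work:

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration
d_in, d_z, n_target, n_sensitive = 128, 32, 2, 2

encoder   = nn.Sequential(nn.Linear(d_in, d_z), nn.ReLU())  # x -> z
predictor = nn.Linear(d_z, n_target)      # predicts target t from z
adversary = nn.Linear(d_z, n_sensitive)   # tries to recover s from z

opt_enc = torch.optim.Adam(
    list(encoder.parameters()) + list(predictor.parameters()), lr=1e-3)
opt_adv = torch.optim.Adam(adversary.parameters(), lr=1e-3)
ce = nn.CrossEntropyLoss()
alpha = 1.0  # utility vs. sensitive-information-removal trade-off

def train_step(x, t, s):
    # 1) Adversary: predict the sensitive attribute from a frozen z
    z = encoder(x).detach()
    loss_adv = ce(adversary(z), s)
    opt_adv.zero_grad(); loss_adv.backward(); opt_adv.step()

    # 2) Encoder + predictor: keep target info, fool the adversary
    z = encoder(x)
    loss = ce(predictor(z), t) - alpha * ce(adversary(z), s)
    opt_enc.zero_grad(); loss.backward(); opt_enc.step()

# Example usage with random stand-ins for real data
x = torch.randn(64, d_in)
t = torch.randint(0, n_target, (64,))
s = torch.randint(0, n_sensitive, (64,))
train_step(x, t, s)
```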

Technical Challenge



    • Can we explicitly control semantic information in learned representations?


    • How do we explicitly control semantic information in learned representations?
The Can



Short Answer: Yes, we can, sometimes.

A Subspace Geometry Perspective

  • Case 1: when $\mathcal{S} \perp \!\!\! \perp \mathcal{T}$ (Gender, Age; see the projection sketch below)
  • Case 2: when $\mathcal{S} \not\perp \!\!\! \perp \mathcal{T}$ (Car, Wheels)
  • Case 3: when $\mathcal{S} \sim \mathcal{T}$ ($\mathcal{T}\subseteq\mathcal{S}$)
  • B. Sadeghi, L. Wang, V.N. Boddeti, "Adversarial Representation Learning With Closed-Form Solvers," CVPRW 2020
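The subspace intuition behind Case 1: if the sensitive attribute lives along a direction orthogonal to what the target needs, projecting the data onto the orthogonal complement of that direction removes it exactly. A toy NumPy sketch, assuming the sensitive attribute is captured by a single known linear direction (not the closed-form solver of the citation above):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))  # toy data

# Assume a linear probe for the sensitive attribute has unit weights w_s
w_s = rng.normal(size=(5, 1))
w_s /= np.linalg.norm(w_s)

# Project onto the orthogonal complement of the sensitive direction
P = np.eye(5) - w_s @ w_s.T
Z = X @ P

# Any linear probe along w_s now outputs exactly zero
print(np.allclose(Z @ w_s, 0))  # True
```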
The How



Short Answer: It depends.

Practical Applications

Application-1: Fair Classification

  • UCI Adult Dataset (income, gender)
Method | Income (%) | Gender (%) | $\Delta^*$
Raw Data | 84.3 | 98.2 | 22.8
Remove Gender | 84.2 | 83.6 | 16.1
Our Approach | 84.1 | 67.4 | 0.0
$^*$ Absolute difference between adversary accuracy and random chance (see the sketch below)
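One way $\Delta^*$ can be measured: freeze the learned representations, train a fresh adversary on them, and compare its test accuracy to chance. A hedged sketch with scikit-learn, using random arrays as stand-ins for real embeddings and labels:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Random stand-ins for frozen representations and sensitive labels
rng = np.random.default_rng(0)
Z_train, Z_test = rng.normal(size=(800, 32)), rng.normal(size=(200, 32))
s_train, s_test = rng.integers(0, 2, 800), rng.integers(0, 2, 200)

# Train a fresh adversary on the frozen representations
adv = LogisticRegression(max_iter=1000).fit(Z_train, s_train)
adv_acc = accuracy_score(s_test, adv.predict(Z_test))

chance = 0.5  # placeholder; use the attribute's actual chance level
delta = abs(adv_acc - chance)  # Delta*: leakage beyond guessing
print(f"adversary accuracy: {adv_acc:.3f}, Delta* = {delta:.3f}")
```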

Fair Classification: Interpreting Encoder Weights

Embedding Weights (Adult Dataset)

Application-2: Mitigating Privacy Leakage

  • CelebA Dataset (smile, gender)
Method | Smile (%) | Gender (%) | $\Delta^*$
Raw Data | 93.1 | 82.9 | 21.5
Zero-Sum Game | 91.8 | 72.5 | 11.1
Our Approach | 92.5 | 61.4 | 0.0
$^*$ Absolute difference between adversary accuracy and random chance

Application-3: Mitigating Privacy Leakage

Application-4: Illumination Invariance



  • 38 identities and 5 illumination directions
  • Target: Identity Label
  • Sensitive: Illumination Label

Privacy-Preserving AI



Mechanism: control access to information in data representations

Privacy Leakage in Augmented Reality

  • Pittaluga et al., "Revealing Scenes by Inverting Structure from Motion Reconstructions," CVPR 2019

Information Leakage from Gradients

  • Yonetani et al., "Privacy-Preserving Visual Learning Using Doubly Permuted Homomorphic Encryption," ICCV 2017

Learning from Private Data: Federated Learning

  • Distributed learning of model parameters from private data.
  • Clients download the current global model $\bar{\mathbf{w}}_t$.
  • Each client updates the model on its local data.
  • The aggregator updates the global model (see the sketch below).
  • Yonetani et al., "Privacy-Preserving Visual Learning Using Doubly Permuted Homomorphic Encryption," ICCV 2017
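A minimal NumPy sketch of the federated averaging loop these bullets describe, with a toy linear-regression objective; client sampling, weighting by local dataset size, and the encrypted aggregation of the cited work are omitted:

```python
import numpy as np

def local_update(w, X, y, lr=0.1, epochs=5):
    """One client's gradient steps on its private data (toy least squares)."""
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

# Five clients, each holding private data that never leaves the device
rng = np.random.default_rng(0)
clients = [(rng.normal(size=(50, 4)), rng.normal(size=50)) for _ in range(5)]
w_global = np.zeros(4)

for _ in range(20):  # communication rounds
    # Each client downloads the current global model and updates it locally
    local_ws = [local_update(w_global.copy(), X, y) for X, y in clients]
    # Aggregator averages the client models; raw updates can still leak
    # information, which motivates the encrypted aggregation in the citation
    w_global = np.mean(local_ws, axis=0)
```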

So What?

$\dots$ consent should be given for all purposes $\dots$

Encryption: The Holy Grail?

  • Data encryption is an attractive option for protection:
    • protects users' privacy
    • enables free and open sharing
    • mitigates legal and ethical issues


Goal
efficient learning directly from encrypted data
efficient inference directly on encrypted data

Learning from Private Data

Homomorphic Encryption for Learning Sparse Models
Facial Attribute Recognition
Method | Accuracy (%) | Privacy
LLWT15 | 87 | No
DP | 78 | Yes
DP+SGD | 64 | Yes
Our Approach | 84 | Yes
  • Yonetani et al., "Privacy-Preserving Visual Learning Using Doubly Permuted Homomorphic Encryption," ICCV 2017

Inference on Private Data

Amortized Homomorphically Encrypted Inner Products
  • V.N. Boddeti, "Secure Face Matching Using Fully Homomorphic Encryption," BTAS 2018
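The cited work uses fully homomorphic encryption with amortized (batched) inner products. As a simpler, self-contained illustration of the idea, here is a toy additively homomorphic (Paillier-style) sketch in which a server computes the inner product of a client's encrypted feature vector with plaintext weights, without ever seeing the features. Parameters are tiny and insecure; this is not the scheme from the paper:

```python
import math, random

# Toy Paillier keys (insecure; real deployments use ~1024-bit primes)
p, q = 1789, 1867
n, n2 = p * q, (p * q) ** 2
lam = math.lcm(p - 1, q - 1)
g = n + 1
mu = pow((pow(g, lam, n2) - 1) // n, -1, n)

def enc(m):
    r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def dec(c):
    return ((pow(c, lam, n2) - 1) // n * mu) % n

# Additive homomorphism: Enc(a)*Enc(b) = Enc(a+b), Enc(a)^k = Enc(k*a)
x = [3, 1, 4, 1, 5]  # client's private feature vector
w = [2, 7, 1, 8, 2]  # server's plaintext template/weights
cx = [enc(v) for v in x]  # client sends only ciphertexts

acc = enc(0)
for c, wi in zip(cx, w):  # server computes Enc(<x, w>) blindly
    acc = (acc * pow(c, wi, n2)) % n2

assert dec(acc) == sum(a * b for a, b in zip(x, w))  # 35
```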

Open Problems

    • How do we mitigate bias in AI?

    • How do we ensure AI systems do not violate user privacy?

    • Understand the fundamental trade-off between utility and fairness.

    • Understand the fundamental trade-off between utility and privacy.

    • $\dots$

Summary

  • Today's AI systems are biased and violate users' privacy.

  • How do we make AI systems fair and privacy-preserving?

  • Many unanswered open questions and practical challenges.


Human Analysis Lab
VishnuBoddeti