Privacy and Security in AI

CSE 891: Deep Learning

Vishnu Boddeti

Privacy for AI

CIA Model: Confidentiality, Integrity and Availability

Confidentiality: Protect sensitive information against unauthorized access.
Integrity: Guarantee that the ML model has not been tampered with.
Availability: Guarantee that ML systems are available to users when they need them.

Privacy Threats

Privacy Threats for Biometrics

Privacy Threats for Neural Networks

Problem: Information Leakage from Representations

Learned Embeddings:

Enrollment

Authentication

Attacks on Face templates

"Assessing Privacy Risks from Feature Vector Reconstruction Attacks," arXiv:2202.05760

Face reconstruction from template

"On the reconstruction of face images from deep face templates," TPAMI, 2018

Finger vein reconstruction from binary templates

"Inverse Biometrics: Reconstructing Grayscale Finger Vein Images from Binary Features," IJCB, 2020

Privacy Leakage From AR/VR

Revealing Scenes by Inverting Structure from Motion Reconstructions, CVPR 2019

Problem: Membership Inference

Problem: Stealing Model Weights

Tools at Hand

Privacy and Security Solutions

Anonymization

Data Anonymization

Differential Privacy

Differential Privacy: Concept

$$Pr[M(d) \in S] \leq e^{\epsilon}Pr[M(d') \in S]$$

Differential Privacy [Dwork et al., 2006]
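The inequality above can be made concrete with the Laplace mechanism, one standard way to obtain an $\epsilon$-differentially private $M$. The following is a minimal sketch, assuming a counting query with sensitivity 1 and two neighbouring datasets $d$ and $d'$ that differ in one record; the data, the helper names, and the choice $\epsilon = 0.5$ are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Release true_value plus Laplace noise of scale sensitivity / epsilon."""
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

def laplace_density(x, center, scale):
    """Density of the mechanism's output when the true answer is `center`."""
    return np.exp(-np.abs(x - center) / scale) / (2.0 * scale)

rng = np.random.default_rng(0)
ages_d = np.array([34, 45, 29, 61, 50])      # dataset d (hypothetical)
ages_d_prime = np.array([34, 45, 29, 61])    # neighbour d': one record removed

epsilon = 0.5
count_d = int(np.sum(ages_d > 40))               # counting query, sensitivity 1
count_d_prime = int(np.sum(ages_d_prime > 40))

noisy_count = laplace_mechanism(count_d, 1.0, epsilon, rng)

# Numerically check the DP inequality on a grid of possible outputs:
# the density ratio between d and d' never exceeds e^epsilon.
grid = np.linspace(-20.0, 20.0, 2001)
ratio = (laplace_density(grid, count_d, 1.0 / epsilon)
         / laplace_density(grid, count_d_prime, 1.0 / epsilon))
max_ratio = float(ratio.max())
min_ratio = float(ratio.min())
```

Smaller $\epsilon$ means a larger noise scale, so the two output distributions become harder to tell apart at the cost of a less accurate released count.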

Differential Privacy for Deep Learning

Private Learning: Differentially Private SGD

Add noise to gradients to prevent reconstruction.
Limitation: Trades accuracy for privacy.

DP-SGD
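A single DP-SGD update can be sketched as follows, assuming the commonly described recipe of clipping each per-example gradient and then adding Gaussian noise before averaging. The function name, the parameters `clip_norm` and `noise_multiplier`, and the linear-regression data are assumptions made for illustration; privacy accounting is omitted.

```python
import numpy as np

def dp_sgd_step(w, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One noisy update: clip each example's gradient, sum, add noise, average."""
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    # Scale down any gradient whose L2 norm exceeds clip_norm.
    factors = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * factors
    # Gaussian noise calibrated to the clipping bound.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=w.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / len(per_example_grads)
    return w - lr * noisy_mean, clipped

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))                   # hypothetical mini-batch
y = X @ np.array([1.0, -2.0, 0.5])
w = np.zeros(3)

# Per-example gradients of the squared loss 0.5 * (x.w - y)^2.
residual = X @ w - y
grads = residual[:, None] * X

w_new, clipped = dp_sgd_step(w, grads, lr=0.1, clip_norm=1.0,
                             noise_multiplier=1.1, rng=rng)
```

Clipping bounds how much any single example can move the model, which is what lets the added noise mask that example's contribution; a larger `noise_multiplier` gives more privacy and less accuracy.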

DP for Federated Learning

Distributed learning of parameters from private data.
Clients download the current global model $\bar{\mathbf{w}}_t$.

Each client updates the model using its local data.
The aggregator updates the global model from the client updates.
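The download, local update, and aggregation steps can be sketched as a federated averaging loop. This is a minimal illustration assuming three simulated clients with noiseless linear-regression data and a size-weighted average at the aggregator; all names and hyperparameters are hypothetical. No differential privacy is applied here; a DP variant would additionally clip and add noise to the client updates.

```python
import numpy as np

def local_update(w_global, X, y, lr=0.1, steps=5):
    """Client side: start from the global model, run a few local GD steps."""
    w = w_global.copy()
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def aggregate(client_weights, client_sizes):
    """Aggregator side: average client models, weighted by local dataset size."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)
    return (stacked * sizes[:, None]).sum(axis=0) / sizes.sum()

rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0])
clients = []
for n in (20, 50, 30):                        # private data never leaves a client
    X = rng.normal(size=(n, 2))
    clients.append((X, X @ w_true))

w_global = np.zeros(2)
for round_idx in range(100):
    updates = [local_update(w_global, X, y) for X, y in clients]
    w_global = aggregate(updates, [len(y) for _, y in clients])
```

Only model parameters travel between clients and the aggregator, never the raw data.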

Information Leakage from Gradients

See through Gradients: Image Batch Recovery via GradInversion, CVPR 2021
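Why gradients leak information can be seen in the simplest possible case: a single linear layer with a bias, trained on one example. This sketch is not the GradInversion method cited above, which recovers image batches from deep-network gradients by optimization; it only shows a closed-form special case. The layer sizes and the squared loss are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=5)            # private input held by the client
W = rng.normal(size=(3, 5))
b = rng.normal(size=3)
target = rng.normal(size=3)

# Forward pass y = W x + b with loss 0.5 * ||y - target||^2.
y = W @ x + b
delta = y - target                # dL/dy

# Gradients the client would share with the aggregator.
grad_W = np.outer(delta, x)       # dL/dW = delta x^T
grad_b = delta                    # dL/db = delta

# The attacker sees only grad_W and grad_b. Every row of grad_W is the
# input scaled by the matching entry of grad_b, so one division recovers x.
row = int(np.argmax(np.abs(grad_b)))
x_recovered = grad_W[row] / grad_b[row]
```

With larger batches and deeper networks the recovery is no longer a single division, but the gradients still carry information about the inputs.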

k-Anonymity

Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data, ICLR 2017

PATE: Private Aggregation of Teacher Ensemble

Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data, ICLR 2017
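The aggregation step of PATE can be sketched as a noisy vote: each teacher, trained on a disjoint partition of the private data, predicts a label, Laplace noise is added to the per-class vote counts, and the class with the largest noisy count is released to the student. The vote vector, the number of teachers, and the value of `gamma` below are illustrative assumptions.

```python
import numpy as np

def noisy_aggregate(teacher_votes, num_classes, gamma, rng):
    """Return the argmax of vote counts perturbed by Laplace(1 / gamma) noise."""
    counts = np.bincount(teacher_votes, minlength=num_classes).astype(float)
    noisy_counts = counts + rng.laplace(0.0, 1.0 / gamma, size=num_classes)
    return int(np.argmax(noisy_counts)), counts

rng = np.random.default_rng(0)
# Hypothetical predictions of 100 teachers for one unlabeled query.
votes = np.array([2] * 80 + [0] * 12 + [1] * 8)

label, counts = noisy_aggregate(votes, num_classes=3, gamma=0.5, rng=rng)
```

When the teachers agree strongly, as here, the noise almost never changes the winning class, so the released label reveals little about any single teacher's training partition.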

Semantic Security and Encryption

Encryption: The Holy Grail?

Data encryption is an attractive option:

protects users' privacy
enables free and open sharing
mitigates legal and ethical issues

The encryption scheme needs to allow computations directly on the encrypted data.

Solution: Homomorphic Encryption

Cryptography: Multi-Party Computation

Cryptography: Fully Homomorphic Encryption

RLWE: Ring Learning with Errors

op	plaintext	ciphertext
	$x$	$(x + e_1) \mbox{ mod } t$
	$y$	$(y + e_2) \mbox{ mod } t$
$+$	$x+y$	$(x+y + e_3') \mbox{ mod } t$
$\times$	$x\times y$	$(x\times y + e_4'') \mbox{ mod } t$
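The idea in the table, that ciphertexts carry small noise terms which combine under homomorphic operations, can be illustrated with a toy scheme. The sketch below is a scalar LWE-style symmetric scheme, not RLWE and not secure; the moduli `Q` and `T`, the noise bound, and all function names are assumptions chosen for readability. Only addition is shown; multiplication requires extra machinery that is omitted.

```python
import random

Q = 2**20           # ciphertext modulus (toy size, not secure)
T = 16              # plaintext modulus
DELTA = Q // T      # scaling factor that lifts the message above the noise

def keygen(rng):
    return rng.randrange(Q)

def encrypt(m, s, rng, noise_bound=4):
    """Ciphertext (a, b) with b = a*s + DELTA*m + e (mod Q), e small."""
    a = rng.randrange(Q)
    e = rng.randint(-noise_bound, noise_bound)
    b = (a * s + DELTA * (m % T) + e) % Q
    return a, b

def decrypt(ct, s):
    """Remove the mask a*s, then round away the noise."""
    a, b = ct
    noisy = (b - a * s) % Q                  # DELTA*m + e (mod Q)
    return ((noisy + DELTA // 2) // DELTA) % T

def add(ct1, ct2):
    """Homomorphic addition: messages add, and so do the noise terms."""
    return (ct1[0] + ct2[0]) % Q, (ct1[1] + ct2[1]) % Q

rng = random.Random(0)
s = keygen(rng)
cx, cy = encrypt(3, s, rng), encrypt(5, s, rng)
total = decrypt(add(cx, cy), s)              # 3 + 5 computed under encryption
wrapped = decrypt(add(encrypt(9, s, rng), encrypt(12, s, rng)), s)  # (9 + 12) mod 16
```

Decryption stays correct only while the accumulated noise remains below `DELTA / 2`, which is why noise growth limits how many operations can be chained.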

Secure Learning: Federated Learning

Secure Feature Matching

Boddeti, "Secure Face Matching Using Fully Homomorphic Encryption," BTAS 2018
Engelsma, Jain, Boddeti, "HERS: Homomorphically Encrypted Representation Search," TBIOM 2022

Efficient Search with Dimensionality Reduction

Builds upon DeepMDS for dimensionality reduction.

Sixue Gong, Vishnu Boddeti, Anil Jain, "On the Intrinsic Dimensionality of Image Representations," CVPR 2019

Scaling to 100 Million Gallery

Engelsma, Jain, Boddeti, "HERS: Homomorphically Encrypted Representation Search," TBIOM 2022

HEFT: Homomorphically Encrypted Fusion of Biometric Templates

$\ell_2$-Normalization of Vector

$\hat{\mathbf{u}} = \frac{\mathbf{u}}{\|\mathbf{u}\|_2} \quad \rightarrow \quad \text{division}^{\dagger}$

where

$\|\mathbf{u}\|_2 = \sqrt{\sum_{i=1}^d u_i^2} \quad \rightarrow \quad \text{square-root}^{\dagger}$

$^{\dagger}$: problematic operations for FHE

Inverse Square Root: Polynomial Approximation

$$\frac{1}{\sqrt{x}} \approx \sum_{i=1}^6 a_i x^i$$
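A polynomial stand-in for the inverse square root lets the normalization be computed with additions and multiplications only. The sketch below fits a degree-6 least-squares polynomial on the interval $[0.5, 2]$ and evaluates it with Horner's rule on plaintext values. The interval, the fitting procedure, and the inclusion of a constant term $a_0$ (which the formula above does not have) are assumptions of this sketch; how the coefficients $a_i$ are actually obtained is not stated in the slides.

```python
import numpy as np
from numpy.polynomial import Polynomial

LOW, HIGH = 0.5, 2.0                           # assumed range of ||u||^2
xs = np.linspace(LOW, HIGH, 400)
fit = Polynomial.fit(xs, 1.0 / np.sqrt(xs), deg=6)
coefs = fit.convert().coef                     # a_0 .. a_6 in the plain x^i basis

def inv_sqrt_poly(x):
    """Evaluate the polynomial with Horner's rule: only adds and multiplies."""
    acc = 0.0
    for c in reversed(coefs):
        acc = acc * x + c
    return acc

def normalize(u):
    """Approximate u / ||u||_2 without division or square root."""
    sq_norm = float(np.dot(u, u))              # sum of squares: adds and multiplies
    return u * inv_sqrt_poly(sq_norm)

u = np.array([0.6, 0.8, 0.5])                  # squared norm 1.25, inside [LOW, HIGH]
u_hat = normalize(u)
max_fit_error = float(np.max(np.abs(inv_sqrt_poly(xs) - 1.0 / np.sqrt(xs))))
```

The approximation is only meaningful inside the fitted interval, so inputs whose squared norm falls outside it would need to be rescaled first.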

Secure Inference of Deep Neural Networks

Problem: Non-arithmetic operations are either costly or not supported by cryptographic schemes.

Secure MPC supports non-arithmetic operations, but is very slow.
Fully Homomorphic Encryption only supports addition and multiplication.

Solution: Replace non-arithmetic operations like ReLU, MaxPool by polynomial approximations.
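Replacing ReLU by a polynomial can be sketched with a least-squares fit on a bounded input range. The range $[-4, 4]$, the degrees compared, and the fitting method are assumptions chosen for illustration; the slides do not say how the approximations are constructed.

```python
import numpy as np
from numpy.polynomial import Polynomial

def relu(x):
    return np.maximum(x, 0.0)

xs = np.linspace(-4.0, 4.0, 801)               # assumed activation range
mse = {}
max_err = {}
for degree in (2, 4, 8):
    poly = Polynomial.fit(xs, relu(xs), degree)
    err = poly(xs) - relu(xs)
    mse[degree] = float(np.mean(err ** 2))     # average squared error on the range
    max_err[degree] = float(np.max(np.abs(err)))
```

Higher degrees track ReLU more closely but cost more multiplications, and multiplicative depth is the expensive resource under FHE, so the choice of degree is a trade-off between accuracy and latency.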

Fully Homomorphic Encryption (CKKS)

Secure Multi-Party Computation (CrypTen)

AutoFHE: Automated Adaption of CNNs for Efficient Evaluation over FHE (Under Review at ICLR 2023)

Summary

Privacy and security are nuanced and challenging issues with many open problems.
There are many possible paths, including differential privacy and semantic security through encryption.
Real-world solutions need both privacy and security.
There is a dire need for ground-up rethinking and redesigning deep learning systems for privacy and security.