Privacy and Security in AI
CSE 891: Deep Learning
Vishnu Boddeti
CIA Model: Confidentiality, Integrity and Availability
- Confidentiality: Protect sensitive information against unauthorized access.
- Integrity: Guarantee that the ML model has not been tampered with.
- Availability: Guarantee that ML systems are available to users when they need them.
Privacy Threats
Privacy Threats for Biometrics
Privacy Threats for Neural Networks
Problem: Information Leakage from Representations
- Learned Embeddings:
Enrollment
Authentication
Problem: Information Leakage from Representations
Attacks on Face templates
"Assessing Privacy Risks from Feature Vector Reconstruction Attacks," arXiv:2202.05760
Face reconstruction from template
"On the reconstruction of face images from deep face templates," TPAMI, 2018
Finger vein reconstruction from binary templates
"Inverse Biometrics: Reconstructing Grayscale Finger Vein Images from Binary Features," IJCB, 2020
Privacy Leakage From AR/VR
- Revealing Scenes by Inverting Structure from Motion Reconstructions, CVPR 2019
Problem: Membership Inference
Problem: Stealing Model Weights
Privacy and Security Solutions
Data Anonymization
Differential Privacy: Concept
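$$\Pr[M(d) \in S] \leq e^{\epsilon} \Pr[M(d') \in S]$$
- The bound must hold for every pair of neighboring datasets $d, d'$ (differing in one record) and every set of outputs $S$ of the mechanism $M$.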
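As a minimal sketch of one mechanism $M$ that satisfies this bound, the Laplace mechanism adds noise with scale sensitivity/$\epsilon$ to a numeric query. The function names, the counting query, the toy dataset, and the value of $\epsilon$ are illustrative assumptions, not taken from the lecture.

```python
import math
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(data, predicate, epsilon, rng):
    """Noisy count; a counting query changes by at most 1 per record."""
    true_count = sum(1 for x in data if predicate(x))
    return true_count + laplace_noise(1.0 / epsilon, rng)

def laplace_density(x, mu, scale):
    """Density of Laplace(mu, scale) at x."""
    return math.exp(-abs(x - mu) / scale) / (2.0 * scale)

rng = random.Random(0)
d = [23, 35, 41, 52, 67]        # hypothetical dataset d (ages)
d_prime = [23, 35, 41, 52]      # neighboring dataset d' (one record removed)
epsilon = 0.5

noisy = private_count(d, lambda age: age > 40, epsilon, rng)

# True counts are 3 on d and 2 on d'. At any output value the ratio of
# the two output densities stays below e^epsilon.
ratio = laplace_density(2.7, 3, 1 / epsilon) / laplace_density(2.7, 2, 1 / epsilon)
print(noisy, ratio, math.exp(epsilon))
```

Smaller $\epsilon$ means a larger noise scale, so the output distributions on $d$ and $d'$ become harder to tell apart at the cost of a less accurate count.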
Differential Privacy [Dwork et al., 2006]
Differential Privacy for Deep Learning
Private Learning: Differentially Private SGD
- Add noise to gradients to prevent reconstruction.
- Limitation: Trades accuracy for privacy.
DP-SGD
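The DP-SGD update is commonly described as: clip each per-example gradient to a fixed $\ell_2$ norm, sum, add Gaussian noise, then take a gradient step. The sketch below follows that recipe under stated assumptions: the function name, the linear-regression toy problem, and the hyperparameter values are hypothetical, and no privacy accounting is performed.

```python
import numpy as np

def dp_sgd_step(w, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One DP-SGD update from a (batch, dim) array of per-example gradients."""
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    # Rescale each gradient so its L2 norm is at most clip_norm.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale
    # Gaussian noise with std noise_multiplier * clip_norm on the summed gradient.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=w.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / len(per_example_grads)
    return w - lr * noisy_mean

# Toy problem (assumption): least-squares regression with 8 examples.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
y = X @ np.array([1.0, -2.0, 0.5])
w = np.zeros(3)

residual = X @ w - y
grads = residual[:, None] * X   # gradient of 0.5 * (x.w - y)^2 for each example

w_new = dp_sgd_step(w, grads, lr=0.1, clip_norm=1.0, noise_multiplier=1.1, rng=rng)
print(w_new)
```

Clipping bounds how much any single example can move the model, which is what lets the added noise mask an individual's contribution.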
DP for Federated Learning
- Distributed learning of parameters from private data.
- Clients download the current global model $\bar{\mathbf{w}}_t$.
- Each client updates the model using its local data.
- The aggregator updates the global model from the client updates.
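The three steps above can be sketched as a federated-averaging loop. This is a minimal sketch under assumptions: the names `client_update` and `aggregate`, the size-weighted average, and the noiseless linear-regression clients are all illustrative, and no differential-privacy noise is added here (a DP variant would typically clip and perturb the client updates before aggregation).

```python
import numpy as np

def client_update(w_global, X, y, lr=0.1, epochs=5):
    """Client side: start from the global model, run local gradient steps."""
    w = w_global.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def aggregate(client_weights, client_sizes):
    """Aggregator side: average client models, weighted by local dataset size."""
    sizes = np.asarray(client_sizes, dtype=float)
    return np.average(np.stack(client_weights), axis=0, weights=sizes)

# Hypothetical clients, each holding private data from the same linear model.
rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0])
clients = []
for n in (20, 50, 30):
    X = rng.normal(size=(n, 2))
    clients.append((X, X @ w_true))

w_global = np.zeros(2)
for t in range(50):
    local_models = [client_update(w_global, X, y) for X, y in clients]
    w_global = aggregate(local_models, [len(y) for _, y in clients])

print(w_global)
```

Only model parameters leave each client in this loop; the raw data `X, y` never does.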
Information Leakage from Gradients
K-Anonymity
- Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data, ICLR 2017
PATE: Private Aggregation of Teacher Ensemble
- Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data, ICLR 2017
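The aggregation step of PATE can be sketched as a noisy argmax over teacher votes: each teacher, trained on a disjoint partition of the private data, votes for a label, Laplace noise is added to the vote counts, and the top class is released to the student. The vote counts, the value of `gamma`, and the function names below are hypothetical.

```python
import random
from collections import Counter

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_aggregate(teacher_votes, num_classes, gamma, rng):
    """Return the class with the largest Laplace-noised vote count."""
    counts = Counter(teacher_votes)
    noisy = [counts.get(c, 0) + laplace_noise(1.0 / gamma, rng)
             for c in range(num_classes)]
    return max(range(num_classes), key=lambda c: noisy[c])

rng = random.Random(0)
# Hypothetical predictions of 100 teachers on one unlabeled query.
votes = [2] * 90 + [0] * 6 + [1] * 4
label = noisy_aggregate(votes, num_classes=3, gamma=0.5, rng=rng)
print(label)
```

When the teachers agree strongly, as in this example, the noise almost never changes the released label, so the privacy cost comes with little loss in label quality.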
Semantic Security and Encryption
Encryption: The Holy Grail?
- Data encryption is an attractive option
- protects user's privacy
- enables free and open sharing
- mitigate legal and ethical issues
- Encryption scheme needs to allow computations directly on the encrypted data.
- Solution: Homomorphic Encryption
Cryptography: Multi-Party Computation
Cryptography: Fully Homomorphic Encryption
RLWE: Ring Learning with Errors
| op | plaintext | ciphertext |
|---|---|---|
| | $x$ | $(x + e_1) \mbox{ mod } t$ |
| | $y$ | $(y + e_2) \mbox{ mod } t$ |
| $+$ | $x+y$ | $(x+y + e_3') \mbox{ mod } t$ |
| $\times$ | $x\times y$ | $(x\times y + e_4'') \mbox{ mod } t$ |
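To make the noise terms $e_i$ concrete, here is a toy, insecure, scalar LWE-style scheme (not RLWE itself, and not any production library). It shows only homomorphic addition and how the noise of the sum is the sum of the input noises. The moduli `Q` and `T`, the noise bound, and all function names are assumptions chosen for illustration; multiplication is omitted because it needs additional machinery.

```python
import random

Q = 1 << 20        # ciphertext modulus (toy value)
T = 16             # plaintext modulus t
DELTA = Q // T     # scaling factor that lifts the message above the noise

def keygen(rng):
    return rng.randrange(Q)

def encrypt(m, s, rng, noise_bound=4):
    """Ciphertext (a, b) with b = a*s + DELTA*m + e (mod Q), small noise e."""
    a = rng.randrange(Q)
    e = rng.randint(-noise_bound, noise_bound)
    return (a, (a * s + DELTA * m + e) % Q)

def phase(ct, s):
    a, b = ct
    return (b - a * s) % Q          # equals DELTA*m + e (mod Q)

def decrypt(ct, s):
    """Round the phase to the nearest multiple of DELTA to remove the noise."""
    return ((phase(ct, s) + DELTA // 2) // DELTA) % T

def noise(ct, s, m):
    """Signed noise left in ct, given the plaintext m it should hold."""
    v = (phase(ct, s) - DELTA * m) % Q
    return v - Q if v > Q // 2 else v

def add(ct1, ct2):
    """Homomorphic addition: add ciphertexts component-wise."""
    return ((ct1[0] + ct2[0]) % Q, (ct1[1] + ct2[1]) % Q)

rng = random.Random(0)
s = keygen(rng)
cx, cy = encrypt(3, s, rng), encrypt(5, s, rng)
cz = add(cx, cy)
print(decrypt(cz, s), noise(cx, s, 3), noise(cy, s, 5), noise(cz, s, 8))
```

Decryption stays correct only while the accumulated noise is smaller than `DELTA / 2`, which is why the number of operations that can be applied to a ciphertext is limited.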
Secure Learning: Federated Learning
Secure Feature Matching
- Boddeti, "Secure Face Matching Using Fully Homomorphic Encryption," BTAS 2018
- Engelsma, Jain, Boddeti, "HERS: Homomorphically Encrypted Representation Search," TBIOM 2022
Efficient Search with Dimensionality Reduction
- Build upon DeepMDS for dimensionality reduction.
- Sixue Gong, Vishnu Boddeti, Anil Jain, "On the Intrinsic Dimensionality of Image Representations," CVPR 2019
Scaling to 100 Million Gallery
- Engelsma, Jain, Boddeti, "HERS: Homomorphically Encrypted Representation Search," TBIOM 2022
HEFT: Homomorphically Encrypted Fusion of Biometric Templates
$\ell_2$-Normalization of Vector
$\hat{\mathbf{u}} = \frac{\mathbf{u}}{\|\mathbf{u}\|_2} \quad \rightarrow \quad$ division$\dagger$
where
$\|\mathbf{u}\|_2 = \sqrt{\sum_{i=1}^d u_i^2} \quad \rightarrow \quad$ square-root$\dagger$
- $\dagger$: problematic operations for FHE
Inverse Square Root: Polynomial Approximation
$$\frac{1}{\sqrt{x}} \approx \sum_{i=1}^6 a_i x^i$$
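A minimal plaintext sketch of this idea: fit a degree-6 polynomial to $1/\sqrt{x}$ on a bounded interval, evaluate it with additions and multiplications only (Horner's rule), and use it to $\ell_2$-normalize a vector. The interval $[0.5, 2]$, the least-squares fit, the inclusion of a constant term $a_0$, and the function names are assumptions; the coefficients used in HEFT are not given in the source.

```python
import numpy as np
from numpy.polynomial import polynomial as P

lo, hi = 0.5, 2.0                       # assumed range of the squared norm
xs = np.linspace(lo, hi, 400)
coeffs = P.polyfit(xs, 1.0 / np.sqrt(xs), deg=6)   # a_0..a_6, lowest degree first

def inv_sqrt_poly(x):
    """Horner evaluation: uses only additions and multiplications."""
    acc = np.zeros_like(x, dtype=float) + coeffs[-1]
    for a in coeffs[-2::-1]:
        acc = acc * x + a
    return acc

def l2_normalize(u):
    """Normalize u without division or square root."""
    sq_norm = np.sum(u * u)
    return u * inv_sqrt_poly(sq_norm)

u = np.array([0.6, 0.8, 0.3])           # squared norm 1.09, inside [lo, hi]
u_hat = l2_normalize(u)
max_err = np.max(np.abs(inv_sqrt_poly(xs) - 1.0 / np.sqrt(xs)))
print(np.linalg.norm(u_hat), max_err)
```

The approximation is only valid inside the fitted interval, so inputs would need to be scaled so that their squared norm falls within it.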
Secure Inference of Deep Neural Networks
Problem: Non-arithmetic operations are either costly or not supported by cryptographic schemes.
- Secure MPC supports non-arithmetic operations, but is very slow.
- Fully Homomorphic Encryption only supports addition and multiplication.
- Solution: Replace non-arithmetic operations like ReLU, MaxPool by polynomial approximations.
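As an illustration of the polynomial replacement, the sketch below fits a degree-4 polynomial to ReLU on $[-1, 1]$ by least squares and measures the worst-case error. The degree, the interval, and the fitting method are assumptions for illustration and are not the approximations used in the works cited in this lecture.

```python
import numpy as np
from numpy.polynomial import polynomial as P

xs = np.linspace(-1.0, 1.0, 401)        # assumed input range of the activation
relu = np.maximum(xs, 0.0)
coeffs = P.polyfit(xs, relu, deg=4)     # lowest degree first

def poly_relu(x):
    """Polynomial stand-in for ReLU; needs only additions and multiplications."""
    return P.polyval(x, coeffs)

max_err = np.max(np.abs(poly_relu(xs) - relu))
print(coeffs, max_err)
```

Higher degrees reduce the error but consume more multiplicative depth under FHE, so the degree is a trade-off between accuracy and cost.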
Fully Homomorphic Encryption (CKKS)
Secure Multi-Party Computation (CrypTen)
AutoFHE: Automated Adaption of CNNs for Efficient Evaluation over FHE (Under Review at ICLR 2023)
Summary
- Privacy and security are nuanced and challenging issues with many open problems.
- There are many possible paths, including differential privacy and semantic security through encryption.
- Real-world solutions need both privacy and security.
- There is a dire need for a ground-up rethinking and redesign of deep learning systems for privacy and security.