Composing concepts amplifies stereotypes.
High Predictive Accuracy, but Not Robust
Can synthetic image detectors tell these images apart?
Unsteered generation
Detector correctly flags as synthetic
PolyJuice-steered generation
Detector fooled
Attack success rate against state-of-the-art synthetic image detectors
Without steering (averaged across models): ~55%
→
With PolyJuice steering: ~91% (+36 points)
Simple latent-space steering defeats deployed detectors.
How Are These Challenges Addressed Today?
$$\min_\theta \; \mathcal{L}_{\text{task}}(\theta)
\;+\; \lambda \cdot \mathcal{L}_{\text{constraint}}(\theta)$$
Pick a proxy loss. Pick a $\lambda$. Run gradient descent. Report a single number.
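As a minimal sketch of this recipe, assume a toy one-parameter problem. The two quadratic losses below are hypothetical stand-ins chosen for illustration, not losses from any cited work. Running gradient descent on the penalized objective for several values of $\lambda$ shows that each $\lambda$ yields only one operating point, so a single reported number says little about the underlying trade-off.

```python
# Toy illustration of the penalized objective
#   min_theta  L_task(theta) + lam * L_constraint(theta)
# Both losses are hypothetical quadratics, chosen so the optimum is known.

def task_loss(theta):
    # Hypothetical task loss: best at theta = 1.
    return (theta - 1.0) ** 2


def constraint_loss(theta):
    # Hypothetical constraint violation: best at theta = 0.
    return theta ** 2


def penalized_gd(lam, steps=500, lr=0.05):
    """Plain gradient descent on task_loss + lam * constraint_loss."""
    theta = 0.0
    for _ in range(steps):
        grad = 2.0 * (theta - 1.0) + lam * 2.0 * theta
        theta -= lr * grad
    return theta


if __name__ == "__main__":
    # Each lambda gives one (task, constraint) point on the trade-off curve.
    for lam in [0.0, 0.5, 1.0, 2.0, 5.0]:
        theta = penalized_gd(lam)
        print(lam, round(task_loss(theta), 4), round(constraint_loss(theta), 4))
```

For this toy problem the minimizer is $\theta^\star = 1/(1+\lambda)$, so sweeping $\lambda$ moves the solution continuously between the two extremes.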
- Is there a fundamental trade-off?
- What is the best any method can achieve?
- How far are current methods from what is achievable?
- Can we close the gap?
My Research Vision
My research answers these questions.
Characterize fundamental limits: prove the trade-off is inherent, not a limitation of current methods.
Map the frontier: measure where existing methods stand relative to what is achievable.
Close the gap: build systems that approach the limits through principled constraints and optimizers.
My Approach to Building Trustworthy AI Systems
A common methodology across fairness, privacy, robustness, and controllability:
Diagnose: Where do models violate constraints?
→
Characterize: What are the fundamental limits any method must obey?
→
Achieve: Design systems that approach these limits.
Selected Contributions
Fairness & Controllability
- OASIS ICLR 2025 Spotlight
- PolyJuice NeurIPS 2025
- Obliviator NeurIPS 2025
- CoInD ICLR 2025
- DiverseFlow CVPR 2025
- FairerCLIP ICLR 2024
- Utility-Fairness Trade-Offs CVPR 2024
- Dataset Scaling FAccT 2024
- LAION's Den NeurIPS 2023
- Invariant Representations TMLR 2022 Featured
- Kernelized ARL ICCV 2019
- Info Leakage CVPR 2019 Oral
Cryptographic Privacy
- Book Chapter Springer 2026
- CryptoFace CVPR 2025
- SecureRAG NeurIPS WS 2025
- HE Template Fusion TBIOM 2025
- Shielding Face Repr. FG 2025
- AutoFHE USENIX Security 2024
- FHE Face Analytics FG 2024
- FHE Score Fusion WIFS 2023
- HEFT IJCB 2022 Best Paper
- HERS TBIOM 2022 Best Paper
- Secure Face Matching BTAS 2018
- Privacy-Preserving VL ICCV 2017
Physics-Informed AI
- Mechanics-Informed AE Nat. Comms 2024 Editor's Highlight
- HADAR Nature 2023 Cover
- Boundary Detection SMASIS 2022 Best Paper
Other Work
- SEAL CVPR 2025 Oral
- Gen. Zero-Shot CIR CVPR 2025
- Symbolic Algorithms IROS 2023 Best Paper Finalist
- Transmission-Friendly CNNs TMC 2023 Best Paper
- Neural Arch. Transfer TPAMI 2021
- NSGANetV2 ECCV 2020 Oral
- NSGA-Net GECCO 2019 Best Paper
Today's Talk
- Fairness & Controllability: Optimal trade-offs and concept erasure
- Privacy: Encrypted inference via homomorphic encryption
- Other Work: Physics-informed AI for engineering
- Vision: Joint constraints and composed systems
Part 1: Fairness & Controllability
What are the optimal trade-offs
for bias mitigation and concept erasure?
From Fair Learning to Fair Representation Learning
$Z \perp \!\!\! \perp S \Rightarrow \hat{Y} \perp \!\!\! \perp S$
Learn a representation $\mathbf{z}$ that retains target information $Y$ while removing the sensitive attribute $S$.
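The implication $Z \perp \!\!\! \perp S \Rightarrow \hat{Y} \perp \!\!\! \perp S$ follows in one step if the predictor is a deterministic function of the representation, $\hat{Y} = g(Z)$. The symbol $g$ and its measurability are assumptions introduced here for illustration. For any measurable sets $A$ and $B$:

$$\begin{aligned}
P(\hat{Y} \in A,\; S \in B) &= P\big(Z \in g^{-1}(A),\; S \in B\big) \\
&= P\big(Z \in g^{-1}(A)\big)\, P(S \in B) \\
&= P(\hat{Y} \in A)\, P(S \in B)
\end{aligned}$$

Under this assumption, any predictor built on an invariant representation inherits the invariance, which is why the constraint can be placed on $Z$ rather than on each downstream predictor.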
Statistical Dependence Formulation
- Bi-Objective Optimization Problem:
- Encoder extracts features $\mathbf{z}$
- Statistical dependence between target task and features $\mathbf{z}$
- Statistical dependence between sensitive attribute and features $\mathbf{z}$
$$\max_{\mathbf{\Theta}_E} \; \underbrace{\color{cyan}{\text{Dep}(Z,Y)}}_{\color{cyan}{\text{target dependence}}} \quad \text{s.t.} \quad \underbrace{\color{orange}{\text{Dep}(Z,S)}}_{\color{orange}{\text{sensitive dependence}}} \leq \alpha$$
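The slides leave $\text{Dep}(\cdot,\cdot)$ abstract. As a hedged sketch, one common kernel-based choice is the Hilbert-Schmidt Independence Criterion (HSIC). The biased empirical estimator below, with an RBF kernel and a fixed bandwidth `sigma`, is an assumption for illustration and not necessarily the measure used in the cited papers.

```python
import numpy as np


def rbf_gram(x, sigma=1.0):
    """RBF (Gaussian) Gram matrix for n samples; 1-D inputs are treated as n x 1."""
    x = np.asarray(x, dtype=float)
    x = x.reshape(len(x), -1)
    sq = np.sum(x ** 2, axis=1)
    # Pairwise squared distances, clipped at 0 to guard against round-off.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (x @ x.T), 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def hsic(a, b, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2."""
    n = len(a)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    K = rbf_gram(a, sigma)
    L = rbf_gram(b, sigma)
    return float(np.trace(K @ H @ L @ H)) / (n - 1) ** 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    s = rng.normal(size=200)                  # sensitive attribute
    z_leaky = s + 0.1 * rng.normal(size=200)  # representation that leaks S
    z_clean = rng.normal(size=200)            # representation independent of S
    print(hsic(z_leaky, s), hsic(z_clean, s))
```

In the constrained problem above, a quantity like `hsic(z, s)` would play the role of the sensitive dependence bounded by $\alpha$, and `hsic(z, y)` the role of the target dependence being maximized.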
Making Bias Mitigation Near-Optimal
- Standard Adversarial Representation Learning: $E(\hat{Y}) \perp \!\!\! \perp S$
- Universal Dependence Measure: $Z \perp \!\!\! \perp S$ [TMLR 2022]
- End-to-End Universal Dependence Measure: $Z \perp \!\!\! \perp S$ [CVPR 2024]
Characterize: What is the Optimal Trade-Off?
- Data-Space Trade-Off (DST):
$$\max_{f \in \mathcal{H}_{X}} \; \underbrace{\color{cyan}{\text{Dep}(f(X),Y)}}_{\color{cyan}{\text{target dependence}}} \; - \; \lambda \, \underbrace{\color{orange}{\text{Dep}(f(X),S)}}_{\color{orange}{\text{sensitive dependence}}}$$
- Label-Space Trade-Off (LST):
$$\max_{Z \in L^2} \; \underbrace{\color{cyan}{\text{Dep}(Z,Y)}}_{\color{cyan}{\text{target dependence}}} \; - \; \lambda \, \underbrace{\color{orange}{\text{Dep}(Z,S)}}_{\color{orange}{\text{sensitive dependence}}}$$
- Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022 (Outstanding Certification Finalist)
How to Estimate These Trade-Offs?
U-FaTE (Utility-Fairness Trade-Off Estimator)
- Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024
Utility-Fairness Trade-Offs
Folktables
- Sadeghi, Dehdashtian, Boddeti, "On Characterizing the Trade-off in Invariant Representation Learning," TMLR 2022
- Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024
Map: Where Do Existing Models Stand?
- $Y$: high cheekbones and $S$: age and sex
Over 1,000 supervised models: most lie far from the optimal trade-off.
- Sadeghi, Dehdashtian, Boddeti, TMLR 2022; Dehdashtian, Sadeghi, Boddeti, CVPR 2024
Foundation Models Are No Better
- $Y$: high cheekbones and $S$: age and sex
Over 100 zero-shot CLIP models: same gap from optimal.
- Dehdashtian, Sadeghi, Boddeti, "Utility-Fairness Trade-Offs and How to Find Them," CVPR 2024
Bias in CLIP's Zero-Shot Prediction
- Dehdashtian*, Wang*, Boddeti, "FairerCLIP: Debiasing CLIP's Zero-Shot Predictions using Functions in RKHSs," ICLR 2024
Achieve: FairerCLIP - Debiasing Foundation Models
- Dehdashtian*, Wang*, Boddeti, "FairerCLIP: Debiasing CLIP's Zero-Shot Predictions using Functions in RKHSs," ICLR 2024
From fairness to controllability
What is the cost of erasing
a concept from a model?
Obliviator: The Cost of Nonlinear Guardedness in Concept Erasure
Formalize erasure as statistical independence:
$\text{Dep}(Z_\theta, S) = 0 \iff Z_\theta \perp \!\!\! \perp S$
$$\inf_\theta\quad\underbrace{\textrm{Dep}(Z_\theta,S)}_{\text{Minimize Statistical Dependency}}-\underbrace{\textrm{Dep}(Z_\theta,Y)}_{\text{A Proxy to Preserve Utility Information}}$$
Obliviator: The Cost of Nonlinear Concept Erasure
Prior methods only achieve linear guardedness.
Obliviator achieves full nonlinear guardedness.
Single-Stage Optimization Fails; Iterative Erasure Traces the Frontier
Single-stage optimization
yields scattered, suboptimal solutions.
Iterative erasure traces the
full Pareto frontier between utility and erasure.
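To make "tracing the Pareto frontier" concrete, here is a minimal sketch of extracting the non-dominated points from a set of (utility, leakage) results. The function name and the example numbers are hypothetical. It assumes higher utility is better and lower leakage of the erased concept is better.

```python
def pareto_front(points):
    """Return the non-dominated (utility, leakage) pairs.

    A point is dominated if another point has utility at least as high and
    leakage at least as low, with at least one of the two strictly better.
    """
    front = []
    # Sort by leakage ascending; break ties by utility descending.
    for utility, leakage in sorted(points, key=lambda p: (p[1], -p[0])):
        # Keep a point only if it improves on the best utility seen so far.
        if not front or utility > front[-1][0]:
            front.append((utility, leakage))
    return front


if __name__ == "__main__":
    # Hypothetical results, e.g. from runs with different hyperparameters.
    runs = [(0.9, 0.5), (0.8, 0.2), (0.7, 0.3), (0.6, 0.1), (0.9, 0.6)]
    print(pareto_front(runs))
```

Scattered single-stage solutions would mostly fall off this front, while a method that traces the frontier contributes a point at each level of leakage.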
Obliviator Reveals: Nonlinear Erasure Has an Unavoidable Cost
Representation: Frozen
Representation: Fine-tuned
Two Pareto frontiers (supervised/unsupervised) mirror the
DST/LST structure from fairness.
Erasure has an unavoidable cost, and access to task labels
significantly shifts the achievable frontier.
Mitigating Stereotypes in Generative Models
DiverseFlow: Sample-efficient diverse mode coverage in flow models.
Part 2: Cryptographic Privacy
Can we run AI on data that
remains encrypted throughout?
The blind spot of traditional encryption
Privacy of user data is not guaranteed.
FHE can help AI models achieve trustworthiness
FHE enables AI models to process
encrypted data without decryption.
FHE Inverts the Cost Hierarchy of Computation
Plaintext inference:
- All operations roughly equal cost
- Nonlinearities are cheap
- Go deep: more layers = more accuracy
Encrypted (FHE) inference:
- Additions much faster than multiplications
- Bootstrapping: dominates latency
- Rotations: highest overhead
Standard AI architectures are not compatible with the FHE cost model.
How to Adapt Neural Networks for FHE
- Supported one-dimensional operations under FHE:
- Multiplication
- Addition
- Rotation
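As a plaintext sketch of how these three operations compose, assume a ciphertext packs a vector into slots (SIMD-style packing, as in schemes such as CKKS). The code below simulates a packed inner product using only slot-wise multiplication, slot-wise addition, and cyclic rotation. Nothing here is encrypted: `np.roll` merely stands in for a homomorphic rotation, and the function names are hypothetical.

```python
import numpy as np


def rotate(v, k):
    """Cyclic left rotation of the slot vector by k positions."""
    return np.roll(v, -k)


def slot_sum(v):
    """Sum all slots with log2(n) rotations and additions (n a power of two).

    After the loop, every slot holds the total.
    """
    n = len(v)
    assert n > 0 and n & (n - 1) == 0, "slot count must be a power of two"
    acc = v.copy()
    step = 1
    while step < n:
        acc = acc + rotate(acc, step)  # one rotation + one addition
        step *= 2
    return acc


def packed_dot(a, b):
    """Inner product: one slot-wise multiplication, then rotate-and-sum."""
    return slot_sum(a * b)


if __name__ == "__main__":
    a = np.arange(8, dtype=float)
    b = np.full(8, 2.0)
    print(packed_dot(a, b))  # every slot holds the inner product
```

The rotate-and-sum pattern needs only a logarithmic number of rotations, which matters when rotations are among the expensive operations.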
ReLU(x)=max(x,0)
Approximate
$\mathcal{F}(x)=(f_k^{d_k}\circ f_{k-1}^{d_{k-1}}\circ \cdots \circ f_1^{d_1})(x)$
Polynomial approximation for non-linear activations
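As a hedged illustration of the composite form, one well-known construction iterates the low-degree polynomial $f(x) = (3x - x^3)/2$, which converges to $\text{sign}(x)$ for nonzero $x$ in $[-1, 1]$, and then uses $\text{ReLU}(x) = x\,(1 + \text{sign}(x))/2$. This corresponds to $k = 1$ and $d_1 = d$ in the formula above. It is not necessarily the family of polynomials used in the work presented here, and inputs are assumed to be pre-scaled to $[-1, 1]$.

```python
import numpy as np


def f(x):
    """One low-degree step using only additions and multiplications."""
    return 1.5 * x - 0.5 * x * x * x


def approx_sign(x, d=8):
    """Compose f with itself d times; approaches sign(x) on [-1, 1] \\ {0}."""
    y = x
    for _ in range(d):
        y = f(y)
    return y


def approx_relu(x, d=8):
    """ReLU(x) = x * (1 + sign(x)) / 2 with the polynomial sign above."""
    return 0.5 * x * (1.0 + approx_sign(x, d))


if __name__ == "__main__":
    xs = np.linspace(-1.0, 1.0, 2001)
    err = np.max(np.abs(approx_relu(xs) - np.maximum(xs, 0.0)))
    print("max abs error on [-1, 1]:", err)
```

Increasing `d` tightens the approximation near zero but adds multiplicative depth, which is the accuracy-latency tension that motivates searching over such choices.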
Space of Homomorphic Neural Architectures
How can we effectively trade off accuracy against latency?
AutoFHE: Network-Level Co-Design for Encrypted Inference
Key insight:
Approximate the end-to-end function via per-layer polynomials, rather than approximating each activation in isolation.
Characterize: Accuracy-Latency Pareto Frontier Under FHE
AutoFHE discovers architectures spanning the
accuracy-latency Pareto frontier.
2x - 103x faster than prior methods.
Achieve: CryptoFace - FHE-Native Deep Networks
AutoFHE adapts existing architectures.
Can we do better by designing architectures
natively for encryption?
- Shallow, parallel patch-based design
- Minimizes multiplicative depth
- Exploits parallelism across ciphertext slots
- Near-constant latency across resolutions
CryptoFace: Comparable Accuracy, 8x Lower Latency
| Approach | Resolution | Network | Bootstraps | Avg Accuracy | Latency (s) |
| --- | --- | --- | --- | --- | --- |
| MPCNN | 64x64 | ResNet44 | 43 | 89.64 | 1,640 |
| AutoFHE | 64x64 | ResNet32 | 8 | 82.69 | 667 |
| CryptoFace | 64x64 | CryptoFaceNet4 | 2 | 89.42 | 220 |
| CryptoFace | 128x128 | CryptoFaceNet16 | 2 | 91.46 | 241 |
7.5x speedup over MPCNN at 64x64 (1,640 s → 220 s, roughly 27 min → 3.7 min) with comparable accuracy (89.64 → 89.42).
Near-constant latency across resolutions (64x64 → 128x128).
Selected for the homomorphicencryption.org benchmark suite (Google, Amazon, Intel).
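The headline ratios can be checked directly from the latency column of the table. The short sketch below only redoes that arithmetic; the dictionary layout and the function name are illustrative.

```python
# Latencies in seconds, copied from the table above.
latency_s = {
    ("MPCNN", "64x64"): 1640,
    ("AutoFHE", "64x64"): 667,
    ("CryptoFace", "64x64"): 220,
    ("CryptoFace", "128x128"): 241,
}


def speedup(baseline, candidate):
    """How many times faster `candidate` is than `baseline`."""
    return latency_s[baseline] / latency_s[candidate]


if __name__ == "__main__":
    cf64 = ("CryptoFace", "64x64")
    cf128 = ("CryptoFace", "128x128")
    print("vs MPCNN:  ", round(speedup(("MPCNN", "64x64"), cf64), 2))
    print("vs AutoFHE:", round(speedup(("AutoFHE", "64x64"), cf64), 2))
    # Latency growth when the input goes from 64x64 to 128x128 (4x the pixels).
    print("128 vs 64: ", round(latency_s[cf128] / latency_s[cf64], 2))
```

The last ratio is what "near-constant latency" refers to: four times as many pixels costs roughly 10% more time.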
Beyond Classification: Encrypted Retrieval-Augmented Generation
- FHE-based encrypted search over knowledge bases
- Attribute-based encryption for access control
- Defends against prompt injection, data extraction
- Only 0.05s overhead
Other Work: Physics-Informed AI
Can physical laws serve as constraints
that keep AI grounded in reality?
"Ghosting Effect" in Thermal Vision
Why are thermal images blurry?
Ghosting effect: when radiated signal is stronger than
reflected ambient signal.
TeX Decomposition of Thermal Signals
TeX Decomposition: Solve inverse problem with physics constraints.
Thermal vs TeX Decomposition
Mechanics-Informed Structural Health Monitoring
Mechanics constraints lead to 35% better zero-shot damage detection.
Estimating Non-Linear Parameter Fields in Multi-Physics Problems
Solving inverse problems with the adjoint method.
Vision: Future Directions
What happens when multiple trust
constraints must hold simultaneously
across composed systems?
The Next Frontier: Scaling the Science of Trust
Trust in Composed and Agentic AI Systems
$$f = f_{lm}(f_{vis} \circ f_{proj}, f_{text})$$
- Do guarantees for individual components survive composition?
- Can failures that emerge only from interaction be diagnosed systematically?
- Can trust be auditable by construction?
Joint Limits: When Constraints Must Coexist
- Fairness requires demographic data. Privacy regulations protect it.
What is the jointly achievable region?
Can we design systems that reach it?
NSF SCH: Fundamental Limits of Fair and Privacy-Preserving Healthcare Models (2025-2029)
Towards a Science of Trustworthy AI
AI is becoming critical infrastructure.
Every other form of infrastructure has a science of its limits.
AI does not.
My research builds towards this science.