Abstract
Secure inference of deep convolutional neural networks (CNNs) has recently been demonstrated under RNS-CKKS. High-degree polynomial approximations of ReLU preserve accuracy but incur prohibitively high latency, because they require as many bootstrapping layers as there are non-linear activation layers. Low-degree polynomial activations greatly accelerate ciphertext inference but suffer a considerable drop in predictive accuracy. To improve the trade-off between accuracy and latency, we propose AutoFHE, a multi-objective optimization framework that automatically adapts standard CNNs to polynomial CNNs. AutoFHE jointly maximizes accuracy and minimizes the number of bootstrapping operations by assigning layerwise mixed-degree polynomial activations and searching for the placement of bootstrapping operations. As a result, AutoFHE generates diverse solutions spanning a trade-off front between accuracy and latency. Experimental results for ResNet and VGG backbones on encrypted CIFAR datasets under RNS-CKKS indicate that AutoFHE accelerates inference by $1.32\times\sim1.8\times$ compared to the high-degree solution. Compared to the low-degree solution, AutoFHE improves accuracy by up to $2.56\%$. Furthermore, AutoFHE accelerates inference by $103\times$ and increases accuracy by $3.42\%$ compared to the TFHE solution.
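The notion of a trade-off front between accuracy and the number of bootstrapping operations can be illustrated with a minimal sketch of Pareto filtering. This is not AutoFHE's search procedure. The candidate labels, the accuracy values, and the bootstrapping counts below are hypothetical and chosen only for illustration, and the helper names `dominates` and `pareto_front` are assumptions introduced here.

```python
from typing import List, Tuple

# (label, accuracy, number of bootstrapping operations); all values hypothetical
Candidate = Tuple[str, float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on both objectives
    (higher accuracy, fewer bootstraps) and strictly better on at least one."""
    no_worse = a[1] >= b[1] and a[2] <= b[2]
    strictly_better = a[1] > b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Hypothetical candidates, not results from the paper.
candidates: List[Candidate] = [
    ("A", 0.90, 20),
    ("B", 0.93, 30),
    ("C", 0.89, 25),  # dominated by A: lower accuracy and more bootstraps
    ("D", 0.93, 35),  # dominated by B: same accuracy, more bootstraps
    ("E", 0.86, 10),
]

front = pareto_front(candidates)
print([label for label, _, _ in front])
```

Each surviving candidate represents a different compromise: none can gain accuracy without spending more bootstrapping operations, which is the sense in which a set of solutions "spans" the front.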