Abstract
Biometric fusion is a promising way to improve the recognition performance of unimodal biometric systems. However, feature-level fusion exposes the feature vectors, which raises security concerns because sensitive information can be extracted from them. This paper proposes a non-interactive, end-to-end approach to securely fuse and match biometric templates using Fully Homomorphic Encryption (FHE). Given a pair of encrypted feature vectors, we perform the following operations in the ciphertext domain: i) feature concatenation, ii) fusion and dimensionality reduction through a learned linear projection, iii) an optional scale normalization to unit $\ell_2$-norm, and iv) match score computation. Our method, dubbed HEFT, is custom-designed to circumvent a key limitation of FHE: its lack of support for non-arithmetic operations. From an inference perspective, we systematically explore different data packing schemes for computationally efficient linear projection and introduce a polynomial approximation for scale normalization. From a training perspective, we introduce two distinct FHE-aware algorithms that improve the learning of the projection matrix and address the challenges posed by the non-arithmetic normalization step. We demonstrate the utility of HEFT on two multimodal combinations: face-voice and face-fingerprint. For face-voice fusion, HEFT improves verification performance by 143.25% to 244.35% compared to unibiometric features. For face-fingerprint fusion, the improvement ranges from 13.99% to 37.99%.
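To make the four ciphertext-domain steps concrete, the following is a minimal plaintext sketch in NumPy. It is not an FHE implementation and not the authors' code: it only mirrors the arithmetic, restricting itself to additions and multiplications as an FHE scheme would require. The names `inv_sqrt_poly`, `fuse`, and `match_score`, the parameters `y0` and `iters`, and the toy projection matrix `P` are all hypothetical. The inverse square root is approximated here with Newton iterations, which stand in for the polynomial approximation mentioned above, and the inner-product match score is likewise an assumption.

```python
import numpy as np


def inv_sqrt_poly(x, y0=1.0, iters=5):
    """Approximate 1/sqrt(x) using only additions and multiplications.

    Newton step: y <- y * (1.5 - 0.5 * x * y * y). For a fixed y0 the result
    is a polynomial in x. The iteration converges when 0 < x * y0**2 < 3, so
    the caller is assumed to know a rough range for x in advance.
    """
    y = y0
    for _ in range(iters):
        y = y * (1.5 - 0.5 * x * y * y)
    return y


def fuse(a, b, P, normalize=True):
    """Fuse two feature vectors with a linear projection P (hypothetical)."""
    z = np.concatenate([a, b])          # i) feature concatenation
    f = P @ z                           # ii) fusion and dimensionality reduction
    if normalize:
        f = f * inv_sqrt_poly(f @ f)    # iii) approximate unit l2 normalization
    return f


def match_score(f1, f2):
    """iv) Match score, assumed here to be an inner product."""
    return f1 @ f2


if __name__ == "__main__":
    # Toy inputs standing in for, e.g., a face and a voice feature vector.
    a = np.array([0.9, 0.3])
    b = np.array([0.5, 0.7])
    # Toy 2x4 projection; in the paper this matrix is learned.
    P = np.array([[0.5, 0.5, 0.0, 0.0],
                  [0.0, 0.0, 0.5, 0.5]])
    fused = fuse(a, b, P)
    print("fused:", fused)
    print("self-match score:", match_score(fused, fused))
```

In an actual FHE setting, each of these operations would act on packed ciphertexts, and the cost of the matrix-vector product would depend on the packing scheme. The sketch omits that entirely and only illustrates why normalization has to be replaced by an arithmetic-only approximation.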