In this paper, we propose an end-to-end solution for secure fusion and matching of biometric templates using fully homomorphic encryption (FHE). Given a pair of encrypted feature vectors, we perform the following operations on ciphertexts: (i) feature concatenation, (ii) fusion and dimensionality reduction through a learned linear projection, (iii) scale normalization to unit $\ell_2$-norm, and (iv) match score computation. Our method, dubbed HEFT (Homomorphically Encrypted Fusion of biometric Templates), is custom-designed to overcome the unique constraint imposed by FHE, namely the lack of support for non-arithmetic operations. Specifically, from an inference perspective, we explore different data packing schemes for computationally efficient linear projection and introduce a polynomial approximation for scale normalization. From a training perspective, we adopt an FHE-aware algorithm for learning the linear projection matrix that mitigates the errors induced by the approximate normalization. Experimental evaluation on template fusion and matching of face and voice biometrics shows that HEFT (i) improves biometric verification performance by 11.07% and 9.58% AUROC compared to the respective unibiometric representations while compressing the feature vectors by a factor of 16 (512D to 32D), and (ii) performs fusion of two encrypted vectors and match score computation against a gallery of size 1024 in 884 ms.
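To make the four-step pipeline concrete, the following is a minimal plaintext sketch that restricts itself to additions and multiplications, mirroring the arithmetic-only constraint of FHE. It is not the HEFT implementation: it runs on unencrypted Python lists, and every function name (`concat`, `project`, `approx_inv_sqrt`, `normalize`, `match_score`, `fuse`) is hypothetical. In particular, the inverse square root is approximated here with a fixed number of Newton iterations, which is one well-known polynomial-only scheme and is used purely as a stand-in; the polynomial approximation actually used for normalization may differ. The sketch also assumes the squared norm of the projected vector lies in (0, 3), the range in which this iteration converges from the initial guess 1.

```python
def concat(u, v):
    """Step (i): feature concatenation."""
    return list(u) + list(v)


def project(P, x):
    """Step (ii): fusion and dimensionality reduction, z = P x."""
    return [sum(p * xi for p, xi in zip(row, x)) for row in P]


def approx_inv_sqrt(s, iters=6, y0=1.0):
    """Approximate 1/sqrt(s) using only additions and multiplications.

    Assumption: Newton's iteration y <- 0.5 * y * (3 - s * y^2) is a
    stand-in for the polynomial approximation named in the text. It
    converges from y0 = 1 only when 0 < s < 3.
    """
    y = y0
    for _ in range(iters):
        y = 0.5 * y * (3.0 - s * y * y)
    return y


def normalize(z, iters=6):
    """Step (iii): scale z to approximately unit l2-norm."""
    s = sum(zi * zi for zi in z)      # squared norm, arithmetic only
    r = approx_inv_sqrt(s, iters)     # approximately 1 / ||z||
    return [zi * r for zi in z]


def match_score(a, b):
    """Step (iv): inner product of two normalized templates."""
    return sum(x * y for x, y in zip(a, b))


def fuse(P, face, voice, iters=6):
    """Run steps (i) to (iii) on one pair of feature vectors."""
    return normalize(project(P, concat(face, voice)), iters)


if __name__ == "__main__":
    # Toy 2x4 projection and 2-D features; all values are illustrative.
    P = [[0.5, 0.5, 0.0, 0.0],
         [0.0, 0.0, 0.5, 0.5]]
    probe = fuse(P, [0.6, 0.2], [0.1, 0.3])
    gallery = [fuse(P, [0.6, 0.2], [0.1, 0.3]),
               fuse(P, [0.1, 0.1], [0.7, 0.5])]
    print([match_score(probe, g) for g in gallery])
```

Because the normalization is only approximate, the fused vectors have norm close to, but not exactly, one. This residual error is what an FHE-aware training procedure for the projection matrix would need to account for.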