Principal Component Analysis (PCA) is a pivotal technique in machine learning and data analysis. In this study, we present a novel approach to privacy-preserving PCA using an approximate-arithmetic homomorphic encryption scheme. Our method builds on the PowerMethod, an iterative PCA routine that takes the covariance matrix as input and produces an approximate eigenvector corresponding to the dataset's first principal component. It surpasses previous approaches (e.g., Panda, CSCML 2021) in efficiency, accuracy, and scalability. To achieve this efficiency and accuracy, we implemented the following optimizations: (i) we optimized a homomorphic matrix multiplication technique (Jiang et al., SIGSAC 2018) that plays a crucial role in computing the covariance matrix; (ii) we devised an efficient homomorphic circuit for computing the covariance matrix; (iii) we designed a novel and efficient homomorphic circuit for the PowerMethod that incorporates a systematic strategy for homomorphic vector normalization, enhancing both its accuracy and practicality. Our matrix multiplication optimization reduces the minimum rotation-key space required for a $128\times 128$ homomorphic matrix multiplication by up to 64\%, enabling more extensive parallel computation of multiple matrix multiplication instances. Our homomorphic covariance computation method computes the covariance matrix of the MNIST dataset ($60000\times 256$) in 51 minutes. Our privacy-preserving PCA scheme, based on the new homomorphic PowerMethod circuit, computes the top 8 principal components of datasets such as MNIST and Fashion-MNIST in approximately 1 hour, achieving an $r^2$ accuracy of 0.7 to 0.9, an average speedup of over $4\times$, and higher accuracy than previous approaches.
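For readers unfamiliar with the underlying routine, the following is a minimal plaintext sketch of the pipeline the abstract describes: form the covariance matrix, then run power iteration with per-step normalization to approximate the first principal component. All names here are illustrative; the paper's contribution is evaluating these same steps as homomorphic circuits under encryption, where the normalization step in particular is nontrivial.

```python
import numpy as np

def covariance(X):
    """Covariance matrix of data X (n samples x d features)."""
    Xc = X - X.mean(axis=0)           # center each feature
    return Xc.T @ Xc / X.shape[0]     # d x d covariance matrix

def power_method(C, iters=50, seed=0):
    """Approximate the top eigenvector of C by power iteration."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(C.shape[0])
    for _ in range(iters):
        v = C @ v
        # In plaintext this division is trivial; under CKKS, 1/||v||
        # must itself be approximated homomorphically, which is what
        # the paper's normalization strategy addresses.
        v /= np.linalg.norm(v)
    return v  # approximate first principal component

if __name__ == "__main__":
    # Synthetic data with one planted dominant direction u.
    rng = np.random.default_rng(1)
    u = np.ones(16) / 4.0             # unit vector
    X = rng.standard_normal((1000, 1)) * 3.0 @ u[None, :] \
        + 0.1 * rng.standard_normal((1000, 16))
    v = power_method(covariance(X))
    print(abs(v @ u))                 # alignment with u, close to 1
```

Subsequent principal components (the paper computes the top 8) can be obtained the same way after deflating C by the components already found.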