Privacy-preserving neural network (NN) inference can be achieved by utilizing homomorphic encryption (HE), which allows computations to be directly carried out over ciphertexts. Popular HE schemes are built over large polynomial rings. To allow simultaneous multiplications in the convolutional (Conv) and fully-connected (FC) layers, multiple input data are mapped to coefficients in the same polynomial, so are the weights of NNs. However, ciphertext rotations are necessary to compute the sums of products and/or incorporate the outputs of different channels into the same polynomials. Ciphertext rotations have much higher complexity than ciphertext multiplications and contribute to the majority of the latency of HE-evaluated Conv and FC layers. This paper proposes a novel reformulated server-client joint computation procedure and a new filter coefficient packing scheme to eliminate ciphertext rotations without affecting the security of the HE scheme. Our proposed scheme also leads to substantial reductions on the number of coefficient multiplications needed and the communication cost between the server and client. For various plain-20 classifiers over the CIFAR-10/100 datasets, our design reduces the running time of the Conv and FC layers by 15.5% and the communication cost between client and server by more than 50%, compared to the best prior design.
翻译:暂无翻译