The rapid growth and deployment of deep learning (DL) has raised privacy and security concerns. To mitigate these issues, secure multi-party computation (MPC) has been proposed to enable privacy-preserving DL computation. In practice, MPC protocols often incur very high computation and communication overhead, which can prohibit their adoption in large-scale systems. Two orthogonal research trends have attracted enormous interest in addressing energy efficiency in secure deep learning: overhead reduction of the MPC comparison protocol, and hardware acceleration. However, existing approaches either achieve a low reduction ratio and suffer from high latency due to limited computation and communication savings, or are power-hungry because they mainly target general-purpose computing platforms such as CPUs and GPUs. In this work, as a first attempt, we develop a systematic framework, PolyMPCNet, that jointly performs overhead reduction of the MPC comparison protocol and hardware acceleration by integrating the hardware latency of the cryptographic building blocks into the DNN loss function, achieving high energy efficiency, accuracy, and a security guarantee. Instead of heuristically checking model sensitivity after a DNN is well trained (by deleting or dropping some non-polynomial operators), our key design principle is to enforce exactly what is assumed in the DNN design: training a DNN that is both hardware efficient and secure, while escaping local minima and saddle points and maintaining high accuracy. More specifically, we propose a straight-through polynomial activation initialization method for a cryptographic-hardware-friendly trainable polynomial activation function that replaces the expensive 2P-ReLU operator. We also develop a cryptographic hardware scheduler and the corresponding performance model for field-programmable gate array (FPGA) platforms.
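To make the two central ideas concrete, the following is a minimal sketch, not the PolyMPCNet implementation. It assumes a second-order polynomial activation whose coefficients are initialized to the identity mapping (one possible reading of "straight-through" initialization) and a loss that adds a weighted sum of per-operator latency estimates to the task loss. The names `PolyActivation` and `latency_aware_loss`, the polynomial degree, the initial coefficient values, and the weight `lam` are all hypothetical and do not come from the abstract.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class PolyActivation:
    """Polynomial a*x^2 + b*x + c with trainable coefficients.

    Assumption: degree 2 and identity initialization (a=0, b=1, c=0),
    so the activation initially passes its input straight through.
    """
    a: float = 0.0
    b: float = 1.0
    c: float = 0.0

    def __call__(self, x: float) -> float:
        # Only additions and multiplications: no comparison is needed,
        # unlike ReLU, which is what makes comparison-free MPC possible.
        return self.a * x * x + self.b * x + self.c

    def grads(self, x: float, upstream: float = 1.0) -> Tuple[float, float, float]:
        # Gradients of the output with respect to (a, b, c),
        # scaled by the upstream gradient.
        return (upstream * x * x, upstream * x, upstream)


def latency_aware_loss(task_loss: float,
                       op_latencies_ms: Sequence[float],
                       lam: float = 0.01) -> float:
    """Task loss plus lam times the summed latency estimates.

    Assumption: latencies come from some external hardware
    performance model; the values here are placeholders.
    """
    return task_loss + lam * sum(op_latencies_ms)
```

With the assumed identity initialization, `PolyActivation()(x)` returns `x` for any input, and training would then adjust `a`, `b`, and `c` away from that starting point while the latency term penalizes expensive operators.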