The proliferation of machine learning services in recent years has raised data privacy concerns. Homomorphic encryption (HE) enables inference over encrypted data, but it incurs 100x--10,000x memory and runtime overheads. Secure deep neural network (DNN) inference using HE is currently limited by computing and memory resources, with existing frameworks requiring hundreds of gigabytes of DRAM to evaluate even small models. To overcome these limitations, in this paper we explore the feasibility of leveraging hybrid memory systems composed of DRAM and persistent memory. In particular, we use the recently released Intel Optane PMem technology and the Intel HE-Transformer for nGraph to run large neural networks such as MobileNetV2 (in its largest variant) and ResNet-50 for the first time in the literature. We present an in-depth analysis of execution efficiency under different hardware and software configurations. Our results show that DNN inference using HE exhibits access patterns that are friendly to this memory configuration, yielding efficient executions.