The proliferation of machine learning services in recent years has raised data privacy concerns. Homomorphic encryption (HE) enables inference over encrypted data, but it incurs 100x-10,000x memory and runtime overheads. Secure deep neural network (DNN) inference using HE is currently limited by computing and memory resources, with frameworks requiring hundreds of gigabytes of DRAM to evaluate small models. To overcome these limitations, in this paper we explore the feasibility of leveraging hybrid memory systems composed of DRAM and persistent memory. In particular, we use the recently released Intel Optane PMem technology and the Intel HE-Transformer for nGraph to run large neural networks, such as MobileNetV2 (in its largest variant) and ResNet-50, for the first time in the literature. We present an in-depth analysis of execution efficiency under different hardware and software configurations. Our results show that DNN inference using HE produces access patterns that are friendly to this memory configuration, yielding efficient executions.