In two-party machine learning prediction services, the client's goal is to query a remote server's trained machine learning model to perform neural network inference in some application domain. However, sensitive information can be obtained during this process by either the client or the server, leading to potential collection, unauthorized secondary use, and inappropriate access to personal information. These security concerns have given rise to Private Inference (PI), in which both the client's personal data and the server's trained model are kept confidential. State-of-the-art PI protocols consist of a pre-processing (offline) phase and an online phase that together combine several cryptographic primitives: Homomorphic Encryption (HE), Secret Sharing (SS), Garbled Circuits (GC), and Oblivious Transfer (OT). Despite the need and recent performance improvements, PI remains largely arcane today and is too slow for practical use. This paper addresses PI's shortcomings with a detailed characterization of a standard high-performance protocol to build foundational knowledge and intuition in the systems community. Our characterization pinpoints all sources of inefficiency: compute, communication, and storage. In contrast to prior work, we consider inference request arrival rates rather than studying individual inferences in isolation, and we find that the pre-processing phase cannot be ignored: it is often incurred online because there is insufficient downtime to hide pre-compute latency. Finally, we leverage insights from our characterization and propose three optimizations to address the storage (Client-Garbler), computation (layer-parallel HE), and communication (wireless slot allocation) overheads. Compared to the state-of-the-art PI protocol, these optimizations provide a total PI speedup of 1.8$\times$ and the ability to sustain inference requests at up to a 2.24$\times$ greater rate.
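To make the SS primitive referenced above concrete, the following is a minimal, illustrative sketch of two-party additive secret sharing over a prime field. It is not the specific protocol characterized in this paper; the modulus `PRIME` and the `share`/`reconstruct` helpers are assumptions chosen only for the example.

```python
# Illustrative sketch only (assumed names, toy modulus): two-party additive
# secret sharing, the SS primitive referenced in the abstract.
import secrets

PRIME = 2**31 - 1  # toy field modulus; real protocols pick protocol-specific moduli


def share(x: int) -> tuple[int, int]:
    """Split x into client/server shares with x = (s_c + s_s) mod PRIME."""
    s_c = secrets.randbelow(PRIME)  # client share: uniformly random
    s_s = (x - s_c) % PRIME         # server share: either share alone reveals nothing about x
    return s_c, s_s


def reconstruct(s_c: int, s_s: int) -> int:
    """Recombine the two shares to recover the secret."""
    return (s_c + s_s) % PRIME


# Linear operations commute with sharing: scaling each share locally and
# recombining yields the scaled secret, so no communication is needed.
s_c, s_s = share(42)
assert reconstruct(s_c, s_s) == 42
assert reconstruct((3 * s_c) % PRIME, (3 * s_s) % PRIME) == (3 * 42) % PRIME
```

In protocols of this style, linear layers can typically be evaluated locally on shares (with HE used in the offline phase to set them up), whereas nonlinear layers such as ReLU do not commute with sharing and are usually handled with GC and OT, which is where much of the compute, communication, and storage overhead arises.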