ML-as-a-service continues to grow, and so does the need for strong privacy guarantees. Secure inference has emerged as a potential solution: cryptographic primitives allow inference without revealing the user's inputs to the model provider or the model's weights to the user. For instance, the model provider could be a diagnostics company that has trained a state-of-the-art DenseNet-121 model to interpret chest X-rays, and the user could be a patient at a hospital. While secure inference is feasible in principle for this setting, no existing technique makes it practical at scale. The CrypTFlow2 framework offers a potential solution, since it can automatically and correctly translate clear-text inference to secure inference for arbitrary models. However, the resulting secure inference is impractically expensive: almost 3 TB of communication is required to interpret a single X-ray on DenseNet-121. In this paper, we address this inefficiency with three contributions. First, we show that the primary bottlenecks in secure inference are large linear layers, which can be optimized through the choice of network backbone and the use of operators developed for efficient clear-text inference. This finding and emphasis deviate from many recent works, which focus on optimizing non-linear activation layers when performing secure inference on smaller networks. Second, based on an analysis of a bottlenecked convolution layer, we design an X-operator that is a more efficient drop-in replacement. Third, we show that the fast Winograd convolution algorithm further improves the efficiency of secure inference. In combination, these three optimizations prove highly effective for X-ray interpretation with models trained on the CheXpert dataset.
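To make the third contribution concrete, the following is a minimal sketch of the textbook one-dimensional Winograd minimal-filtering case F(2,3), which produces two outputs of a 3-tap filter with four multiplications instead of six. It is not the paper's implementation and involves no cryptography; the function names `direct_f23` and `winograd_f23` are hypothetical, and exact rationals are used only to keep the comparison free of rounding error. The link to secure inference is the assumption that multiplications are the costly operation there, so fewer of them should mean less work.

```python
from fractions import Fraction


def direct_f23(d, g):
    """Direct sliding-window filter: 2 outputs from 4 inputs and 3 taps.

    Uses 2 * 3 = 6 multiplications.
    """
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]


def winograd_f23(d, g):
    """Winograd F(2,3): the same 2 outputs using 4 multiplications.

    Illustrative sketch only (hypothetical helper, plain arithmetic).
    """
    # Filter transform: depends only on g, so it can be computed once
    # per filter and reused for every input tile.
    u0 = Fraction(g[0])
    u1 = Fraction(g[0] + g[1] + g[2], 2)
    u2 = Fraction(g[0] - g[1] + g[2], 2)
    u3 = Fraction(g[2])

    # Input transform: additions and subtractions only.
    v0 = d[0] - d[2]
    v1 = d[1] + d[2]
    v2 = d[2] - d[1]
    v3 = d[1] - d[3]

    # The four element-wise multiplications.
    m1, m2, m3, m4 = v0 * u0, v1 * u1, v2 * u2, v3 * u3

    # Output transform: additions and subtractions only.
    return [m1 + m2 + m3, m2 - m3 - m4]
```

For any integer tile `d` of length 4 and filter `g` of length 3, `winograd_f23(d, g)` should equal `direct_f23(d, g)`; the saving comes from moving work into the additive transforms, and the two-dimensional variants used for convolution layers nest this same construction.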