Computing an AUC as a performance measure to compare the quality of different machine learning models is one of the final steps of many research projects. Many of these models are trained on privacy-sensitive data, and if the datasets cannot be shared or evaluated jointly in one place, several approaches exist, such as $\epsilon$-differential privacy, federated machine learning, and methods based on cryptographic techniques. In this setting, computing a global performance measure such as the AUC can also be a problem, since the labels themselves may contain privacy-sensitive information. Approaches based on $\epsilon$-differential privacy have been proposed to address this problem, but to the best of our knowledge, no exact privacy-preserving solution has been introduced. In this paper, we propose an MPC-based framework, called \fw{}, with private merging of sorted lists and novel methods for comparing two secret-shared values, selecting between two secret-shared values, converting the modulus, and performing division, to compute the exact AUC as one could obtain on the pooled original test samples. With \fw{}, computing the exact area under the precision-recall curve and the receiver operating characteristic curve is possible even when ties between prediction confidence values exist. To demonstrate the applicability of \fw{}, we use it to evaluate a model trained to predict acute myeloid leukemia therapy response, and we assess its scalability via experiments on synthetic data. The experiments show that we efficiently compute, in a privacy-preserving manner, exactly the same AUC with both evaluation metrics as one can obtain on the pooled test samples in the plaintext domain. Our solution provides security against semi-honest corruption of at most one of the servers performing the secure computation.
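As a point of reference for the claim of exactness, the following is a minimal plaintext sketch (not the MPC protocol itself) of the ROC AUC on pooled test samples, computed via the Mann-Whitney U formulation in which tied confidence values contribute one half; the function name and the toy data are illustrative only.

\begin{verbatim}
# Plaintext reference: exact ROC AUC with ties counted as 1/2.
# This is only the pooled-data baseline the abstract compares against,
# not the secure multi-party computation itself.
from itertools import product

def roc_auc_exact(labels, scores):
    """labels: iterable of 0/1, scores: prediction confidence values."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Each (positive, negative) pair contributes 1 if ranked correctly,
    # 0.5 if the confidences are tied, and 0 otherwise.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))

# Ties between confidence values are handled exactly:
print(roc_auc_exact([1, 0, 1, 0], [0.9, 0.9, 0.8, 0.1]))  # 0.625
\end{verbatim}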