Privacy and security concerns are growing as machine learning reaches diverse application domains. Data holders want to train or run inference on private data while exploiting accelerators, such as GPUs, hosted in the cloud. Cloud systems are vulnerable to attackers who compromise the privacy of data and the integrity of computations. Tackling this challenge requires unifying theoretical privacy algorithms with hardware security capabilities. This paper presents DarKnight, a framework for large DNN training that protects input privacy and computation integrity. DarKnight relies on cooperative execution between a trusted execution environment (TEE) and accelerators: the TEE provides privacy and integrity verification, while the accelerators perform the bulk of the linear algebraic computation to optimize performance. In particular, DarKnight uses a customized data-encoding strategy based on matrix masking to obfuscate inputs within the TEE. The obfuscated data is then offloaded to GPUs for fast linear algebraic computation. DarKnight's data obfuscation strategy provides provable data privacy and computation integrity on cloud servers. Whereas prior works tackle only inference privacy and cannot be used for training, DarKnight's encoding scheme is designed to support both training and inference.
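To make the TEE/GPU division of labor concrete, the following is a minimal sketch of matrix-masking-based offload, not DarKnight's actual encoding (which chooses coefficients carefully and adds redundant equations for integrity verification); all names and sizes here are illustrative. The TEE blinds a batch of private inputs with a random invertible coding matrix, the untrusted GPU performs the heavy linear operation on the blinded batch only, and the TEE decodes to recover the true results:

```python
import numpy as np

rng = np.random.default_rng(0)

K, d, m = 4, 8, 6                        # toy batch size, input dim, output dim
W = rng.standard_normal((m, d))          # layer weights (visible to the GPU)
X = rng.standard_normal((d, K))          # private inputs x_1..x_K (inside the TEE)

# --- inside the TEE: encode ---
r = rng.standard_normal((d, 1))          # random noise column to mask the batch
Xr = np.hstack([X, r])                   # d x (K+1): inputs augmented with noise
A = rng.standard_normal((K + 1, K + 1))  # random coding matrix, kept secret in the TEE
Xbar = Xr @ A                            # obfuscated batch sent to the GPU

# --- on the untrusted GPU: bulk linear algebra ---
Ybar = W @ Xbar                          # the GPU only ever sees Xbar, never X

# --- back inside the TEE: decode ---
Y = Ybar @ np.linalg.inv(A)              # = W @ [X | r], since Ybar = W @ Xr @ A
assert np.allclose(Y[:, :K], W @ X)      # true products W x_i are recovered
```

The decoding step works because the GPU's computation commutes with the masking: `Ybar @ inv(A) = W @ Xr @ A @ inv(A) = W @ Xr`. Because the TEE handles only the cheap encode/decode matrix multiplications while the GPU performs the large `W @ Xbar` product, the bulk of the computation runs at accelerator speed.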