Leveraging parallel hardware (e.g., GPUs) for deep neural network (DNN) training and inference significantly speeds up computation but raises serious data privacy concerns. Trusted execution environments (TEEs) have emerged as a promising solution for privacy-preserving inference and training. TEEs, however, have limited memory and compute resources, which renders their performance uncompetitive with untrusted parallel hardware. To mitigate this trade-off between privacy and computing performance, we propose an asymmetric model decomposition framework, AsymML, that (1) accelerates training/inference using parallel hardware and (2) preserves privacy using TEEs. By exploiting the low-rank characteristics of data and intermediate features, AsymML asymmetrically splits a DNN model into a trusted part and an untrusted part: the trusted part handles privacy-sensitive data but incurs small compute/memory costs, while the untrusted part is computationally intensive but not privacy-sensitive. Computing performance and privacy are guaranteed by delegating the trusted part to TEEs and the untrusted part to GPUs. Furthermore, we present a theoretical rank-bound analysis showing that low-rank characteristics are preserved in intermediate features, which guarantees the efficiency of AsymML. Extensive evaluations on DNN models show that AsymML delivers $11.2\times$ speedup in inference and $7.6\times$ in training compared to TEE-only execution.
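To make the asymmetric split concrete, the following is a minimal sketch, not the authors' implementation, of the kind of low-rank decomposition the abstract describes: a feature matrix is separated into a small low-rank part (kept in the TEE) and a residual (offloaded to the GPU). The function name `split_low_rank`, the SVD-based construction, and the choice of `rank` are illustrative assumptions.

```python
# Hypothetical sketch of a low-rank trusted/untrusted split in the spirit of
# AsymML; the actual framework's decomposition may differ.
import numpy as np

def split_low_rank(X: np.ndarray, rank: int):
    """Split X into a rank-`rank` 'trusted' part and an 'untrusted' residual.

    The low-rank part concentrates most of the (privacy-sensitive) signal in a
    few components, so it is cheap enough to process inside a TEE; the residual
    carries the bulk of the arithmetic and can be delegated to the GPU.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    trusted = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]  # low-rank, small cost
    untrusted = X - trusted                            # residual, GPU-friendly
    return trusted, untrusted

# Toy usage: an approximately rank-8 "feature map" plus small full-rank noise.
rng = np.random.default_rng(0)
X = rng.standard_normal((256, 8)) @ rng.standard_normal((8, 256))
X += 0.01 * rng.standard_normal((256, 256))
T, R = split_low_rank(X, rank=8)
print(np.linalg.norm(R) / np.linalg.norm(X))  # residual energy is tiny
```

Under this assumption, the residual carries little of the original signal, which matches the abstract's claim that the untrusted (GPU-side) computation need not be privacy-sensitive.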