With the increasing popularity of cloud-based machine learning (ML) techniques comes a need for privacy and integrity guarantees for ML data. In addition, the significant scalability challenges faced by DRAM, coupled with the high access times of secondary storage, represent a major performance bottleneck for ML systems. While solutions exist to tackle the security aspect, performance remains an issue. Persistent memory (PM) is resilient to power loss (unlike DRAM), provides fast and fine-grained access to memory (unlike disk storage), and has latency and bandwidth close to those of DRAM (on the order of ns and GB/s, respectively). We present PLINIUS, an ML framework that uses Intel SGX enclaves for secure training of ML models and PM for fault tolerance guarantees. PLINIUS uses a novel mirroring mechanism to create and maintain (i) encrypted mirror copies of ML models on PM, and (ii) encrypted training data in byte-addressable PM, for near-instantaneous data recovery after a system failure. Compared to disk-based checkpointing systems, PLINIUS is 3.2x and 3.7x faster at saving and restoring models, respectively, on real PM hardware, achieving robust and secure ML model training in SGX enclaves.
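The mirroring idea can be illustrated with a minimal sketch. It is not PLINIUS code, and every name in it (`seal`, `unseal`, `mirror_out`, `mirror_in`) is hypothetical. It assumes a memory-mapped file as a stand-in for byte-addressable PM, a plain Python process as a stand-in for the SGX enclave, and a toy SHA-256 keystream plus HMAC as a stand-in for a real authenticated cipher (the abstract does not name the cipher). It also ignores crash-consistent write ordering, which a real PM system would need.

```python
import hashlib
import hmac
import mmap
import os
import struct

HEADER = struct.Struct("<Q")  # length of the sealed blob, stored at offset 0


def _subkey(key: bytes, label: bytes) -> bytes:
    # Derive separate encryption and MAC keys from one secret (toy derivation).
    return hashlib.sha256(key + label).digest()


def _keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    # SHA-256 in counter mode: a TOY stand-in for a real cipher, not for production.
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + struct.pack("<Q", counter)).digest()
        counter += 1
    return bytes(out[:n])


def seal(key: bytes, plaintext: bytes) -> bytes:
    # Encrypt, then authenticate: layout is nonce (16) | tag (32) | ciphertext.
    nonce = os.urandom(16)
    ks = _keystream(_subkey(key, b"enc"), nonce, len(plaintext))
    ct = bytes(a ^ b for a, b in zip(plaintext, ks))
    tag = hmac.new(_subkey(key, b"mac"), nonce + ct, hashlib.sha256).digest()
    return nonce + tag + ct


def unseal(key: bytes, blob: bytes) -> bytes:
    # Verify the tag before decrypting; reject any modified blob.
    nonce, tag, ct = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(_subkey(key, b"mac"), nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("integrity check failed")
    ks = _keystream(_subkey(key, b"enc"), nonce, len(ct))
    return bytes(a ^ b for a, b in zip(ct, ks))


def mirror_out(path: str, key: bytes, weights: list, capacity: int = 1 << 16) -> None:
    # Serialize the model, seal it, and copy it into the mapped "PM" region.
    blob = seal(key, struct.pack(f"<{len(weights)}d", *weights))
    if HEADER.size + len(blob) > capacity:
        raise ValueError("model does not fit in the mapped region")
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, capacity)
        with mmap.mmap(fd, capacity) as pm:
            pm[HEADER.size:HEADER.size + len(blob)] = blob
            pm[:HEADER.size] = HEADER.pack(len(blob))
            pm.flush()  # push the mapped pages to the backing file
    finally:
        os.close(fd)


def mirror_in(path: str, key: bytes) -> list:
    # Read the sealed blob back from "PM", verify it, and rebuild the weights.
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as pm:
        (n,) = HEADER.unpack(pm[:HEADER.size])
        blob = pm[HEADER.size:HEADER.size + n]
    plaintext = unseal(key, blob)
    return list(struct.unpack(f"<{len(plaintext) // 8}d", plaintext))
```

Calling `mirror_out(path, key, weights)` after a training iteration and `mirror_in(path, key)` after a restart returns the same weights, while only ciphertext ever reaches the mapped file. Because the region is byte-addressable, recovery is a direct read and integrity check with no separate deserialization from a disk checkpoint.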