Big data has been a pervasive catchphrase in recent years, but dealing with data scarcity has become a crucial question for many real-world deep learning (DL) applications. Transfer learning (TL) is a popular methodology for efficiently training DL models in scenarios where only a small dataset is available. TL allows knowledge transfer from a general domain to a specific target domain; however, such knowledge transfer may put privacy at risk when sensitive or private data is involved. With CryptoTL, we introduce a solution to this problem and show, for the first time, a cryptographic privacy-preserving TL approach based on homomorphic encryption that is efficient and feasible for real-world use cases. We demonstrate this by focusing on classification tasks with small datasets and show the applicability of our approach to sentiment analysis. Additionally, we highlight how our approach can be combined with differential privacy to further strengthen its security guarantees. Our extensive benchmarks show that CryptoTL achieves high accuracy while retaining practical fine-tuning and classification runtimes despite the use of homomorphic encryption. Concretely, one forward pass through the encrypted layers of our setup takes roughly 1 s on a notebook CPU.