Classifiers trained with supervised learning face a range of security and privacy issues: 1) on the security side, data poisoning attacks, backdoor attacks, and adversarial examples; and 2) on the privacy side, inference attacks on the training data and the right to be forgotten. Many secure and privacy-preserving supervised learning algorithms with formal guarantees have been proposed to address these issues. However, they suffer from limitations such as accuracy loss, small certified security guarantees, and/or inefficiency. Self-supervised learning is an emerging technique for pre-training encoders on unlabeled data. Given a pre-trained encoder as a feature extractor, supervised learning can train a simple yet accurate classifier using a small amount of labeled training data. In this work, we perform the first systematic, principled measurement study to understand whether and when a pre-trained encoder can address the limitations of secure or privacy-preserving supervised learning algorithms. Our key findings are that a pre-trained encoder substantially improves 1) both the accuracy under no attacks and the certified security guarantees against data poisoning and backdoor attacks of state-of-the-art secure learning algorithms (i.e., bagging and KNN), 2) the certified security guarantees of randomized smoothing against adversarial examples without sacrificing its accuracy under no attacks, 3) the accuracy of differentially private classifiers, and 4) the accuracy and/or efficiency of exact machine unlearning.
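As a rough illustration of the "pre-trained encoder as a feature extractor" setup described above, the following minimal sketch freezes an encoder and fits a KNN classifier on the features of a small labeled set. It is not the study's implementation: `make_frozen_encoder` is a hypothetical stand-in (a fixed random projection followed by ReLU) for a real self-supervised encoder, `knn_predict` is a plain majority-vote KNN, and no certified guarantee is computed here.

```python
import numpy as np


def make_frozen_encoder(input_dim, feature_dim, seed=0):
    """Hypothetical stand-in for a pre-trained encoder.

    Assumption: a real encoder would be pre-trained on unlabeled data;
    here a fixed random projection + ReLU only mimics "frozen weights".
    """
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(input_dim, feature_dim))

    def encode(x):
        # The weights are never updated: the encoder acts purely as a feature extractor.
        return np.maximum(np.asarray(x, dtype=float) @ weights, 0.0)

    return encode


def knn_predict(train_features, train_labels, query_features, k=3):
    """Predict by majority vote among the k nearest labeled feature vectors."""
    # Pairwise Euclidean distances, shape (num_queries, num_train).
    diffs = query_features[:, None, :] - train_features[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    # Indices of the k closest training points for each query.
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = train_labels[nearest]
    # Labels are assumed to be non-negative integers.
    return np.array([np.bincount(row).argmax() for row in votes])
```

In use, one would call `encode` on the small labeled training set and on the test inputs, then pass the resulting feature arrays to `knn_predict`. Only the lightweight classifier depends on the labeled data, which is the property the abstract relies on.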