Pre-trained encoders are general-purpose feature extractors that can be used for many downstream tasks. Recent progress in self-supervised learning makes it possible to pre-train highly effective encoders from large volumes of unlabeled data, giving rise to the emerging encoder-as-a-service (EaaS) paradigm. A pre-trained encoder may be deemed confidential because pre-training it requires substantial data and computation resources, and because its public release may facilitate misuse of AI, e.g., for deepfake generation. In this paper, we propose StolenEncoder, the first attack to steal pre-trained image encoders. We evaluate StolenEncoder on multiple target encoders pre-trained by ourselves and on three real-world target encoders: the ImageNet encoder pre-trained by Google, the CLIP encoder pre-trained by OpenAI, and Clarifai's General Embedding encoder deployed as a paid EaaS. Our results show that the stolen encoders have similar functionality to the target encoders. In particular, downstream classifiers built upon a target encoder and its stolen counterpart achieve similar accuracy. Moreover, stealing a target encoder with StolenEncoder requires much less data and computation than pre-training it from scratch. We also explore three defenses that perturb the feature vectors produced by a target encoder. Our results show that these defenses are insufficient to mitigate StolenEncoder.
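
To make the stealing setting concrete, below is a minimal sketch of training a surrogate encoder to mimic feature vectors returned by a black-box target encoder. The abstract does not specify the training objective; this sketch assumes a simple feature-matching loss, and the names `target_encoder_api`, `surrogate_loader`, and the feature dimension 512 are hypothetical placeholders, not details from the paper.

```python
# Sketch: train a local "stolen" encoder to reproduce the feature vectors
# that a black-box EaaS target encoder returns for a surrogate dataset.
import torch
import torch.nn as nn
import torchvision.models as models

# Stolen encoder: a standard backbone whose output dimension is set to
# match the target encoder's feature dimension (512 here is an assumption).
stolen_encoder = models.resnet18(weights=None)
stolen_encoder.fc = nn.Linear(stolen_encoder.fc.in_features, 512)

optimizer = torch.optim.Adam(stolen_encoder.parameters(), lr=1e-3)

def train_step(images: torch.Tensor, target_features: torch.Tensor) -> float:
    """One step: push the stolen encoder's features toward the target's."""
    optimizer.zero_grad()
    stolen_features = stolen_encoder(images)
    loss = nn.functional.mse_loss(stolen_features, target_features)
    loss.backward()
    optimizer.step()
    return loss.item()

# Hypothetical usage: query the black-box EaaS once per surrogate image,
# cache the returned feature vectors, then train the stolen encoder locally.
# for images in surrogate_loader:
#     with torch.no_grad():
#         target_features = target_encoder_api(images)  # paid EaaS query
#     train_step(images, target_features)
```

Because the target's feature vectors can be queried once and cached, the query cost scales with the size of the surrogate dataset rather than with the number of training epochs, which is consistent with the abstract's claim that stealing requires far less data and computation than pre-training from scratch.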