Contrastive learning (CL) pre-trains general-purpose encoders on an unlabeled pre-training dataset, which consists of images or image-text pairs. CL is vulnerable to data poisoning based backdoor attacks (DPBAs), in which an attacker injects poisoned inputs into the pre-training dataset so that the pre-trained encoder is backdoored. However, existing DPBAs achieve limited effectiveness. In this work, we propose CorruptEncoder, a new DPBA against CL. CorruptEncoder uses a theory-guided method to create optimal poisoned inputs that maximize attack effectiveness. Our experiments show that CorruptEncoder substantially outperforms existing DPBAs. In particular, CorruptEncoder is the first DPBA that achieves attack success rates above 90% with only a few (3) reference images and a small poisoning ratio (0.5%). We also propose a defense, called localized cropping, against DPBAs. Our results show that it reduces the effectiveness of DPBAs, though at a slight cost to encoder utility.
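To make the attack setting concrete: DPBAs of this kind construct poisoned pre-training images by embedding a reference object and a trigger patch into a background image, so that random crops of the same poisoned image can yield one view containing the object and another containing the trigger, which CL then pulls together as a positive pair. Below is a minimal sketch of this style of poisoned-input construction; the paste positions and sizes are illustrative placeholders, not CorruptEncoder's actual theory-guided placement, which is derived in the paper.

```python
# Illustrative sketch of DPBA-style poisoned-input construction for CL.
# Placement below is a placeholder, NOT the paper's theory-guided layout.
from PIL import Image


def make_poisoned_image(background: Image.Image,
                        reference_object: Image.Image,
                        trigger: Image.Image) -> Image.Image:
    """Paste a reference object and a trigger patch into a background so
    that random crops can produce (object-only, trigger-only) view pairs."""
    poisoned = background.copy()
    # Hypothetical placement: object in the top-left corner, trigger in the
    # bottom-right corner, far enough apart that one crop rarely spans both.
    poisoned.paste(reference_object, (0, 0))
    tw, th = trigger.size
    poisoned.paste(trigger, (poisoned.width - tw, poisoned.height - th))
    return poisoned
```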
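The localized cropping defense targets the mechanism above: it constrains the two random crops that form a positive pair to come from the same local region of the image, so a pair rarely spans both the reference object and the trigger. The sketch below illustrates this idea under assumed parameters (`crop_size`, `max_offset` are illustrative, and the paper's exact formulation may differ).

```python
# Illustrative sketch of localized cropping: both views of a positive pair
# are sampled from one local region. Parameter values are assumptions.
import random
from PIL import Image


def localized_crops(img: Image.Image, crop_size: int = 224,
                    max_offset: int = 32) -> tuple[Image.Image, Image.Image]:
    """Sample two crops whose top-left corners are at most `max_offset`
    pixels apart, so both views come from the same local region."""
    w, h = img.size
    assert w >= crop_size and h >= crop_size, "image smaller than crop size"
    x1 = random.randint(0, w - crop_size)
    y1 = random.randint(0, h - crop_size)
    # The second crop is restricted to a small window around the first,
    # unlike standard CL augmentation, which crops anywhere in the image.
    x2 = min(max(x1 + random.randint(-max_offset, max_offset), 0), w - crop_size)
    y2 = min(max(y1 + random.randint(-max_offset, max_offset), 0), h - crop_size)
    view_a = img.crop((x1, y1, x1 + crop_size, y1 + crop_size))
    view_b = img.crop((x2, y2, x2 + crop_size, y2 + crop_size))
    return view_a, view_b
```

Because the two views now overlap heavily, a poisoned image can no longer pair an object-only view with a trigger-only view, which is why the defense weakens DPBAs; the tighter cropping also reduces augmentation diversity, accounting for the slight utility loss noted above.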