We conduct a systematic study of backdoor vulnerabilities in normally trained deep learning models. These vulnerabilities are as dangerous as backdoors injected by data poisoning, because both can be exploited equally. Using 20 types of injected backdoor attacks from the literature as guidance, we study their counterparts in normally trained models, which we call natural backdoor vulnerabilities. We find that natural backdoors are widespread: most injected backdoor attacks have natural counterparts. We categorize these natural backdoors and propose a general detection framework. It finds 315 natural backdoors in 56 normally trained models downloaded from the Internet, covering all of the categories, whereas existing scanners designed for injected backdoors detect at most 65. We also study the root causes of natural backdoors and defenses against them.