Malware classifiers are subject to training-time exploitation because they must be retrained regularly on samples collected from the wild. Recent work has demonstrated the feasibility of backdoor attacks against malware classifiers, yet the stealthiness of such attacks is not well understood. In this paper, we investigate this phenomenon under the clean-label setting (i.e., attackers do not have complete control over the training or labeling process). Empirically, we show that existing backdoor attacks on malware classifiers are still detectable by recent defenses such as MNTD. To improve stealthiness, we propose a new attack, Jigsaw Puzzle (JP), based on the key observation that malware authors have little to no incentive to protect any malware other than their own. As such, Jigsaw Puzzle learns a trigger that complements the latent patterns of the malware author's samples, and activates the backdoor only when the trigger and the latent pattern are pieced together in a sample. We further focus on realizable triggers in the problem space (e.g., software code) using bytecode gadgets broadly harvested from benign software. Our evaluation confirms that Jigsaw Puzzle is effective as a backdoor, remains stealthy against state-of-the-art defenses, and is a threat in realistic settings that go beyond reasoning about feature-space-only attacks. We conclude by exploring promising approaches to improve backdoor defenses.