As collaborative learning and the outsourcing of data collection become more common, malicious actors (or agents) that attempt to manipulate the learning process face an additional obstacle: they compete with each other. In backdoor attacks, where an adversary attempts to poison a model by introducing malicious samples into the training data, each adversary must consider that the presence of other backdoor attackers may hamper the success of their own backdoor. In this paper, we investigate the scenario of a multi-agent backdoor attack, where multiple non-colluding attackers craft and insert triggered samples into a shared dataset that is used by a model (a defender) to learn a task. We discover a clear backfiring phenomenon: increasing the number of attackers shrinks each attacker's attack success rate (ASR). We then exploit this phenomenon to minimize the collective ASR of attackers and maximize the defender's robustness accuracy by (i) artificially augmenting the number of attackers, and (ii) indexing to remove the attacker's sub-dataset from the model for inference, thus proposing two defenses.
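A minimal, self-contained sketch (not the paper's code) of the multi-agent threat model and the ASR metric follows. It assumes synthetic class-conditional Gaussian data, additive trigger patterns, and a softmax-regression defender model; all names, shapes, and hyperparameters here are illustrative choices, not the paper's experimental setup.

```python
# Sketch: K non-colluding attackers each poison a shared dataset with
# their own trigger and target label; the defender trains on the union;
# we then measure each attacker's attack success rate (ASR).
import numpy as np

rng = np.random.default_rng(0)
NUM_CLASSES, DIM, N_CLEAN = 4, 32, 2000
MEANS = rng.normal(0.0, 3.0, (NUM_CLASSES, DIM))  # fixed class centroids

def make_clean(n):
    """Synthetic clean data: class-conditional Gaussian blobs."""
    y = rng.integers(0, NUM_CLASSES, n)
    x = MEANS[y] + rng.normal(0.0, 1.0, (n, DIM))
    return x, y

def apply_trigger(x, trigger):
    """Stamp an attacker's additive trigger pattern onto samples."""
    return x + trigger

def train_softmax(x, y, epochs=300, lr=0.1):
    """Multinomial logistic regression via full-batch gradient descent."""
    w = np.zeros((DIM, NUM_CLASSES))
    onehot = np.eye(NUM_CLASSES)[y]
    for _ in range(epochs):
        logits = x @ w
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        w -= lr * x.T @ (p - onehot) / len(x)
    return w

def mean_asr(num_attackers, poison_per_attacker=200):
    """Poison a shared dataset with independent backdoors, train the
    defender's model, and return the attackers' mean ASR."""
    x, y = make_clean(N_CLEAN)
    triggers, targets = [], []
    for _ in range(num_attackers):
        trig = rng.normal(0.0, 2.0, DIM)         # attacker-specific trigger
        tgt = int(rng.integers(0, NUM_CLASSES))  # attacker-specific target label
        x_poison = apply_trigger(make_clean(poison_per_attacker)[0], trig)
        x = np.vstack([x, x_poison])
        y = np.append(y, [tgt] * poison_per_attacker)
        triggers.append(trig)
        targets.append(tgt)
    w = train_softmax(x, y)
    x_test, _ = make_clean(500)
    # ASR: fraction of triggered test inputs classified as the target label.
    return float(np.mean([
        np.mean((apply_trigger(x_test, t) @ w).argmax(axis=1) == tgt)
        for t, tgt in zip(triggers, targets)
    ]))

for k in (1, 2, 4, 8):
    print(f"{k} attacker(s): mean ASR = {mean_asr(k):.2f}")
```

This sketch only makes the threat model and the ASR metric concrete; whether the backfiring trend emerges in such a toy model depends on the data, triggers, and model, and is established empirically in the paper.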