Non-parametric two-sample tests (TSTs), which judge whether two sets of samples are drawn from the same distribution, have been widely used in the analysis of critical data. People tend to employ TSTs as trusted basic tools and rarely doubt their reliability. This paper systematically uncovers the failure mode of non-parametric TSTs through adversarial attacks and then proposes corresponding defense strategies. First, we theoretically show that an adversary can upper-bound the distributional shift, which guarantees the attack's invisibility. Furthermore, we theoretically find that the adversary can also degrade the lower bound of a TST's test power, which enables us to iteratively minimize the test criterion in order to search for adversarial pairs. To enable TST-agnostic attacks, we propose an ensemble attack (EA) framework that jointly minimizes the different types of test criteria. Second, to robustify TSTs, we propose a max-min optimization that iteratively generates adversarial pairs to train the deep kernels. Extensive experiments on both simulated and real-world datasets validate the adversarial vulnerabilities of non-parametric TSTs and the effectiveness of our proposed defense. Source code is available at https://github.com/GodXuxilie/Robust-TST.git.
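To make the attack idea concrete, the following is a minimal sketch of iteratively minimizing a test criterion under a bounded perturbation. It is not the authors' implementation (see the linked repository for that). It assumes a single fixed-bandwidth Gaussian-kernel MMD statistic as the test criterion, an l-infinity budget `eps` as the bound on the shift, and signed-gradient steps; the function names (`mmd2`, `mmd2_grad_q`, `attack`) and all parameter values are hypothetical choices made for illustration only.

```python
import numpy as np


def gaussian_kernel(a, b, sigma):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2 * sigma**2))


def mmd2(p, q, sigma=1.0):
    """Biased estimate of squared MMD (the assumed test criterion)."""
    return (gaussian_kernel(p, p, sigma).mean()
            + gaussian_kernel(q, q, sigma).mean()
            - 2 * gaussian_kernel(p, q, sigma).mean())


def mmd2_grad_q(p, q, sigma=1.0):
    """Analytic gradient of mmd2 with respect to each row of q."""
    n, m = len(p), len(q)
    k_qq = gaussian_kernel(q, q, sigma)  # shape (m, m)
    k_pq = gaussian_kernel(p, q, sigma)  # shape (n, m)
    # Row j holds sum_l k(q_j, q_l) * (q_l - q_j), scaled.
    term_qq = (k_qq @ q - k_qq.sum(1, keepdims=True) * q) * (2 / (m * m * sigma**2))
    # Row j holds sum_i k(p_i, q_j) * (p_i - q_j), scaled.
    term_pq = (k_pq.T @ p - k_pq.sum(0)[:, None] * q) * (2 / (n * m * sigma**2))
    return term_qq - term_pq


def attack(p, q, eps=0.1, step=0.02, n_steps=20, sigma=1.0):
    """Signed-gradient descent on mmd2, projected onto an l-inf ball around q.

    Returns the iterate with the lowest criterion value seen so far.
    """
    q_adv = q.copy()
    best_q, best_val = q.copy(), mmd2(p, q, sigma)
    for _ in range(n_steps):
        q_adv = q_adv - step * np.sign(mmd2_grad_q(p, q_adv, sigma))
        q_adv = np.clip(q_adv, q - eps, q + eps)  # keep the perturbation bounded
        val = mmd2(p, q_adv, sigma)
        if val < best_val:
            best_q, best_val = q_adv.copy(), val
    return best_q


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p = rng.normal(0.0, 1.0, size=(100, 2))
    q = rng.normal(1.0, 1.0, size=(100, 2))  # genuinely different distribution
    q_adv = attack(p, q)
    print("MMD^2 before:", mmd2(p, q))
    print("MMD^2 after: ", mmd2(p, q_adv))
```

A lower statistic makes the test less likely to reject the null hypothesis even though the two samples come from different distributions. The ensemble attack described in the abstract would presumably replace the single `mmd2` objective with a combination of several test criteria, and the defense would train the kernel against such perturbed pairs; neither is shown here.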