It has been recognized that the data generated by the denoising diffusion probabilistic model (DDPM) improves adversarial training. After two years of rapid development in diffusion models, a question naturally arises: can better diffusion models further improve adversarial training? This paper gives an affirmative answer by employing the most recent diffusion model, which has higher sampling efficiency ($\sim 20$ sampling steps) and better image quality (lower FID score) than DDPM. Our adversarially trained models achieve state-of-the-art performance on RobustBench using only generated data (no external datasets). Under the $\ell_\infty$-norm threat model with $\epsilon=8/255$, our models achieve $70.69\%$ and $42.67\%$ robust accuracy on CIFAR-10 and CIFAR-100, respectively, i.e., improving upon the previous state-of-the-art models by $+4.58\%$ and $+8.03\%$. Under the $\ell_2$-norm threat model with $\epsilon=128/255$, our models achieve $84.86\%$ robust accuracy on CIFAR-10 ($+4.44\%$). These results also beat previous works that use external data. Our code is available at https://github.com/wzekai99/DM-Improves-AT.
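To make the training setup concrete, below is a minimal sketch of adversarial training under the $\ell_\infty$-norm threat model with $\epsilon=8/255$, where each batch mixes real CIFAR images with diffusion-generated samples. This is a generic PGD-based illustration, not the paper's full recipe; the helper names (`pgd_linf`, `train_step`), the mixing ratio `gen_ratio`, and the PGD hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def pgd_linf(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Craft L_inf-bounded adversarial examples with standard PGD.

    Hyperparameters (step size, number of steps) are illustrative, not
    the paper's tuned settings.
    """
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        # Ascend the loss, then project back into the eps-ball.
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    return (x + delta).clamp(0, 1).detach()

def train_step(model, optimizer, real_batch, generated_batch, gen_ratio=0.7):
    """One adversarial-training step on a mix of real and generated data.

    `generated_batch` is assumed to hold (image, pseudo-label) pairs drawn
    from a pre-generated diffusion dataset; `gen_ratio` is a hypothetical
    mixing fraction, not the ratio used in the paper.
    """
    xr, yr = real_batch
    xg, yg = generated_batch
    n_gen = int(gen_ratio * len(xr))
    x = torch.cat([xr[: len(xr) - n_gen], xg[:n_gen]])
    y = torch.cat([yr[: len(xr) - n_gen], yg[:n_gen]])

    model.eval()
    x_adv = pgd_linf(model, x, y)   # inner maximization
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)  # outer minimization
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this sketch the generated images simply augment the real training set; the paper's actual training objective and real-to-generated mixing schedule may differ.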