This work presents strategies to learn an Energy-Based Model (EBM) according to the desired length of its MCMC sampling trajectories. MCMC trajectories of different lengths correspond to models with different purposes. Our experiments cover three different trajectory magnitudes and learning outcomes: 1) shortrun sampling for image generation; 2) midrun sampling for classifier-agnostic adversarial defense; and 3) longrun sampling for principled modeling of image probability densities. To achieve these outcomes, we introduce three novel methods of MCMC initialization for negative samples used in Maximum Likelihood (ML) learning. With standard network architectures and an unaltered ML objective, our MCMC initialization methods alone enable significant performance gains across the three applications that we investigate. Our results include state-of-the-art FID scores for unnormalized image densities on the CIFAR-10 and ImageNet datasets; state-of-the-art adversarial defense on CIFAR-10 among purification methods and the first EBM defense on ImageNet; and scalable techniques for learning valid probability densities. Code for this project can be found at https://github.com/point0bar1/ebm-life-cycle.
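To make the role of trajectory length concrete, the following is a minimal sketch of Langevin MCMC sampling in which the number of steps is an explicit parameter. It is not the authors' implementation. The quadratic toy energy, the uniform-noise initialization, the step size, and the names `energy_grad` and `langevin_sample` are all assumptions chosen for illustration; the three initialization methods introduced in this work are not reproduced here.

```python
import numpy as np

def energy_grad(x, mu=2.0):
    """Gradient of the toy energy U(x) = 0.5 * (x - mu)^2.

    Hypothetical stand-in for the gradient of a learned EBM.
    """
    return x - mu

def langevin_sample(x_init, n_steps, step_size=0.1, rng=None):
    """Run n_steps of unadjusted Langevin dynamics starting from x_init."""
    rng = np.random.default_rng(0) if rng is None else rng
    x = np.array(x_init, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        # Gradient step on the energy plus Gaussian noise.
        x = x - 0.5 * step_size * energy_grad(x) + np.sqrt(step_size) * noise
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Assumed initialization: uniform noise, one independent chain per entry.
    x0 = rng.uniform(-1.0, 1.0, size=5000)
    # Illustrative short, medium, and long trajectory lengths.
    for n_steps in (10, 100, 1000):
        samples = langevin_sample(x0, n_steps, rng=rng)
        print(n_steps, samples.mean(), samples.std())
```

In this toy setting, short trajectories remain close to their initialization, while long trajectories approach the stationary distribution of the chain. This is why the choice of initial state matters more as the trajectory gets shorter.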