We propose several different techniques to improve contrastive divergence training of energy-based models (EBMs). We first show that a gradient term neglected in the popular contrastive divergence formulation is both tractable to estimate and important for avoiding the training instabilities seen in previous models. We further highlight how data augmentation, multi-scale processing, and reservoir sampling can be used to improve model robustness and generation quality. Finally, we empirically evaluate the stability of model architectures and show improved performance on a host of benchmarks and use cases, such as image generation, OOD detection, and compositional generation.
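To make the claim about the neglected gradient term concrete, the following is a sketch reconstructed from the standard contrastive divergence derivation (the notation is ours, not quoted from the abstract). With energy function $E_\theta$, model distribution $p_\theta(x) \propto e^{-E_\theta(x)}$, data distribution $p_D$, and $q_\theta$ the distribution of samples after a finite number of MCMC steps, the contrastive divergence objective $\mathcal{L}_{\mathrm{CD}} = \mathrm{KL}(p_D \,\|\, p_\theta) - \mathrm{KL}(q_\theta \,\|\, p_\theta)$ has the full gradient

$$
\frac{\partial \mathcal{L}_{\mathrm{CD}}}{\partial \theta}
= \mathbb{E}_{p_D(x)}\!\left[\frac{\partial E_\theta(x)}{\partial \theta}\right]
- \mathbb{E}_{q_\theta(x')}\!\left[\frac{\partial E_\theta(x')}{\partial \theta}\right]
+ \frac{\partial q_\theta(x')}{\partial \theta}\,
  \frac{\partial \mathrm{KL}\!\left(q_\theta(x') \,\|\, p_\theta(x')\right)}{\partial q_\theta(x')}.
$$

The first two expectations are the familiar positive and negative phases of contrastive divergence; the third term, which propagates gradients through the sampling distribution $q_\theta$ itself, is the one that is typically dropped and that this work argues is both tractable to estimate and important for stability.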
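As a hedged illustration of how reservoir sampling could maintain a buffer of past negative samples used to seed MCMC chains (the class, its capacity, and the training-loop integration below are our assumptions, not details given in the abstract):

```python
import random

class NegativeSampleReservoir:
    """Fixed-size reservoir (Vitter's Algorithm R) holding a uniform
    sample of all negative (MCMC) samples seen so far in training."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []
        self.seen = 0  # total number of samples offered to the reservoir

    def add(self, item):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)
        else:
            # Replace a stored item with probability capacity / seen,
            # which keeps the buffer a uniform sample of the whole stream.
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = item

    def sample(self, k):
        # Draw up to k chain initializations for the next round of MCMC.
        return random.sample(self.items, min(k, len(self.items)))
```

In such a scheme, each training step would offer the finished Langevin samples to the buffer via `add` and initialize the next step's chains from `sample` (optionally mixed with fresh noise); Algorithm R keeps the buffer a uniform sample over the entire history of negatives rather than only the most recent ones.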