Energy-based models (EBMs) have experienced a resurgence within machine learning in recent years, including as a promising alternative for probabilistic regression. However, energy-based regression requires a proposal distribution to be manually designed for training, and an initial estimate has to be provided at test-time. We address both of these issues by introducing a conceptually simple method to automatically learn an effective proposal distribution, which is parameterized by a separate network head. To this end, we derive a surprising result, leading to a unified training objective that jointly minimizes the KL divergence from the proposal to the EBM, and the negative log-likelihood of the EBM. At test-time, we can then employ importance sampling with the trained proposal to efficiently evaluate the learned EBM and produce stand-alone predictions. Furthermore, we utilize our derived training objective to learn mixture density networks (MDNs) with a jointly trained energy-based teacher, consistently outperforming conventional MDN training on four real-world regression tasks within computer vision. Code is available at https://github.com/fregu856/ebms_proposals.
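The test-time evaluation step can be illustrated with a minimal sketch. It assumes an EBM of the form p(y | x) = exp(f(x, y)) / Z(x), a one-dimensional target, and a fixed input x. The energy function `toy_energy`, the two-component Gaussian mixture proposal, and all numeric values are hypothetical stand-ins chosen for illustration; they are not taken from the released code. The sketch estimates log Z(x) by importance sampling, log Z ≈ log((1/M) Σ exp(f(y_m)) / q(y_m)) with y_m drawn from q, and uses that estimate to approximate the negative log-likelihood of a given target.

```python
import numpy as np


def log_gmm_pdf(y, weights, means, stds):
    """Log-density of a 1D Gaussian mixture, evaluated at each entry of y."""
    y = np.asarray(y, dtype=float)[:, None]  # shape (M, 1)
    log_comp = (
        np.log(weights)
        - 0.5 * np.log(2.0 * np.pi)
        - np.log(stds)
        - 0.5 * ((y - means) / stds) ** 2
    )  # shape (M, K)
    # Numerically stable log-sum-exp over the K components.
    m = log_comp.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(log_comp - m).sum(axis=1, keepdims=True)))[:, 0]


def sample_gmm(rng, n, weights, means, stds):
    """Draw n samples from the mixture: pick a component, then sample from it."""
    comp = rng.choice(len(weights), size=n, p=weights)
    return rng.normal(means[comp], stds[comp])


def toy_energy(y):
    """Hypothetical stand-in for f(x, y) at a fixed x (assumption, not the paper's network).

    It is an unnormalized Gaussian log-density, so the true partition
    function is known: Z = 0.5 * sqrt(2 * pi).
    """
    return -0.5 * ((y - 1.0) / 0.5) ** 2


def log_partition_estimate(rng, energy, weights, means, stds, n_samples):
    """Importance-sampling estimate of log Z using the mixture as proposal q."""
    y = sample_gmm(rng, n_samples, weights, means, stds)
    log_w = energy(y) - log_gmm_pdf(y, weights, means, stds)  # log importance weights
    m = log_w.max()
    return m + np.log(np.mean(np.exp(log_w - m)))


def nll_estimate(rng, energy, y_true, weights, means, stds, n_samples):
    """Approximate -log p(y_true | x) = log Z(x) - f(x, y_true)."""
    log_z = log_partition_estimate(rng, energy, weights, means, stds, n_samples)
    return log_z - energy(y_true)


# Illustrative proposal parameters (assumed values, not learned here).
proposal_weights = np.array([0.5, 0.5])
proposal_means = np.array([0.5, 1.5])
proposal_stds = np.array([0.6, 0.6])

rng = np.random.default_rng(0)
log_z_hat = log_partition_estimate(
    rng, toy_energy, proposal_weights, proposal_means, proposal_stds, 20000
)
print("estimated log Z:", log_z_hat)
print("analytic log Z: ", np.log(0.5 * np.sqrt(2.0 * np.pi)))
```

In this sketch the proposal is fixed. In the setting the abstract describes, the mixture parameters would instead be produced by a separate network head and trained jointly with the energy function, so that the proposal places its samples where the energy is high and the estimate has low variance.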