Energy-based models (EBMs) are a simple yet powerful framework for generative modeling. They are based on a trainable energy function that defines an associated Gibbs measure, and they can be trained and sampled from using well-established statistical tools such as MCMC. Neural networks may be used as energy function approximators, providing both a rich class of expressive models and a flexible device for incorporating data structure. In this work we focus on shallow neural networks. Building on the incipient theory of overparametrized neural networks, we show that models trained in the so-called "active" regime provide a statistical advantage over their associated "lazy" or kernel regime, leading to improved adaptivity to hidden low-dimensional structure in the data distribution, as already observed in supervised learning. Our study covers both maximum likelihood and Stein discrepancy estimators, and we validate our theoretical results with numerical experiments on synthetic data.
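To make the setup concrete, the following is a minimal sketch, not the construction used in this work, of a shallow ReLU network used as an energy function E(x), whose Gibbs measure has density proportional to exp(-E(x)), together with an unadjusted Langevin sampler as one simple MCMC-style scheme. The function names, the width `m`, the step size, and the quadratic confinement term `lam / 2 * |x|^2` (added here only so that the density is normalizable on all of R^d) are illustrative assumptions.

```python
import numpy as np

def init_params(d, m, rng):
    """Random parameters for a width-m shallow ReLU network on R^d (hypothetical init)."""
    W = rng.normal(size=(m, d)) / np.sqrt(d)  # hidden-layer weights
    b = np.zeros(m)                           # hidden-layer biases
    a = rng.normal(size=m)                    # output weights
    return W, b, a

def energy(x, params, lam=1.0):
    """E(x) = (1/m) * sum_j a_j * relu(w_j . x + b_j) + (lam/2) * |x|^2.

    The quadratic term is an assumption of this sketch: it confines the
    Gibbs density exp(-E(x)) so that it can be normalized on R^d.
    """
    W, b, a = params
    hidden = np.maximum(W @ x + b, 0.0)
    return a @ hidden / len(a) + 0.5 * lam * (x @ x)

def grad_energy(x, params, lam=1.0):
    """Gradient of E with respect to x (valid away from ReLU kinks)."""
    W, b, a = params
    active = (W @ x + b > 0).astype(float)    # derivative of relu
    return (a * active) @ W / len(a) + lam * x

def langevin_sample(params, x0, step=1e-2, n_steps=1000, rng=None):
    """Unadjusted Langevin dynamics targeting the density proportional to exp(-E)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.normal(size=x.shape)
        # Gradient step on the energy plus Gaussian noise of variance 2 * step.
        x = x - step * grad_energy(x, params) + np.sqrt(2.0 * step) * noise
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = init_params(d=2, m=64, rng=rng)
    sample = langevin_sample(params, x0=np.zeros(2), rng=rng)
    print(sample)
```

Because the sampler is unadjusted, it only approximates the Gibbs measure for a finite step size; a Metropolis correction would remove that bias at extra cost.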