Training Restricted Boltzmann Machines (RBMs) has long been challenging due to the difficulty of precisely computing the log-likelihood gradient. Over the past decades, many works have proposed more or less successful training recipes, but without studying the crucial quantity of the problem: the mixing time, i.e., the number of Monte Carlo iterations needed to sample new configurations from the model. In this work, we show that this mixing time plays a crucial role in the dynamics and stability of the trained model, and that RBMs operate in two well-defined regimes, namely equilibrium and out-of-equilibrium, depending on the interplay between the mixing time of the model and the number of steps, $k$, used to approximate the gradient. We further show empirically that this mixing time increases during learning, which often implies a transition from one regime to the other as soon as $k$ becomes smaller than this time. In particular, we show that with the popular $k$-step (persistent) contrastive divergence approaches and small $k$, the dynamics of the learned model are extremely slow and often dominated by strong out-of-equilibrium effects. On the contrary, RBMs trained in equilibrium display faster dynamics and a smooth convergence to dataset-like configurations during sampling. We then discuss how to exploit both regimes in practice, depending on the task one aims to fulfill: (i) a small $k$ can be used to generate convincing samples in short learning times, while (ii) a large (or increasingly large) $k$ is needed to learn the correct equilibrium distribution of the RBM. Finally, the existence of these two operational regimes appears to be a general property of energy-based models trained via likelihood maximization.
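The $k$-step persistent contrastive divergence (PCD-$k$) scheme discussed above approximates the negative phase of the log-likelihood gradient by running $k$ Gibbs sampling steps from a set of persistent Markov chains, rather than restarting from the data. The following is a minimal NumPy sketch of this idea for a binary RBM; all names (`RBM`, `pcd_step`, etc.) are hypothetical, and this is an illustrative toy implementation, not the authors' code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Toy binary RBM trained with persistent contrastive divergence (PCD-k)."""

    def __init__(self, n_vis, n_hid, lr=0.05, k=1, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.standard_normal((n_vis, n_hid))
        self.b = np.zeros(n_vis)   # visible biases
        self.c = np.zeros(n_hid)   # hidden biases
        self.lr, self.k = lr, k
        self.chains = None         # persistent "fantasy" chains

    def sample_h(self, v):
        # p(h=1 | v) and a Bernoulli sample of the hidden layer
        p = sigmoid(v @ self.W + self.c)
        return p, (self.rng.random(p.shape) < p).astype(float)

    def sample_v(self, h):
        # p(v=1 | h) and a Bernoulli sample of the visible layer
        p = sigmoid(h @ self.W.T + self.b)
        return p, (self.rng.random(p.shape) < p).astype(float)

    def pcd_step(self, v_data):
        # Positive phase: hidden statistics with visibles clamped to the data.
        ph_data, _ = self.sample_h(v_data)

        # Negative phase: k Gibbs steps continuing from the persistent chains.
        # If k is smaller than the model's mixing time, these samples are
        # out of equilibrium, which is the regime analyzed in the abstract.
        if self.chains is None:
            self.chains = v_data.copy()
        v = self.chains
        for _ in range(self.k):
            _, h = self.sample_h(v)
            _, v = self.sample_v(h)
        self.chains = v
        ph_model, _ = self.sample_h(v)

        # Stochastic gradient ascent on the log-likelihood estimate.
        n = v_data.shape[0]
        self.W += self.lr * (v_data.T @ ph_data - v.T @ ph_model) / n
        self.b += self.lr * (v_data - v).mean(axis=0)
        self.c += self.lr * (ph_data - ph_model).mean(axis=0)
```

Plain CD-$k$ would instead reinitialize `v` from `v_data` at every gradient step; PCD-$k$ keeps the chains alive across updates, which helps only as long as $k$ updates of the parameters do not outpace the chains' mixing.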