Restricted Boltzmann Machines (RBMs) are simple and powerful generative models that can encode any complex dataset. Despite these advantages, in practice training is often unstable, and it is difficult to assess the quality of a trained model because the dynamics are affected by extremely slow time dependencies. This situation becomes critical when dealing with low-dimensional clustered datasets, where the time required to sample the trained models ergodically becomes computationally prohibitive. In this work, we show that this divergence of Monte Carlo mixing times is related to a phenomenon of phase coexistence, similar to that which occurs in physics near a first-order phase transition. We show that sampling the equilibrium distribution with Markov chain Monte Carlo can be dramatically accelerated by using biased sampling techniques, in particular the Tethered Monte Carlo (TMC) method. This sampling technique efficiently solves the problem of evaluating the quality of a given trained model and generating new samples in a reasonable amount of time. Moreover, we show that this sampling technique can also be used to improve the computation of the log-likelihood gradient during training, leading to dramatic improvements when training RBMs on artificial clustered datasets. On real low-dimensional datasets, this new training method fits RBM models with significantly faster relaxation dynamics than those obtained with standard persistent contrastive divergence (PCD) recipes. We also show that TMC sampling can be used to recover the free-energy profile of the RBM. This proves to be extremely useful for computing the probability distribution of a given model and for improving the generation of new decorrelated samples in slow PCD-trained models.
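For orientation, the sampler whose mixing time the abstract describes as diverging can be sketched as follows. This is a minimal illustration of standard alternating (block) Gibbs sampling for a binary RBM, not the Tethered Monte Carlo method, and it assumes binary {0, 1} units. The names `gibbs_step`, `run_chains`, `W`, `b_vis` and `c_hid` are hypothetical and do not come from the paper.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gibbs_step(v, W, b_vis, c_hid, rng):
    """One alternating Gibbs sweep for a binary RBM (illustrative names).

    Assumes the energy E(v, h) = -b_vis.v - c_hid.h - v^T W h,
    with W of shape (n_visible, n_hidden).
    """
    # Sample all hidden units in parallel given the visible units.
    p_h = sigmoid(v @ W + c_hid)
    h = (rng.random(p_h.shape) < p_h).astype(np.float64)
    # Sample all visible units in parallel given the hidden units.
    p_v = sigmoid(h @ W.T + b_vis)
    v_new = (rng.random(p_v.shape) < p_v).astype(np.float64)
    return v_new, h


def run_chains(W, b_vis, c_hid, n_chains=8, n_steps=100, seed=0):
    """Run independent chains from random binary initial states."""
    rng = np.random.default_rng(seed)
    n_vis = W.shape[0]
    v = (rng.random((n_chains, n_vis)) < 0.5).astype(np.float64)
    for _ in range(n_steps):
        v, _ = gibbs_step(v, W, b_vis, c_hid, rng)
    return v


if __name__ == "__main__":
    # Toy parameters chosen only to exercise the code.
    rng = np.random.default_rng(1)
    W = rng.normal(scale=0.1, size=(6, 3))
    samples = run_chains(W, np.zeros(6), np.zeros(3), n_chains=4, n_steps=50)
    print(samples.shape)
```

When the model distribution has several well-separated clusters, chains of this kind tend to stay in the cluster where they start, which is the slow-mixing regime that the abstract attributes to phase coexistence.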