Restricted Boltzmann Machines (RBMs) are powerful tools for modeling complex systems and extracting insights from data, but their training is hindered by the slow mixing of Markov chain Monte Carlo (MCMC) processes, especially on highly structured datasets. In this study, we build on recent theoretical advances in RBM training and focus on the stepwise encoding of data patterns into the singular vectors of the coupling matrix. Exploiting this mechanism significantly reduces the cost of generating new samples and evaluating model quality, as well as the cost of training on highly clustered datasets. The learning process is analogous to the continuous thermodynamic phase transitions observed in ferromagnetic models, in which new modes of the probability measure emerge continuously. We leverage these continuous transitions to define a smooth annealing trajectory that enables reliable and computationally efficient log-likelihood estimates. This approach enables online assessment during training and introduces a novel sampling strategy, Parallel Trajectory Tempering (PTT), that outperforms previously optimized MCMC methods. To mitigate critical slowing down in the early stages of training, we propose a pre-training phase in which the principal components are encoded into a low-rank RBM through a convex optimization process, facilitating efficient static Monte Carlo sampling and accurate computation of the partition function. Our results demonstrate that this pre-training strategy allows RBMs to efficiently handle highly structured datasets where conventional methods fail. Additionally, our log-likelihood estimation outperforms computationally intensive approaches in controlled scenarios, while the PTT algorithm significantly accelerates MCMC processes compared to conventional methods.
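The idea of exchanging configurations between models saved at different stages of training can be illustrated with a replica-exchange style swap move. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes binary {0, 1} units, that each model is a hypothetical tuple `(W, a, b)` of couplings and visible/hidden biases saved at a training checkpoint, and all function names (`energy`, `gibbs_step`, `swap_log_ratio`, `ptt_sweep`) are chosen here for illustration only.

```python
import numpy as np

def energy(v, h, W, a, b):
    """Binary RBM energy E(v, h) = -a.v - b.h - v^T W h, one value per chain."""
    return -(v @ a) - (h @ b) - np.einsum("ni,ij,nj->n", v, W, h)

def gibbs_step(v, W, a, b, rng):
    """One block-Gibbs sweep: sample h given v, then v given h."""
    p_h = 1.0 / (1.0 + np.exp(-(b + v @ W)))
    h = (rng.random(p_h.shape) < p_h).astype(float)
    p_v = 1.0 / (1.0 + np.exp(-(a + h @ W.T)))
    v = (rng.random(p_v.shape) < p_v).astype(float)
    return v, h

def swap_log_ratio(state_lo, state_hi, model_lo, model_hi):
    """Log Metropolis ratio for exchanging the states of two models.

    The partition functions cancel, so only energy differences are needed.
    """
    before = energy(*state_lo, *model_lo) + energy(*state_hi, *model_hi)
    after = energy(*state_hi, *model_lo) + energy(*state_lo, *model_hi)
    return before - after

def ptt_sweep(models, states, rng):
    """One Gibbs update per model, then swap attempts between neighbours."""
    # Local moves: each chain evolves under its own checkpoint.
    states = [gibbs_step(v, *m, rng) for (v, _), m in zip(states, models)]
    # Exchange moves between adjacent checkpoints (assumed ordering: by
    # training time, earliest first).
    for t in range(len(models) - 1):
        log_r = swap_log_ratio(states[t], states[t + 1], models[t], models[t + 1])
        accept = (np.log(rng.random(log_r.shape)) < log_r)[:, None]
        (v0, h0), (v1, h1) = states[t], states[t + 1]
        states[t] = (np.where(accept, v1, v0), np.where(accept, h1, h0))
        states[t + 1] = (np.where(accept, v0, v1), np.where(accept, h0, h1))
    return states
```

In this sketch, swaps are accepted per chain with probability min(1, exp(log_r)), which is the standard exchange rule between two Boltzmann distributions with different energy functions. How the checkpoints are selected along the training trajectory is left open here.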