Helmholtz Machines (HMs) are a class of generative models composed of two Sigmoid Belief Networks (SBNs), acting respectively as an encoder and a decoder. These models are commonly trained using a two-step optimization algorithm called Wake-Sleep (WS) and, more recently, by improved versions such as Reweighted Wake-Sleep (RWS) and Bidirectional Helmholtz Machines (BiHM). The locality of the connections in an SBN induces sparsity in the Fisher information matrices associated with the probabilistic models, in the form of a fine-grained block-diagonal structure. In this paper we exploit this property to efficiently train SBNs and HMs using the natural gradient. We present a novel algorithm, called Natural Reweighted Wake-Sleep (NRWS), that corresponds to the geometric adaptation of standard RWS. In a similar manner, we also introduce the Natural Bidirectional Helmholtz Machine (NBiHM). Unlike previous work, we show that for HMs the natural gradient can be computed efficiently without introducing any approximation in the structure of the Fisher information matrix. Experiments on standard datasets from the literature show a consistent improvement of NRWS and NBiHM, not only with respect to their non-geometric baselines but also with respect to state-of-the-art training algorithms for HMs. The improvement is quantified in terms of both speed of convergence and the value of the log-likelihood reached after training.
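To make the block-diagonal structure concrete, the following is a minimal NumPy sketch of a natural-gradient direction for a single sigmoid layer. It is an illustration only, not the NRWS or NBiHM algorithm. It assumes that each unit's weight vector forms one Fisher block, estimated from samples of the parent layer as the mean of sigma(a)(1 - sigma(a)) x x^T. The function names, the omission of bias terms, and the `damping` term (a numerical safeguard added here) are all assumptions of this sketch.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def fisher_blocks(W, X):
    """Estimate one Fisher block per output unit of a sigmoid layer.

    W: (n_out, n_in) weights; X: (n_samples, n_in) samples of the parent layer.
    Returns an array of shape (n_out, n_in, n_in).
    """
    P = sigmoid(X @ W.T)          # (n_samples, n_out) activation probabilities
    V = P * (1.0 - P)             # Bernoulli variance of each unit
    # Block i is the sample mean of V[n, i] * x_n x_n^T.
    return np.einsum("ni,nj,nk->ijk", V, X, X) / X.shape[0]

def natural_gradient(W, X, G, damping=1e-3):
    """Solve F_i d_i = g_i independently for every unit i.

    G: (n_out, n_in) ordinary gradient with respect to W.
    damping: assumed regularizer added to each block's diagonal.
    """
    F = fisher_blocks(W, X) + damping * np.eye(W.shape[1])
    # Batched solve: one small linear system per block.
    return np.linalg.solve(F, G[..., None])[..., 0]
```

Solving per block costs on the order of n_out * n_in^3 operations, instead of (n_out * n_in)^3 for a dense Fisher matrix of the same layer, which is the kind of saving the block-diagonal structure allows.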