Traditional methods for unsupervised learning of finite mixture models require evaluating the likelihood of every component of the mixture. This becomes computationally prohibitive when the number of components is large, as it is, for example, in sum-product (transform) networks. We therefore propose combining expectation maximization with the Metropolis-Hastings algorithm so that only a small number of stochastically sampled components are evaluated, which substantially reduces the computational cost. The Markov chain of component assignments is generated sequentially across the algorithm's iterations and has a non-stationary target distribution whose parameters are updated by a gradient-descent scheme. We emphasize the generality of our method, which can train both shallow and deep mixture models involving complex, and possibly nonlinear, transformations. We illustrate its performance in a variety of synthetic and real-data settings, considering deep models such as mixtures of normalizing flows and sum-product (transform) networks.
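To make the idea concrete, the following is a minimal sketch of one way such a scheme could look on a toy problem. It is not the authors' implementation. It assumes a one-dimensional Gaussian mixture with equal weights and unit variances, a uniform (symmetric) proposal over components, and a plain gradient step on the means; the names `log_gauss` and `mh_em_fit` are hypothetical. Each iteration evaluates only two components per data point (the current and the proposed one) rather than all of them.

```python
import numpy as np


def log_gauss(x, mu):
    """Log-density of N(mu, 1) at x, element-wise (unit variance is an assumption)."""
    return -0.5 * (x - mu) ** 2 - 0.5 * np.log(2.0 * np.pi)


def mh_em_fit(x, n_components, n_iters=500, lr=0.5, seed=0):
    """Toy Metropolis-Hastings / EM hybrid for an equal-weight 1-D Gaussian mixture."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    # Initialise the means at randomly chosen data points.
    mu = rng.choice(x, size=n_components, replace=False)
    # One chain of component assignments per data point.
    z = rng.integers(n_components, size=n)

    for _ in range(n_iters):
        # Metropolis-Hastings step: propose a new component uniformly at random.
        z_prop = rng.integers(n_components, size=n)
        # With equal weights and a symmetric proposal, the acceptance ratio is
        # the likelihood ratio of the proposed and the current component.
        log_ratio = log_gauss(x, mu[z_prop]) - log_gauss(x, mu[z])
        accept = rng.random(n) < np.exp(np.minimum(log_ratio, 0.0))
        z = np.where(accept, z_prop, z)

        # Gradient step on the complete-data log-likelihood with respect to the
        # means; this moves the target distribution of the chain as well.
        grad = np.zeros(n_components)
        np.add.at(grad, z, x - mu[z])
        counts = np.bincount(z, minlength=n_components)
        mu = mu + lr * grad / np.maximum(counts, 1)

    return mu, z


if __name__ == "__main__":
    data_rng = np.random.default_rng(1)
    x = np.concatenate([data_rng.normal(-5.0, 1.0, 200),
                        data_rng.normal(5.0, 1.0, 200)])
    mu, z = mh_em_fit(x, n_components=2)
    print("estimated means:", np.sort(mu))
```

Because the assignments persist across iterations while the means change, the chain never targets a fixed distribution, which mirrors the non-stationary target described above. Mixture weights, variances, and deeper component models would need their own acceptance-ratio terms and gradient updates, which this sketch leaves out.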