Variational inference with Gaussian mixture models (GMMs) enables learning highly tractable yet multi-modal approximations of intractable target distributions. GMMs are particularly relevant for problem settings with up to a few hundred dimensions, for example in robotics, for modelling distributions over trajectories or joint distributions. This work focuses on two very effective methods for GMM-based variational inference that both employ independent natural gradient updates for the individual components and for the categorical distribution of the weights. We show, for the first time, that their derived updates are equivalent, although their practical implementations and theoretical guarantees differ. We identify several design choices that distinguish both approaches, namely with respect to sample selection, natural gradient estimation, stepsize adaptation, and whether trust regions are enforced or the number of components is adapted. We perform extensive ablations on these design choices and show that they strongly affect the efficiency of the optimization and the variability of the learned distribution. Based on our insights, we propose a novel instantiation of our generalized framework that combines first-order natural gradient estimates with trust regions and component adaptation, and significantly outperforms both previous methods in all our experiments.