Achieving the full promise of the Thermodynamic Variational Objective (TVO), a recently proposed variational lower bound on the log evidence involving a one-dimensional Riemann integral approximation, requires choosing a "schedule" of sorted discretization points. This paper introduces a bespoke Gaussian process bandit optimization method for automatically choosing these points. Our approach not only automates their one-time selection, but also dynamically adapts their positions over the course of optimization, leading to improved model learning and inference. We provide theoretical guarantees that our bandit optimization converges to the regret-minimizing choice of integration points. Empirical validation of our algorithm is provided in terms of improved learning and inference in Variational Autoencoders and Sigmoid Belief Networks.
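To make concrete why the placement of the sorted discretization points matters, here is a minimal Python sketch of a left Riemann sum over [0, 1]. It is an illustration under stated assumptions, not the paper's method: `left_riemann_sum` and `toy_integrand` are hypothetical names, the square-root integrand is a made-up increasing stand-in for the TVO integrand, and the two schedules are picked by hand rather than by any bandit procedure. For an increasing integrand the left Riemann sum never exceeds the true integral, so a better schedule gives a tighter lower bound.

```python
import math


def left_riemann_sum(integrand, schedule):
    """Left Riemann sum of `integrand` over [0, 1].

    `schedule` holds the sorted left endpoints; it must start at 0.0,
    be strictly increasing, and stay below 1.0.
    """
    points = list(schedule) + [1.0]  # close the last interval at 1
    if points[0] != 0.0:
        raise ValueError("schedule must start at 0.0")
    if any(b <= a for a, b in zip(points, points[1:])):
        raise ValueError("schedule must be strictly increasing and below 1.0")
    # Each interval [a, b] contributes its width times the integrand at a.
    return sum((b - a) * integrand(a) for a, b in zip(points, points[1:]))


def toy_integrand(beta):
    # Hypothetical increasing stand-in (assumption, not from the paper).
    # Its exact integral over [0, 1] is 2/3.
    return math.sqrt(beta)


# Two schedules with the same number of points (both hand-picked).
uniform_schedule = [0.0, 0.25, 0.5, 0.75]
tuned_schedule = [0.0, 0.1, 0.3, 0.6]

uniform_estimate = left_riemann_sum(toy_integrand, uniform_schedule)
tuned_estimate = left_riemann_sum(toy_integrand, tuned_schedule)

print("uniform:", uniform_estimate)
print("tuned:  ", tuned_estimate)
print("exact:  ", 2.0 / 3.0)
```

With the same budget of four points, the hand-tuned schedule lands closer to the exact value than the uniform one while both stay below it. The abstract's proposal is to automate this kind of choice and adapt it over the course of optimization.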