Amortized inference has enabled efficient approximate inference for large datasets. The quality of posterior inference is largely determined by two factors: (a) the capacity of the variational distribution to model the true posterior and (b) the ability of the recognition network to generalize inference across all datapoints. We analyze approximate inference in variational autoencoders in terms of these factors. We find that suboptimal inference is often due to amortizing inference rather than to the limited complexity of the approximating distribution. We show that this is due partly to the generator learning to accommodate the choice of approximation. Furthermore, we show that the parameters used to increase the expressiveness of the approximation play a role in generalizing inference rather than simply increasing the complexity of the approximation.
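To make the two factors concrete, the total inference suboptimality can be written as a sum of two gaps. The following is a minimal sketch in standard ELBO notation; the decomposition and the symbol $q^*$, denoting the optimal member of the variational family for a given $x$, are introduced here for illustration and are not stated above:

$$\mathcal{L}[q] \;=\; \mathbb{E}_{q(z \mid x)}\!\left[\log p(x, z) - \log q(z \mid x)\right]$$

$$\underbrace{\log p(x) - \mathcal{L}[q_\phi]}_{\text{inference gap}} \;=\; \underbrace{\log p(x) - \mathcal{L}[q^*]}_{\text{approximation gap}} \;+\; \underbrace{\mathcal{L}[q^*] - \mathcal{L}[q_\phi]}_{\text{amortization gap}}$$

Here $q_\phi(z \mid x)$ is the amortized distribution produced by the recognition network. The first term measures the limitation of the variational family itself (factor a), while the second measures the additional error incurred by amortizing inference across datapoints rather than optimizing the variational parameters separately for each one (factor b).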