Subsampling is difficult to combine with variational inference in hierarchical models because the number of local latent variables grows with the size of the dataset. As a result, inference in hierarchical models remains a challenge at large scale. A variational family whose structure matches the posterior helps, but optimization is still slow because of the huge number of local distributions. This paper instead proposes an amortized approach in which shared parameters simultaneously represent all local distributions. The approach is about as accurate as using a given joint distribution (e.g., a full-rank Gaussian) but remains feasible on datasets that are several orders of magnitude larger. It is also dramatically faster than using a structured variational distribution.
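To make the amortization idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes each group's local distribution is a diagonal Gaussian and that each group can be described by a fixed-size feature vector (for example, summary statistics of its observations). The names `init_amortizer`, `local_params`, `sample_local`, and `count_params`, the two-layer network, and all sizes are hypothetical choices made for illustration. The point it shows is that the shared parameters have a size independent of the number of groups, so a subsampled minibatch of groups only needs those shared parameters.

```python
import numpy as np


def init_amortizer(rng, in_dim, hidden_dim, latent_dim):
    """Create the shared parameters; their size does not depend on the number of groups."""
    return {
        "W1": rng.normal(0.0, 0.1, (in_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "W2": rng.normal(0.0, 0.1, (hidden_dim, 2 * latent_dim)),
        "b2": np.zeros(2 * latent_dim),
    }


def local_params(phi, features):
    """Map group features of shape (batch, in_dim) to the mean and log-std of each local Gaussian."""
    hidden = np.tanh(features @ phi["W1"] + phi["b1"])
    out = hidden @ phi["W2"] + phi["b2"]
    mean, log_std = np.split(out, 2, axis=-1)
    return mean, log_std


def sample_local(rng, mean, log_std):
    """Draw reparameterized samples z = mean + std * eps for the groups in the minibatch."""
    eps = rng.standard_normal(mean.shape)
    return mean + np.exp(log_std) * eps


def count_params(phi):
    """Total number of scalar entries in the shared parameters."""
    return sum(v.size for v in phi.values())


# Illustrative sizes only (assumptions, not values from the paper).
num_groups, in_dim, hidden_dim, latent_dim = 100_000, 3, 16, 2
rng = np.random.default_rng(0)
phi = init_amortizer(rng, in_dim, hidden_dim, latent_dim)

# Stand-in for per-group summary features; real features would come from the data.
group_features = rng.standard_normal((num_groups, in_dim))

# Subsample a minibatch of groups and produce their local distributions on the fly.
batch_idx = rng.choice(num_groups, size=64, replace=False)
mean, log_std = local_params(phi, group_features[batch_idx])
z = sample_local(rng, mean, log_std)

# Compare against storing a separate mean and log-std for every group.
amortized_count = count_params(phi)
per_group_count = num_groups * 2 * latent_dim
```

In this sketch `amortized_count` stays fixed as `num_groups` grows, while `per_group_count` grows linearly with it. A full method would also need an objective (such as an ELBO estimate) and gradient-based updates of the shared parameters, which are omitted here.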