Given observed data and a probabilistic generative model, Bayesian inference aims to obtain the distribution of the model's latent parameters that could have yielded the data. This task is challenging for large population studies, in which thousands of measurements are performed over a cohort of hundreds of subjects, resulting in a massive latent parameter space. This large cardinality renders off-the-shelf Variational Inference (VI) computationally impractical. In this work, we design structured VI families that can efficiently tackle large population studies. To this end, our main idea is to share the parameterization and learning across the different i.i.d. variables in a generative model, symbolized by the model's plates. We name this concept plate amortization and illustrate the powerful synergies it enables, yielding large-scale hierarchical variational distributions that are expressive, parsimoniously parameterized, and orders of magnitude faster to train. We illustrate the practical utility of the resulting Plate Amortized VI (PAVI) through a challenging neuroimaging example featuring a million latent parameters, demonstrating a significant step towards scalable and expressive Variational Inference.
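To make the core idea concrete, below is a minimal, hypothetical PyTorch sketch of plate amortization. It is not the paper's actual architecture (which the abstract does not detail); the class and variable names (`PlateAmortizedGaussian`, `encodings`, etc.) are illustrative assumptions. The point it demonstrates is the one stated above: a single conditioner network is shared across all `N` i.i.d. plate members, so the variational family's parameter count no longer grows with the plate's cardinality.

```python
import torch
import torch.nn as nn

class PlateAmortizedGaussian(nn.Module):
    """Hypothetical sketch: one shared network emits per-member posterior
    parameters, instead of learning separate parameters for each of the
    N i.i.d. plate members."""

    def __init__(self, encoding_dim: int, latent_dim: int):
        super().__init__()
        # Shared across all plate members: the parameter count is
        # independent of the plate cardinality N.
        self.conditioner = nn.Sequential(
            nn.Linear(encoding_dim, 64), nn.ReLU(),
            nn.Linear(64, 2 * latent_dim),  # mean and log-std per member
        )

    def forward(self, encodings: torch.Tensor):
        # encodings: (N, encoding_dim), one lightweight row per plate member
        stats = self.conditioner(encodings)
        mean, log_std = stats.chunk(2, dim=-1)
        return torch.distributions.Normal(mean, log_std.exp())

# Usage sketch: N subjects, each with an 8-dimensional latent variable.
N, encoding_dim, latent_dim = 1000, 16, 8
# Only the small per-member encodings scale with N, not the network weights.
encodings = nn.Parameter(torch.randn(N, encoding_dim))
q = PlateAmortizedGaussian(encoding_dim, latent_dim)
posterior = q(encodings)
samples = posterior.rsample()  # (N, latent_dim), one draw per plate member
```

Under these assumptions, growing the cohort from hundreds to thousands of subjects only adds rows to `encodings`, while the shared `conditioner` is learned once across all plate members.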