A KL-divergence objective on the joint distribution of data and labels unifies supervised learning and variational autoencoders (VAEs) under the common umbrella of stochastic variational inference. This unification motivates an extended supervised scheme that permits the calculation of a goodness-of-fit p-value for the neural network model. Conditional normalizing flows amortized with a neural network are crucial in this construction. We discuss how they allow coverage to be rigorously defined for posteriors on a product space, e.g. $\mathbb{R}^n \times \mathcal{S}^m$, which encompasses posteriors over directions. Finally, systematic uncertainties are naturally included in the variational viewpoint. In classical likelihood approaches or other machine learning models, the three ingredients of (1) systematics, (2) coverage and (3) goodness-of-fit are typically not all available, or at least one of them is strongly constrained. In contrast, the proposed extended supervised training with amortized normalizing flows accommodates all three for variational inference of arbitrary statistical distributions defined on product spaces like $\mathbb{R}^n \times \ldots \times \mathcal{S}^m$, with no fundamental barrier in terms of the complexity of the underlying data. It therefore has great potential for the statistical toolbox of the contemporary (astro-)particle physicist.
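To make the core idea concrete, the following is a minimal sketch (in PyTorch, not the authors' actual architecture) of supervised training as amortized variational inference: a neural network maps each observation $x$ to the parameters of a conditional normalizing flow $q(z|x)$, and minimizing $\mathbb{E}_{p(x,z)}[-\log q(z|x)]$ corresponds, up to a constant, to minimizing the KL divergence between the true joint $p(x,z)$ and the model joint $p(x)\,q(z|x)$. The single affine flow, the `AmortizedAffineFlow` class, and the toy linear-Gaussian data are all illustrative assumptions.

```python
# Minimal sketch: a neural network amortizes the parameters of a conditional
# normalizing flow q(z|x); training minimizes E_{p(x,z)}[-log q(z|x)], i.e. a
# forward-KL objective on the joint distribution of data x and labels z.
import math
import torch
import torch.nn as nn

class AmortizedAffineFlow(nn.Module):
    """Single affine flow z = mu(x) + exp(s(x)) * u with u ~ N(0,1),
    where (mu, s) are produced by an MLP conditioned on the data x."""
    def __init__(self, x_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),  # outputs (mu, log_scale)
        )

    def log_prob(self, z: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        mu, log_scale = self.net(x).chunk(2, dim=-1)
        u = (z - mu) * torch.exp(-log_scale)              # inverse flow
        log_base = -0.5 * u**2 - 0.5 * math.log(2 * math.pi)
        return (log_base - log_scale).sum(dim=-1)         # change of variables

# Toy stand-in for the true joint p(x, z): labels z ~ N(w.x, 0.5^2).
torch.manual_seed(0)
x = torch.randn(2048, 3)
z = (x @ torch.tensor([1.0, -2.0, 0.5])).unsqueeze(-1) + 0.5 * torch.randn(2048, 1)

flow = AmortizedAffineFlow(x_dim=3)
opt = torch.optim.Adam(flow.parameters(), lr=1e-2)
for step in range(500):
    opt.zero_grad()
    loss = -flow.log_prob(z, x).mean()  # joint-KL / maximum-likelihood objective
    loss.backward()
    opt.step()
```

After training, `flow.log_prob` evaluates a normalized posterior density over the label for any observation, which is the property that makes coverage and goodness-of-fit statements possible in the scheme described above; extending the label space to products like $\mathbb{R}^n \times \mathcal{S}^m$ requires flows defined on those manifolds, which this sketch does not cover.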