In this work, we advocate for the importance of singular learning theory (SLT) as it pertains to the theory and practice of variational inference in Bayesian neural networks (BNNs). To begin, using SLT, we lay to rest some of the confusion surrounding discrepancies between the variational objective and downstream predictive performance measured via, e.g., the test log predictive density. Next, we use the SLT-corrected asymptotic form of singular posterior distributions to inform the design of the variational family itself. Specifically, we build upon the idealized variational family introduced in \citet{bhattacharya_evidence_2020}, which is theoretically appealing but practically intractable. Our proposal takes the form of a normalizing flow whose base distribution is a carefully initialized generalized gamma distribution. We conduct experiments comparing this proposal to the canonical Gaussian base distribution and show improvements in terms of variational free energy and variational generalization error.
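For reference, the generalized gamma family admits the following standard density (shown here in one common parameterization with symbols $a$, $d$, $p$ chosen for illustration; the particular parameterization and initialization used in the proposal are not fixed by this statement):
\begin{equation*}
q_0(x;\, a, d, p) \;=\; \frac{p}{a^{d}\,\Gamma(d/p)}\, x^{d-1} \exp\!\left(-\left(x/a\right)^{p}\right), \qquad x > 0,\quad a, d, p > 0,
\end{equation*}
which reduces to the gamma distribution when $p = 1$ and to the Weibull distribution when $d = p$.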