Bayesian neural networks that incorporate data augmentation implicitly use a ``randomly perturbed log-likelihood [which] does not have a clean interpretation as a valid likelihood function'' (Izmailov et al. 2021). Here, we provide several approaches to developing principled Bayesian neural networks incorporating data augmentation. We introduce a ``finite orbit'' setting which allows likelihoods to be computed exactly, and give tight multi-sample bounds in the more usual ``full orbit'' setting. These models cast light on the origin of the cold posterior effect. In particular, we find that the cold posterior effect persists even in these principled models incorporating data augmentation. This suggests that the cold posterior effect cannot be dismissed as an artifact of data augmentation using incorrect likelihoods.
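To make the two constructions concrete, here is a minimal sketch in notation of our own choosing (the augmentation group $G$, parameters $\theta$, and sample count $K$ are illustrative assumptions, not taken from the abstract). In the finite-orbit setting, one natural valid per-datapoint likelihood is the exact average over the finite set of augmentations; in the full-orbit setting, a $K$-sample Jensen bound gives a tractable lower bound on the resulting log-likelihood that tightens as $K$ grows:
\[
p(y \mid x, \theta) \;=\; \frac{1}{\lvert G \rvert} \sum_{g \in G} p\bigl(y \mid g(x), \theta\bigr),
\qquad
\log \mathbb{E}_{g}\bigl[p(y \mid g(x), \theta)\bigr] \;\ge\; \mathbb{E}_{g_1, \dots, g_K}\Bigl[\log \frac{1}{K} \sum_{k=1}^{K} p\bigl(y \mid g_k(x), \theta\bigr)\Bigr].
\]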