Bayesian neural networks (BNNs) promise improved generalization under covariate shift by providing principled probabilistic representations of epistemic uncertainty. However, weight-based BNNs often struggle with the high computational complexity of large-scale architectures and datasets. Node-based BNNs have recently been introduced as a scalable alternative: they induce epistemic uncertainty by multiplying each hidden node by latent random variables while learning a point estimate of the weights. In this paper, we interpret these latent noise variables as implicit representations of simple and domain-agnostic data perturbations during training, producing BNNs that perform well under covariate shift due to input corruptions. We observe that the diversity of the implicit corruptions depends on the entropy of the latent variables, and propose a straightforward approach to increase the entropy of these variables during training. We evaluate the method on out-of-distribution image classification benchmarks, and show that it improves the uncertainty estimation of node-based BNNs under covariate shift due to input perturbations. As a side effect, the method also provides robustness against noisy training labels.
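To make the node-based construction concrete, the following is a minimal NumPy sketch and not the implementation used in the paper. It assumes a factorized Gaussian distribution over the latent variables, one latent variable per input node of each layer, and a small ReLU MLP; the names `forward` and `gaussian_entropy` are hypothetical. The weights are point estimates, and the only source of randomness is the multiplicative noise on the nodes.

```python
import numpy as np


def forward(x, weights, biases, mus, sigmas, rng):
    """One stochastic forward pass of an MLP with multiplicative node noise.

    Assumption: latent variables are Gaussian, z = mu + sigma * eps, with one
    latent variable per input node of each layer. Weights are point estimates.
    """
    h = x
    last = len(weights) - 1
    for i, (W, b, mu, sigma) in enumerate(zip(weights, biases, mus, sigmas)):
        # Sample the latent noise and multiply it into the layer's input nodes.
        z = mu + sigma * rng.standard_normal(mu.shape)
        h = (h * z) @ W + b
        if i < last:
            h = np.maximum(h, 0.0)  # ReLU on hidden layers only
    return h


def gaussian_entropy(sigma):
    """Differential entropy of a factorized Gaussian, summed over nodes."""
    return float(np.sum(0.5 * np.log(2.0 * np.pi * np.e * sigma ** 2)))
```

Under this Gaussian assumption the entropy depends only on the standard deviations, so a larger `sigma` means higher entropy and more varied multiplicative perturbations of the nodes across forward passes. Averaging the outputs of several calls to `forward` with different noise samples gives a Monte Carlo estimate of the predictive distribution.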