Isotropic Gaussian priors are the de facto standard for modern Bayesian neural network inference. However, such simplistic priors are unlikely either to accurately reflect our true beliefs about the weight distributions or to give optimal performance. We study summary statistics of the weights of different neural networks trained using SGD. We find that fully connected networks (FCNNs) display heavy-tailed weight distributions, while convolutional neural network (CNN) weights display strong spatial correlations. Building these observations into the respective priors leads to improved performance on a variety of image classification datasets. Moreover, we find that these priors also mitigate the cold posterior effect in FCNNs, while in CNNs we see strong improvements at all temperatures, and hence no reduction in the cold posterior effect.
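To make the two prior families concrete, below is a minimal NumPy sketch of what "heavy-tailed" and "spatially correlated" priors can look like: a Student-t prior over fully connected weights and a Gaussian prior over convolutional filters whose spatial positions are correlated via an RBF-style covariance. The distribution families, function names, and hyperparameters (`df`, `scale`, `lengthscale`, `var`) are illustrative assumptions, not the paper's exact choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Heavy-tailed prior for FCNN weights: a Student-t with few degrees of
# freedom has much heavier tails than an isotropic Gaussian, matching the
# heavy-tailed empirical weight distributions reported for FCNNs.
def sample_fcnn_prior(shape, df=3.0, scale=0.1):
    return scale * rng.standard_t(df, size=shape)

# Spatially correlated prior for CNN filters: correlate the k*k spatial
# positions of each filter through an RBF covariance over pixel
# coordinates, leaving channels independent.
def spatial_covariance(k, lengthscale=1.0, var=0.01):
    grid = np.meshgrid(np.arange(k), np.arange(k), indexing="ij")
    coords = np.stack(grid, axis=-1).reshape(-1, 2)          # (k*k, 2)
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
    return var * np.exp(-0.5 * d2 / lengthscale**2)          # (k*k, k*k)

def sample_cnn_prior(out_ch, in_ch, k, lengthscale=1.0):
    cov = spatial_covariance(k, lengthscale)
    L = np.linalg.cholesky(cov + 1e-9 * np.eye(k * k))       # jitter for stability
    z = rng.standard_normal((out_ch * in_ch, k * k))
    return (z @ L.T).reshape(out_ch, in_ch, k, k)            # cov of rows = L @ L.T

w_fc = sample_fcnn_prior((256, 784))    # one FCNN layer's weights
w_conv = sample_cnn_prior(32, 3, 3)     # 32 filters, 3 input channels, 3x3
print(w_fc.std(), w_conv.shape)
```

A draw from `sample_cnn_prior` yields filters whose neighboring pixels have similar values, whereas an isotropic Gaussian treats every filter entry as independent; this is the structural difference the abstract claims helps CNNs.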
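For context on the closing claim: the cold posterior effect refers to tempered Bayesian inference, in which the posterior is raised to a power $1/T$ and predictive performance often peaks at temperatures $T < 1$ rather than at the true Bayes posterior $T = 1$. A standard formulation (our notation, not necessarily the paper's) is:

```latex
% Tempered posterior at temperature T; the cold posterior effect is the
% empirical observation that T < 1 often outperforms the Bayes posterior T = 1.
p_T(\theta \mid \mathcal{D}) \;\propto\; \bigl[\, p(\mathcal{D} \mid \theta)\, p(\theta) \,\bigr]^{1/T}
```

Under this reading, "mitigating the cold posterior effect" means that with a better-matched prior $p(\theta)$, the performance gap between $T < 1$ and $T = 1$ shrinks, whereas in CNNs the improved prior lifts performance at all temperatures without closing that gap.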