One of the most popular estimation methods for Bayesian neural networks (BNNs) is mean-field variational inference (MFVI). In this work, we show that neural networks with the ReLU activation function induce posteriors that are hard to fit with MFVI. We provide a theoretical justification for this phenomenon, study it empirically, and report the results of a series of experiments investigating the effect of the activation function on the calibration of BNNs. We find that using Leaky ReLU activations leads to more Gaussian-like weight posteriors and achieves a lower expected calibration error (ECE) than its ReLU-based counterpart.
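For concreteness, the ECE referenced above is the standard equal-width binned calibration metric: the accuracy-weighted average gap between predicted confidence and empirical accuracy across confidence bins. Below is a minimal NumPy sketch; the function name, the 15-bin default, and the equal-width binning are illustrative assumptions, not details taken from this paper.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=15):
    """Binned ECE sketch (illustrative; n_bins=15 is an assumed default).

    confidences: (n,) array of predicted max-class probabilities in [0, 1].
    correct:     (n,) array, 1.0 where the prediction matched the label.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    # Assign each prediction to one of n_bins equal-width confidence bins.
    edges = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]
    bin_ids = np.clip(np.digitize(confidences, edges), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            # |B_m| / n  *  |accuracy(B_m) - mean confidence(B_m)|
            ece += mask.mean() * abs(correct[mask].mean() - confidences[mask].mean())
    return ece
```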