We investigate the efficacy of treating all the parameters in a Bayesian neural network stochastically and find compelling theoretical and empirical evidence that this standard construction may be unnecessary. To this end, we prove that expressive predictive distributions require only small amounts of stochasticity. In particular, partially stochastic networks with only $n$ stochastic biases are universal probabilistic predictors for $n$-dimensional predictive problems. In empirical investigations, we find no systematic benefit of full stochasticity across four different inference modalities and eight datasets; partially stochastic networks can match and sometimes even outperform fully stochastic networks, despite their reduced memory costs.
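To make the construction concrete, here is a minimal sketch of a partially stochastic network, assuming a variational Gaussian distribution over only the $n$ output-layer biases while all weights stay deterministic; the class name and the mean/log-std parameterisation are illustrative choices, not the paper's implementation:

```python
import torch
import torch.nn as nn

class PartiallyStochasticNet(nn.Module):
    """Deterministic weights; the only stochastic parameters are the
    n output-layer biases, modelled as a diagonal Gaussian."""

    def __init__(self, in_dim: int, hidden_dim: int, out_dim: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, out_dim, bias=False),  # deterministic weights only
        )
        # Variational parameters for the n stochastic biases.
        self.bias_mu = nn.Parameter(torch.zeros(out_dim))
        self.bias_log_sigma = nn.Parameter(torch.full((out_dim,), -3.0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Reparameterised sample of the stochastic biases.
        eps = torch.randn_like(self.bias_mu)
        b = self.bias_mu + torch.exp(self.bias_log_sigma) * eps
        return self.body(x) + b

# Repeated forward passes draw samples from the predictive distribution.
net = PartiallyStochasticNet(in_dim=4, hidden_dim=64, out_dim=2)
x = torch.randn(8, 4)
samples = torch.stack([net(x) for _ in range(16)])  # shape (16, 8, 2)
```

Only the $2n$ variational parameters for the biases add to the memory footprint here, in contrast to a fully stochastic network that would store a distribution over every weight.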