Bayesian neural networks and deep ensembles represent two modern paradigms of uncertainty quantification in deep learning. Yet these approaches struggle to scale, mainly due to memory inefficiency: they require parameter storage several times that of their deterministic counterparts. To address this, we augment the weight matrix of each layer with a small number of inducing weights, thereby projecting uncertainty quantification into these lower-dimensional spaces. We further extend Matheron's conditional Gaussian sampling rule to enable fast weight sampling, which allows our inference method to maintain reasonable run-times compared with ensembles. Importantly, our approach achieves performance competitive with the state of the art on prediction and uncertainty estimation tasks with fully connected neural networks and ResNets, while reducing the parameter count to $\leq 24.3\%$ of that of a single neural network.
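As background for the sampling step mentioned above, the classical form of Matheron's rule states that a sample from a conditional Gaussian $x_1 \mid x_2 = a$ can be obtained by drawing a joint sample $(x_1, x_2)$ and applying the linear correction $x_1 + \Sigma_{12}\Sigma_{22}^{-1}(a - x_2)$, avoiding any direct factorization of the conditional. The sketch below illustrates only this standard rule on a toy bivariate Gaussian with made-up numbers, not the paper's extension to inducing weights; all values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative joint Gaussian over (x1, x2): mean mu, covariance Sigma.
mu = np.array([1.0, -0.5])
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])
L = np.linalg.cholesky(Sigma)

a = 0.7  # value we condition x2 on

def matheron_sample(n):
    """Draw n samples from p(x1 | x2 = a) via Matheron's rule."""
    # Step 1: draw joint samples (x1, x2) ~ N(mu, Sigma).
    z = rng.standard_normal((n, 2))
    x = mu + z @ L.T
    x1, x2 = x[:, 0], x[:, 1]
    # Step 2: linear correction so the sample is consistent with x2 = a.
    return x1 + Sigma[0, 1] / Sigma[1, 1] * (a - x2)

samples = matheron_sample(200_000)

# Analytic conditional moments for comparison.
cond_mean = mu[0] + Sigma[0, 1] / Sigma[1, 1] * (a - mu[1])
cond_var = Sigma[0, 0] - Sigma[0, 1] ** 2 / Sigma[1, 1]
print(samples.mean(), cond_mean)  # empirical vs. analytic mean
print(samples.var(), cond_var)    # empirical vs. analytic variance
```

The appeal of this identity is that only a joint sample and one matrix-vector correction are needed, which is what makes fast conditional weight sampling feasible at scale.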