Deep neural networks have amply demonstrated their prowess, but estimating the reliability of their predictions remains challenging. Deep Ensembles are widely considered one of the best methods for generating uncertainty estimates but are expensive to train and evaluate. MC-Dropout is a popular alternative that is cheaper but also less reliable. Our central intuition is that there is a continuous spectrum of ensemble-like models, of which MC-Dropout and Deep Ensembles are extreme examples: the former uses an effectively infinite number of highly correlated models, while the latter relies on a finite number of independent ones. To combine the benefits of both, we introduce Masksembles. Instead of randomly dropping parts of the network as in MC-Dropout, Masksembles relies on a fixed number of binary masks, parameterized so that the correlations between the individual models can be adjusted: by controlling the overlap between the masks and their density, one can choose the configuration best suited to the task at hand. The result is a simple, easy-to-implement method whose performance is on par with Deep Ensembles at a fraction of the cost. We validate Masksembles experimentally on two widely used datasets, CIFAR10 and ImageNet.
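To make the mechanism concrete, below is a minimal NumPy sketch of a Masksembles-style layer; it is not the authors' implementation, and the names make_masks and MasksemblesLayer are hypothetical. The sketch samples the masks independently, which fixes their expected pairwise overlap at density squared, whereas the paper's construction controls the overlap explicitly. The key contrast with MC-Dropout is that the masks are drawn once and then reused at both training and test time.

    import numpy as np

    def make_masks(n_masks, n_features, density, seed=0):
        # Sample fixed binary masks, each keeping roughly
        # density * n_features channels (hypothetical helper).
        rng = np.random.default_rng(seed)
        return (rng.random((n_masks, n_features)) < density).astype(np.float32)

    class MasksemblesLayer:
        # Unlike MC-Dropout, the masks are generated once and reused at
        # both training and test time; mask k defines sub-model k.
        def __init__(self, n_masks, n_features, density):
            self.masks = make_masks(n_masks, n_features, density)

        def __call__(self, x):
            # x has shape (n_masks * b, n_features); consecutive row-blocks
            # of size b are routed through masks 0, 1, ..., n_masks - 1.
            # The batch size must therefore be divisible by n_masks.
            b = x.shape[0] // self.masks.shape[0]
            return x * np.repeat(self.masks, b, axis=0)

    # Usage: replicate a test batch across the 4 sub-models and treat the
    # 4 outputs per input as ensemble members for uncertainty estimation.
    layer = MasksemblesLayer(n_masks=4, n_features=16, density=0.8)
    x = np.tile(np.ones((2, 16), dtype=np.float32), (4, 1))  # 4 copies of a 2-sample batch
    y = layer(x)  # shape (8, 16); rows 0-1 use mask 0, rows 2-3 mask 1, etc.

In this framing, density 1.0 with identical masks recovers a single deterministic network, while low density and low overlap push the sub-models toward the independent members of a Deep Ensemble, which is the spectrum the abstract describes.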