Deep Ensembles are a simple, reliable, and effective method of improving both the predictive performance and uncertainty estimates of deep learning approaches. However, they are widely criticised as being computationally expensive, due to the need to deploy multiple independent models. Recent work has challenged this view, showing that for predictive accuracy, ensembles can be more computationally efficient (at inference) than scaling single models within an architecture family. This is achieved by cascading ensemble members via an early-exit approach. In this work, we investigate extending these efficiency gains to tasks related to uncertainty estimation. As many such tasks, e.g. selective classification, are binary classification problems, our key novel insight is to only pass samples within a window close to the binary decision boundary on to later cascade stages. Experiments on ImageNet-scale data across a number of network architectures and uncertainty tasks show that the proposed window-based early-exit approach is able to achieve a superior uncertainty-computation trade-off compared to scaling single models. For example, a cascaded EfficientNet-B2 ensemble is able to achieve similar coverage at 5% risk as a single EfficientNet-B4 with less than 30% of the MACs. We also find that cascades/ensembles give more reliable improvements on OOD data than scaling single models up. Code for this work is available at: https://github.com/Guoxoug/window-early-exit.
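To make the window-based early-exit idea concrete, below is a minimal sketch in NumPy. It assumes max-softmax-probability confidence, a selective-classification threshold `tau`, and a window half-width `delta`; only samples whose stage-1 confidence lands inside `[tau - delta, tau + delta]` are forwarded to the remaining ensemble members (here combined by logit averaging). All names and the combination rule are illustrative assumptions, not the authors' exact implementation; see the linked repository for that.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def window_early_exit(x, stages, tau=0.8, delta=0.1):
    """Sketch of a window-based early-exit cascade.

    x      : input batch, shape (N, ...)
    stages : list of callables, each mapping x -> logits (N, C);
             stages[0] is the cheap first model
    tau    : selective-classification acceptance threshold (assumed)
    delta  : half-width of the window around tau (assumed)
    """
    logits = stages[0](x)                       # cheap first stage
    conf = softmax(logits).max(axis=-1)         # max softmax probability

    # Only ambiguous samples (confidence near the binary decision
    # boundary tau) are passed to the later, expensive stages.
    in_window = (conf >= tau - delta) & (conf <= tau + delta)

    final_logits = logits.copy()                # early exits keep stage-1 output
    if in_window.any():
        member_logits = [logits[in_window]]
        member_logits += [s(x[in_window]) for s in stages[1:]]
        final_logits[in_window] = np.mean(member_logits, axis=0)

    preds = final_logits.argmax(axis=-1)
    accept = softmax(final_logits).max(axis=-1) >= tau  # selective decision
    return preds, accept, in_window
```

Because samples far from the boundary exit after the first model, the expected inference cost scales with the fraction of samples falling inside the window rather than with the full ensemble size.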