Among Bayesian methods, Monte Carlo dropout provides principled tools for evaluating the epistemic uncertainty of neural networks. Its popularity has recently led to seminal works proposing to activate the dropout layers only during inference as a way to evaluate uncertainty. This approach, which we call dropout injection, provides clear benefits over its traditional counterpart (which we call embedded dropout), since it yields a post hoc uncertainty measure for any existing network previously trained without dropout, avoiding an additional, time-consuming training process. Unfortunately, no previous work has compared injected and embedded dropout; we therefore provide the first thorough investigation, focusing on regression problems. The main contribution of our work is a set of guidelines for the effective use of injected dropout, so that it can be a practical alternative to the current use of embedded dropout. In particular, we show that its effectiveness strongly relies on a suitable scaling of the corresponding uncertainty measure, and we discuss the trade-off between negative log-likelihood and calibration error as a function of the scale factor. Experimental results on UCI data sets and crowd counting benchmarks support our claim that dropout injection can effectively act as a competitive post hoc uncertainty quantification technique.
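To make the idea concrete, the following is a minimal PyTorch sketch of dropout injection and of scaling the resulting uncertainty measure. It is not the authors' implementation: the model, the names `inject_p`, `mc_dropout_predict`, and `select_scale`, and the dropout rate are illustrative, and minimizing the Gaussian negative log-likelihood on validation data is assumed here as one simple way to fit the scale factor the abstract refers to.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MLP(nn.Module):
    """Small regression network, assumed to have been trained
    *without* dropout (the setting dropout injection targets)."""
    def __init__(self, d_in, d_hidden):
        super().__init__()
        self.fc1 = nn.Linear(d_in, d_hidden)
        self.fc2 = nn.Linear(d_hidden, 1)

    def forward(self, x, inject_p=0.0):
        h = F.relu(self.fc1(x))
        # Dropout injection: force training=True so the dropout mask
        # is sampled even though the module is in eval() mode.
        h = F.dropout(h, p=inject_p, training=inject_p > 0)
        return self.fc2(h)

@torch.no_grad()
def mc_dropout_predict(model, x, p=0.1, n_samples=100, scale=1.0):
    """Run n_samples stochastic forward passes with injected dropout;
    return the predictive mean and a *scaled* standard deviation as
    the uncertainty measure."""
    model.eval()
    preds = torch.stack([model(x, inject_p=p) for _ in range(n_samples)])
    return preds.mean(dim=0), scale * preds.std(dim=0)

def select_scale(mean, std, y, grid=torch.logspace(-2, 2, 200)):
    """Grid-search the scale factor that minimizes the Gaussian
    negative log-likelihood on held-out data (one simple criterion;
    the trade-off between NLL and calibration error discussed in the
    abstract arises precisely from how this scale is chosen)."""
    nlls = []
    for s in grid:
        var = (s * std) ** 2
        nlls.append(0.5 * (torch.log(2 * torch.pi * var)
                           + (y - mean) ** 2 / var).mean())
    return grid[torch.argmin(torch.stack(nlls))]
```

In this sketch, `select_scale` would be run once on a validation split, and the resulting factor passed as `scale` to `mc_dropout_predict` at test time; the network weights are never retrained.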