It is well known that DNNs can produce different prediction results even when given the same model configuration and training dataset. As a result, it has become increasingly important to study prediction variation, i.e., the variation of the predictions a model produces on a given input example, in neural network models. Dropout has been commonly used in various applications to quantify prediction variation. However, using dropout in practice can be expensive, as it requires running dropout inference many times to estimate prediction variation. In this paper, we study how to estimate dropout prediction variation in a resource-efficient manner. In particular, we demonstrate that neuron activation strength can be used to estimate dropout prediction variation under different dropout settings and on a variety of tasks using three large datasets: MovieLens, Criteo, and EMNIST. Our approach provides an inference-once alternative that estimates dropout prediction variation as an auxiliary task while the main prediction model is served. Moreover, we show that activation strength features from a subset of the neural network layers can be sufficient to achieve variation estimation performance similar to using activation features from all layers, which provides further resource reduction for variation estimation.
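To make the cost of the dropout baseline concrete, below is a minimal sketch (not the paper's implementation) of Monte Carlo dropout variation estimation: the model is run with dropout active many times, and the per-example variance of the predictions is taken as the prediction variation. The architecture, dropout rate, and sample count are illustrative assumptions.

```python
# A minimal sketch of Monte Carlo dropout prediction-variation estimation,
# i.e., the multi-pass baseline that an inference-once approach would replace.
# The MLP, dropout rate, and num_samples below are illustrative assumptions.
import torch
import torch.nn as nn

class MLP(nn.Module):
    def __init__(self, d_in=16, d_hidden=32, p=0.5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_in, d_hidden),
            nn.ReLU(),
            nn.Dropout(p),  # dropout stays active during MC inference
            nn.Linear(d_hidden, 1),
        )

    def forward(self, x):
        return self.net(x)

def mc_dropout_variation(model, x, num_samples=100):
    """Run dropout inference num_samples times and return the
    per-example variance of the predictions."""
    model.train()  # keep dropout enabled at inference time
    with torch.no_grad():
        preds = torch.stack([model(x) for _ in range(num_samples)])
    return preds.var(dim=0)  # higher variance => less stable prediction

model = MLP()
x = torch.randn(8, 16)                 # a batch of 8 toy examples
print(mc_dropout_variation(model, x))  # one variance estimate per example
```

Note the cost: `num_samples` full forward passes per input. The approach described in the abstract instead predicts this variation from neuron activation strength in a single forward pass, as an auxiliary task alongside the main model.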