Quantifying the uncertainty of input samples is important, especially in mission-critical domains such as autonomous driving and healthcare, where erroneous predictions on out-of-distribution (OOD) data can cause serious harm. The OOD detection problem fundamentally stems from the fact that a model cannot express what it does not know. Post-hoc OOD detection approaches are widely explored because they require no additional re-training, which could degrade the model's performance and increase the training cost. In this study, viewing neurons in the deep layers of a model as representing high-level features, we introduce a new perspective for analyzing the difference in model outputs between in-distribution and OOD data. We propose a novel post-hoc OOD detection method, Leveraging Important Neurons (LINe). Shapley-value-based pruning reduces the effect of noisy outputs by selecting only the neurons that contribute most to predicting a specific class and masking the rest. Activation clipping maps all values above a given threshold to the same value, allowing LINe to treat all class-specific features equally and to consider only the difference in the number of activated features between in-distribution and OOD data. Comprehensive experiments verify the effectiveness of the proposed method, which outperforms state-of-the-art post-hoc OOD detection methods on the CIFAR-10, CIFAR-100, and ImageNet datasets.
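The two operations described above can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: it assumes the per-class importance mask has already been computed (e.g. from Shapley-value contributions of penultimate-layer neurons), and it combines the masked logits into an energy-style OOD score. All names (`line_score`, `mask`, `clip_threshold`) are hypothetical.

```python
import numpy as np

def line_score(activations, weights, mask, clip_threshold):
    """Illustrative sketch of the LINe scoring idea.

    activations: (D,) penultimate-layer activations for one input
    weights:     (C, D) classifier weight matrix
    mask:        (C, D) binary matrix of "important" neurons per class,
                 assumed precomputed from Shapley-value contributions
    clip_threshold: activations above this value are clipped to it
    Returns an energy-style score: higher suggests in-distribution.
    """
    # Activation clipping: cap every activation at the threshold so that
    # all strongly activated features count equally.
    clipped = np.minimum(activations, clip_threshold)
    # Pruning: zero out low-contribution neurons per class before the
    # final linear layer, keeping only class-specific important neurons.
    logits = (weights * mask) @ clipped
    # Energy-style aggregation of the masked logits (logsumexp).
    return np.log(np.exp(logits).sum())
```

For example, with `clip_threshold=1.0`, an activation vector `[0.5, 3.0, 1.0]` is clipped to `[0.5, 1.0, 1.0]` before the masked logits are computed, so an unusually large single activation cannot dominate the score.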