Detecting out-of-distribution (OOD) inputs is a central challenge for safely deploying machine learning models in the real world. Previous methods commonly rely on an OOD score derived from the overparameterized weight space, while largely overlooking the role of sparsification. In this paper, we reveal the important insight that reliance on unimportant weights and units directly contributes to the brittleness of OOD detection. To mitigate the issue, we propose a sparsification-based OOD detection framework termed DICE. Our key idea is to rank weights based on a measure of contribution, and selectively use the most salient weights to derive the output for OOD detection. We provide both empirical and theoretical insights, characterizing and explaining the mechanism by which DICE improves OOD detection. By pruning away noisy signals, DICE provably reduces the output variance for OOD data, resulting in a sharper output distribution and stronger separability from ID data. DICE establishes superior performance, reducing the FPR95 by up to 24.69% compared to the previous best method.
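To make the ranking-and-masking idea concrete, the following is a minimal NumPy sketch, not the authors' reference implementation. It assumes each weight's contribution is measured as the weight times the mean ID activation of its input unit, keeps only the top fraction of contributions in the final linear layer, and scores inputs with an energy-style log-sum-exp over the masked logits; the function names (`dice_mask`, `dice_score`), the sparsity parameter `p`, and the threshold `tau` are illustrative choices.

```python
import numpy as np
from scipy.special import logsumexp

def dice_mask(W, mean_act, p=0.9):
    """Build a binary mask keeping the top (1 - p) fraction of
    weight contributions in the final linear layer.

    W        : (num_classes, feat_dim) weights of the last layer
    mean_act : (feat_dim,) mean penultimate activation on ID data
    p        : fraction of weight entries pruned away (assumed knob)
    """
    contribution = W * mean_act                 # per-weight contribution estimate
    k = max(1, int(contribution.size * (1 - p)))  # number of entries to keep
    thresh = np.sort(contribution, axis=None)[-k]
    return (contribution >= thresh).astype(W.dtype)

def dice_score(feat, W, b, mask):
    """Energy-style OOD score on the sparsified output.
    Higher score -> more ID-like."""
    logits = feat @ (mask * W).T + b            # forward pass with masked weights
    return logsumexp(logits, axis=-1)

# Toy usage: a 10-class head over 128-dim features.
rng = np.random.default_rng(0)
W = rng.normal(size=(10, 128))
b = np.zeros(10)
mean_act = rng.uniform(size=128)   # in practice, estimated on ID training data
mask = dice_mask(W, mean_act, p=0.9)

tau = 5.0                          # hypothetical threshold; in practice chosen
                                   # so ~95% of ID validation inputs pass
score = dice_score(rng.normal(size=128), W, b, mask)
is_id = score > tau
```

Under this reading, pruning low-contribution weights removes noisy additive terms from the logits, which is consistent with the abstract's claim that sparsification reduces output variance for OOD data and sharpens the ID/OOD separation.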