It is often remarked that neural networks fail to increase their uncertainty when predicting on data far from the training distribution. Yet naively using softmax confidence as a proxy for uncertainty achieves modest success in tasks exclusively testing for this, e.g., out-of-distribution (OOD) detection. This paper investigates this contradiction, identifying two implicit biases that do encourage softmax confidence to correlate with epistemic uncertainty: 1) Approximately optimal decision boundary structure, and 2) Filtering effects of deep networks. It describes why low-dimensional intuitions about softmax confidence are misleading. Diagnostic experiments quantify reasons softmax confidence can fail, finding that extrapolations are less to blame than overlap between training and OOD data in final-layer representations. Pre-trained/fine-tuned networks reduce this overlap.
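As a rough illustration of what "using softmax confidence as a proxy for uncertainty" means for OOD detection, the sketch below scores each input by its maximum softmax probability and measures how well that score separates in-distribution from OOD inputs with AUROC. It is a minimal sketch, not the paper's experimental setup: the logits are synthetic and hypothetical (in-distribution inputs are simply given one dominant class logit), and the function names `max_softmax_confidence` and `auroc` are introduced here for illustration only.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def max_softmax_confidence(logits):
    """Softmax confidence: the largest predicted class probability per input."""
    return softmax(logits).max(axis=-1)


def auroc(in_scores, ood_scores):
    """Probability that a random in-distribution input receives a higher
    score than a random OOD input, counting ties as one half."""
    in_col = np.asarray(in_scores, dtype=float)[:, None]
    ood_row = np.asarray(ood_scores, dtype=float)[None, :]
    wins = (in_col > ood_row).mean()
    ties = (in_col == ood_row).mean()
    return wins + 0.5 * ties


# Hypothetical logits (an assumption for illustration, not real model output):
# in-distribution inputs get one strongly favoured class, OOD inputs do not.
rng = np.random.default_rng(0)
n, num_classes = 500, 10
in_logits = rng.normal(size=(n, num_classes))
in_logits[np.arange(n), rng.integers(0, num_classes, size=n)] += 5.0
ood_logits = rng.normal(size=(n, num_classes))

score = auroc(max_softmax_confidence(in_logits),
              max_softmax_confidence(ood_logits))
print(f"AUROC of max-softmax OOD detection: {score:.3f}")
```

An AUROC of 0.5 corresponds to chance and 1.0 to perfect separation. In this toy setting the separation is high by construction; the abstract's point is that on real networks the score degrades mainly when training and OOD inputs overlap in the final-layer representation.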