Refraining from confidently predicting when faced with categories of inputs different from those seen during training is an important requirement for the safe deployment of deep learning systems. While simple to state, this has been a particularly challenging problem in deep learning, where models often end up making overconfident predictions in such situations. In this work we present a simple but highly effective approach to out-of-distribution detection that uses the principle of abstention: when encountering a sample from an unseen class, the desired behavior is to abstain from predicting. Our approach uses a network with an extra abstention class and is trained on a dataset that is augmented with an uncurated set consisting of a large number of out-of-distribution (OoD) samples that are assigned the label of the abstention class; the model is then trained to learn an effective discriminator between in- and out-of-distribution samples. We compare this relatively simple approach against a wide variety of more complex methods that have been proposed both for out-of-distribution detection and for uncertainty modeling in deep learning, and empirically demonstrate its effectiveness on a wide variety of benchmarks and deep architectures for image recognition and text classification, often outperforming existing approaches by significant margins. Given the simplicity and effectiveness of this method, we propose that this approach be used as a new additional baseline for future work in this domain.
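For concreteness, a minimal sketch of the abstention-class setup described above is given below, assuming a PyTorch classifier; the backbone, dataset names, and relabeling step are illustrative placeholders under these assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Suppose the in-distribution task has K classes; index K is reserved
# as the extra abstention class for out-of-distribution samples.
K = 10  # number of in-distribution classes (illustrative)

class AbstainingClassifier(nn.Module):
    """Standard classifier whose output layer has K + 1 logits:
    indices 0..K-1 for in-distribution classes, index K for abstention."""
    def __init__(self, backbone: nn.Module, feat_dim: int, num_classes: int):
        super().__init__()
        self.backbone = backbone
        self.head = nn.Linear(feat_dim, num_classes + 1)  # +1 abstention logit

    def forward(self, x):
        return self.head(self.backbone(x))

# Hypothetical data setup: `in_dist_train` yields (x, y) with y in 0..K-1;
# the uncurated OoD set is relabeled so every sample carries label K, and
# the two are concatenated into one training set, e.g.:
#   ood_relabeled = [(x, K) for x, _ in raw_ood]
#   train_set = torch.utils.data.ConcatDataset([in_dist_train, ood_relabeled])

def train_step(model, batch, optimizer):
    x, y = batch  # OoD samples arrive with label K (the abstention class)
    logits = model(x)
    loss = F.cross_entropy(logits, y)  # plain cross-entropy over K + 1 classes
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def ood_score(model, x):
    """At test time, the abstention-class probability can serve as the
    OoD score: higher values indicate likely out-of-distribution input."""
    probs = F.softmax(model(x), dim=-1)
    return probs[:, -1]
```

Under this reading, no auxiliary loss or calibration step is needed: the abstention logit is trained with ordinary cross-entropy alongside the in-distribution classes, and thresholding `ood_score` yields the detector.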