Modern deep neural network models are known to misclassify out-of-distribution (OOD) test data into one of the in-distribution (ID) training classes with high confidence. This can have disastrous consequences for safety-critical applications. A popular mitigation strategy is to train a separate classifier that can detect such OOD samples at test time. In most practical settings, OOD examples are not known at training time, and hence a key question is: how can the ID data be augmented with synthetic OOD samples for training such an OOD detector? In this paper, we propose a novel Compounded Corruption technique for OOD data augmentation, termed CnC. One of the major advantages of CnC is that it does not require any hold-out data apart from the training set. Further, unlike current state-of-the-art (SOTA) techniques, CnC does not require backpropagation or ensembling at test time, making our method much faster at inference. Our extensive comparison with 20 methods from the major conferences in the last 4 years shows that a model trained using CnC-based data augmentation significantly outperforms SOTA, both in OOD detection accuracy and in inference time. We include a detailed post-hoc analysis to investigate the reasons for the success of our method and identify higher relative entropy and diversity of CnC samples as probable causes. We also provide theoretical insights via a piece-wise decomposition analysis on a two-dimensional dataset to reveal (visually and quantitatively) that our approach leads to a tighter boundary around ID classes, and thus to better detection of OOD samples. Source code link: https://github.com/cnc-ood
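As a rough illustration of augmenting ID data with synthetic OOD samples, the sketch below composes two corruptions on each training image and assigns the result to one extra "OOD" label. The abstract does not specify which corruptions CnC compounds or how the detector's labels are encoded, so the corruption functions (`gaussian_noise`, `reduce_contrast`), the helper names, and the extra-class labeling are all assumptions made for this example, not the authors' implementation.

```python
import random

def gaussian_noise(image, rng, sigma=0.2):
    """Add zero-mean Gaussian noise to each pixel and clip to [0, 1]."""
    return [[min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
            for row in image]

def reduce_contrast(image, rng, factor=0.5):
    """Pull each pixel toward the image mean (rng is unused, kept for a uniform signature)."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return [[mean + factor * (p - mean) for p in row] for row in image]

# Hypothetical corruption pool; the real CnC corruption set is not given in the abstract.
CORRUPTIONS = [gaussian_noise, reduce_contrast]

def compound_corrupt(image, rng):
    """Apply two distinct corruptions one after the other, in random order."""
    first, second = rng.sample(CORRUPTIONS, 2)
    return second(first(image, rng), rng)

def augment_with_synthetic_ood(images, labels, num_classes, rng):
    """Return ID data plus corrupted copies labeled with the extra index num_classes.

    Assumption: the OOD detector is trained with one additional class for
    synthetic OOD samples. Only the training set is used; no hold-out data.
    """
    ood_images = [compound_corrupt(img, rng) for img in images]
    ood_labels = [num_classes] * len(ood_images)
    return images + ood_images, labels + ood_labels

# Usage: two tiny 2x2 grayscale "images" with pixel values in [0, 1].
rng = random.Random(0)
id_images = [[[0.0, 1.0], [0.5, 0.25]], [[0.2, 0.4], [0.6, 0.8]]]
id_labels = [0, 1]
train_x, train_y = augment_with_synthetic_ood(id_images, id_labels, 2, rng)
```

Because the synthetic samples are generated once from the training set, a detector trained this way needs only a single forward pass at test time, which is consistent with the inference-speed claim above.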