Out-of-distribution (OOD) detection is a rapidly growing field, driven by new robustness and security requirements that come with the increasing deployment of AI-based systems. Existing textual OOD detectors often rely on an anomaly score (e.g., Mahalanobis distance) computed on the embedding output of the encoder's last layer. In this work, we observe that OOD detection performance varies greatly depending on the task and the layer whose output is used. More importantly, we show that the usual choice (the last layer) is rarely the best one for OOD detection, and that far better results could be achieved if the best layer were picked instead. To leverage this observation, we propose a data-driven, unsupervised method to combine layer-wise anomaly scores. In addition, we extend classical textual OOD benchmarks by including classification tasks with a greater number of classes (up to 77), which reflects more realistic settings. On this augmented benchmark, we show that the proposed score-aggregation methods achieve robust and consistent results while removing manual layer selection altogether, with performance close to that of an oracle that always picks the best layer.
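The two ingredients described above can be sketched in code. This is a minimal, illustrative sketch only, not the paper's implementation: it assumes per-layer sentence embeddings are already extracted as NumPy arrays, uses the squared Mahalanobis distance to the in-distribution training mean as the anomaly score, and aggregates layers by averaging min-max-normalized scores (one simple unsupervised combination; the paper's actual aggregation rule may differ).

```python
import numpy as np

def mahalanobis_scores(train_feats, test_feats):
    """Anomaly score per test point: squared Mahalanobis distance
    to the mean of the in-distribution training features."""
    mu = train_feats.mean(axis=0)
    cov = np.cov(train_feats, rowvar=False)
    cov += 1e-6 * np.eye(cov.shape[0])  # regularize for invertibility
    prec = np.linalg.inv(cov)
    diff = test_feats - mu
    # diag(diff @ prec @ diff.T), computed without the full matrix
    return np.einsum("ij,jk,ik->i", diff, prec, diff)

def aggregate_layer_scores(per_layer_scores):
    """Unsupervised aggregation (illustrative choice): average of
    min-max-normalized per-layer anomaly scores."""
    agg = np.zeros_like(per_layer_scores[0], dtype=float)
    for scores in per_layer_scores:
        lo, hi = scores.min(), scores.max()
        agg += (scores - lo) / (hi - lo + 1e-12)
    return agg / len(per_layer_scores)
```

As a toy usage example, one can simulate two "layers" of 8-dimensional features where OOD test points are mean-shifted, score each layer, and aggregate: aggregated scores for OOD points come out higher on average than for in-distribution points, even though no single layer was hand-picked.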