While pre-trained large-scale deep models have garnered attention as an important topic for many downstream natural language processing (NLP) tasks, such models often make unreliable predictions on out-of-distribution (OOD) inputs. As such, OOD detection is a key component of a reliable machine-learning model for any industry-scale application. Common approaches often assume access to additional OOD samples during the training stage; however, the outlier distribution is often unknown in advance. Instead, we propose a post hoc framework called POORE - POsthoc pseudo-Ood REgularization, which generates pseudo-OOD samples using in-distribution (IND) data. The model is fine-tuned with a new regularization loss that separates the embeddings of IND and OOD data, which leads to significant gains on the OOD detection task at test time. We extensively evaluate our framework on three real-world dialogue systems, achieving new state-of-the-art results in OOD detection.
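The abstract does not spell out the exact form of the regularizer, so the following is only a minimal sketch of the general idea it describes: a loss term that pushes pseudo-OOD embeddings away from IND embeddings during fine-tuning. The function name `pseudo_ood_regularizer`, the cosine-similarity margin, the loss weight, and the perturbation-based pseudo-OOD generation are all illustrative assumptions, not the paper's actual formulation.

```python
# Minimal sketch (assumed, not the paper's exact loss): a hinge-style
# regularizer that penalizes pseudo-OOD embeddings for being too similar
# to in-distribution (IND) embeddings.
import torch
import torch.nn.functional as F


def pseudo_ood_regularizer(ind_emb: torch.Tensor,
                           ood_emb: torch.Tensor,
                           margin: float = 1.0) -> torch.Tensor:
    """Separation loss between IND and pseudo-OOD embeddings.

    ind_emb: (N, d) encoder embeddings of in-distribution inputs.
    ood_emb: (M, d) encoder embeddings of pseudo-OOD inputs
             (here assumed to be perturbed IND data).
    """
    ind = F.normalize(ind_emb, dim=-1)
    ood = F.normalize(ood_emb, dim=-1)
    sim = ind @ ood.t()                      # (N, M) cosine similarities
    # Penalize IND/pseudo-OOD pairs whose similarity exceeds (1 - margin).
    return F.relu(sim - (1.0 - margin)).mean()


if __name__ == "__main__":
    torch.manual_seed(0)
    ind = torch.randn(8, 32)                 # stand-in for encoder outputs
    ood = ind + 0.1 * torch.randn(8, 32)     # naive pseudo-OOD: perturbed IND
    task_loss = torch.tensor(0.0)            # placeholder for the task loss
    total = task_loss + 0.5 * pseudo_ood_regularizer(ind, ood)
    print(float(total))
```

In this sketch the regularizer is simply added to the task loss with a weight (0.5 here, chosen arbitrarily); how POORE actually constructs pseudo-OOD samples and weights the regularization term is described in the paper itself.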