Deployed real-world machine learning applications are often subject to uncontrolled and even potentially malicious inputs. Such out-of-domain inputs can lead to unpredictable outputs and sometimes catastrophic safety issues. Prior studies on out-of-domain detection require in-domain task labels and are limited to supervised classification scenarios. Our work tackles the problem of detecting out-of-domain samples with only unsupervised in-domain data. We utilize the latent representations of pre-trained transformers and propose a simple yet effective method to transform features across all layers to construct out-of-domain detectors efficiently. Two domain-specific fine-tuning approaches are further proposed to boost detection accuracy. Empirical evaluations against related methods on two datasets validate that our method greatly improves out-of-domain detection ability in this more general scenario.
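To make the setting concrete, a common way to build an unsupervised out-of-domain detector over latent representations is to fit a density-based score (here, a Mahalanobis distance) on in-domain features from one transformer layer and flag inputs with large distances. The sketch below uses random vectors as stand-ins for transformer hidden states; it illustrates the general idea only, not the specific cross-layer feature transform or fine-tuning approaches proposed in this work.

```python
import numpy as np

def fit_layer_detector(feats):
    """Fit a Mahalanobis-distance detector on in-domain features.

    feats: (n_samples, dim) array of features from one transformer layer.
    Returns the feature mean and the (regularized) precision matrix.
    """
    mu = feats.mean(axis=0)
    # Small ridge term keeps the covariance invertible.
    cov = np.cov(feats, rowvar=False) + 1e-6 * np.eye(feats.shape[1])
    prec = np.linalg.inv(cov)
    return mu, prec

def ood_score(x, mu, prec):
    """Squared Mahalanobis distance; larger means more out-of-domain."""
    d = x - mu
    return float(d @ prec @ d)

# Toy demo: random vectors stand in for layer features of real inputs.
rng = np.random.default_rng(0)
in_domain_feats = rng.normal(0.0, 1.0, size=(500, 8))
mu, prec = fit_layer_detector(in_domain_feats)

in_score = ood_score(rng.normal(0.0, 1.0, size=8), mu, prec)   # in-domain-like
out_score = ood_score(rng.normal(5.0, 1.0, size=8), mu, prec)  # shifted, OOD-like
print(in_score < out_score)
```

A per-layer detector like this can be fit for every layer of a pre-trained transformer; combining the layer-wise scores is where methods such as the one proposed here differ.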