In many task settings, text classification models are likely to encounter examples from novel classes on which they cannot predict correctly. Selective prediction, in which models abstain on low-confidence examples, provides a possible solution, but existing models are often overly confident on out-of-distribution (OOD) examples. To remedy this overconfidence, we introduce Contrastive Novelty-Augmented Learning (CoNAL), a two-step method that generates OOD examples representative of novel classes, then trains to decrease confidence on them. First, we generate OOD examples by prompting a large language model twice: we prompt it first to enumerate relevant novel labels, then to generate examples from each novel class that match the task format. Second, we train our classifier with a novel contrastive objective that encourages lower confidence on generated OOD examples than on training examples. When trained with CoNAL, classifiers improve over prior methods in their ability to detect and abstain on OOD examples, by an average of 2.3% AUAC and 5.5% AUROC across four NLP datasets, with no cost to in-distribution accuracy.
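To make the training step concrete, below is a minimal PyTorch sketch of a contrastive confidence objective in the spirit described above: it penalizes the classifier whenever its confidence on generated OOD examples is not lower than the probability it assigns to the gold label on in-distribution examples. The function name, the margin formulation, and the batch-mean pairing are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def conal_contrastive_loss(id_logits, id_labels, ood_logits, margin=1.0):
    """Hinge-style contrast: push confidence on generated OOD examples
    below the gold-label probability on in-distribution (ID) examples.

    id_logits:  (B_id, C) classifier logits on ID training examples
    id_labels:  (B_id,)   gold labels for the ID examples
    ood_logits: (B_ood, C) classifier logits on generated OOD examples
    """
    id_log_probs = F.log_softmax(id_logits, dim=-1)
    ood_log_probs = F.log_softmax(ood_logits, dim=-1)
    # Log-probability of the gold label on each ID example.
    id_conf = id_log_probs.gather(1, id_labels.unsqueeze(1)).squeeze(1)
    # Highest log-probability over the known classes on each OOD example.
    ood_conf = ood_log_probs.max(dim=-1).values
    # Penalize whenever mean OOD confidence is not below mean ID
    # confidence by at least `margin`.
    return F.relu(margin + ood_conf.mean() - id_conf.mean())

# Usage sketch: combine with standard cross-entropy on ID data; the
# weight 0.5 is an arbitrary placeholder for a tunable hyperparameter.
id_logits = torch.randn(8, 4)
id_labels = torch.randint(0, 4, (8,))
ood_logits = torch.randn(8, 4)  # logits on LLM-generated novel-class examples
loss = F.cross_entropy(id_logits, id_labels) \
    + 0.5 * conal_contrastive_loss(id_logits, id_labels, ood_logits)
```

A relative (contrastive) target like this, rather than an absolute one such as forcing a uniform distribution on OOD inputs, only asks that OOD confidence sit below ID confidence, which is why it need not hurt in-distribution accuracy.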