Text classification is the task of automatically assigning predefined category labels to a document based on its content or topic.

VIP Content

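Deep neural networks have contributed significantly to predictive accuracy on classification tasks. However, they tend to make over-confident predictions in real-world settings, where domain shift and out-of-distribution (OOD) examples occur. Research on uncertainty estimation has so far concentrated on computer vision, because images allow the quality of an uncertainty estimate to be checked visually; little work has been done in natural language processing. Unlike Bayesian methods, which infer uncertainty indirectly through uncertainty over the weights, current evidential uncertainty methods model the uncertainty of the class probabilities explicitly through subjective opinions. They also distinguish inherent uncertainty in the data by its root cause: vacuity (uncertainty due to a lack of evidence) and dissonance (uncertainty due to conflicting evidence). This paper is the first to apply evidential uncertainty to OOD detection in text classification. We propose an inexpensive framework that uses both auxiliary outlier samples and pseudo off-manifold samples to train a model with class-specific prior knowledge, so that the model assigns high vacuity to OOD samples. Extensive experiments show that our evidential uncertainty model outperforms comparable models at detecting OOD examples. Our approach can be deployed easily on traditional recurrent neural networks and on fine-tuned pre-trained transformers.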

https://www.zhuanzhi.ai/paper/f1ead8805294e050cc18d08d3f221296
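The two uncertainty types named in the abstract, vacuity and dissonance, can be made concrete with a small sketch. It assumes the common subjective-logic formulation in which each class k has non-negative evidence e_k, the Dirichlet parameters are e_k + 1, and belief masses are b_k = e_k / S with S the sum of the Dirichlet parameters. The abstract itself does not give these formulas, so the function names and the exact definitions below are illustrative assumptions rather than the paper's implementation.

```python
from typing import Sequence


def _strength(evidence: Sequence[float]) -> float:
    # Dirichlet strength S = sum(e_k + 1), assuming alpha_k = e_k + 1.
    return sum(evidence) + len(evidence)


def vacuity(evidence: Sequence[float]) -> float:
    # Uncertainty from lack of evidence: u = K / S.
    # Equals 1.0 when there is no evidence at all.
    return len(evidence) / _strength(evidence)


def _balance(b_j: float, b_k: float) -> float:
    # Relative balance of two belief masses: 1 when equal, 0 if either is zero.
    if b_j == 0 or b_k == 0:
        return 0.0
    return 1.0 - abs(b_j - b_k) / (b_j + b_k)


def dissonance(evidence: Sequence[float]) -> float:
    # Uncertainty from conflicting evidence (assumed subjective-logic form):
    # sum over k of b_k * sum_{j != k} b_j * Bal(b_j, b_k) / sum_{j != k} b_j.
    s = _strength(evidence)
    beliefs = [e / s for e in evidence]
    total = 0.0
    for k, b_k in enumerate(beliefs):
        others = [b for j, b in enumerate(beliefs) if j != k]
        denom = sum(others)
        if denom == 0:
            continue  # no competing belief mass, so no conflict for class k
        total += b_k * sum(b_j * _balance(b_j, b_k) for b_j in others) / denom
    return total


if __name__ == "__main__":
    # Hypothetical evidence vectors for a 3-class classifier.
    for name, ev in [("no evidence", [0, 0, 0]),
                     ("conflicting", [10, 10, 0]),
                     ("confident", [20, 0, 0])]:
        print(name, round(vacuity(ev), 4), round(dissonance(ev), 4))
```

Under these assumptions, an input with no supporting evidence gets vacuity 1, which is the behaviour the abstract describes wanting for OOD samples, while an input with strong but evenly split evidence has low vacuity and high dissonance.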


Latest Papers

The premises of an argument give evidence or other reasons to support a conclusion. However, the amount of support required depends on the generality of the conclusion, the nature of the individual premises, and similar factors. An argument whose premises make its conclusion rationally worthy to be drawn is called sufficient in argument quality research. Previous work tackled sufficiency assessment as a standard text classification problem, without modeling the inherent relation between premises and conclusion. In this paper, we hypothesize that the conclusion of a sufficient argument can be generated from its premises. To study this hypothesis, we explore the potential of assessing sufficiency based on the output of large-scale pre-trained language models. Our best model variant achieves an F1-score of 0.885, outperforming the previous state of the art and performing on par with human experts. While manual evaluation reveals the quality of the generated conclusions, their impact ultimately remains low.
