This paper presents a new approach to the FNC-1 fake news classification task that employs pre-trained encoder models from related NLP tasks, namely sentence similarity and natural language inference. Two neural network architectures using this approach are proposed. Data augmentation is explored as a means of tackling class imbalance in the dataset, both by applying common pre-existing methods and by proposing a method for generating samples in the under-represented class using a novel sentence negation algorithm. The approach achieves overall performance comparable with existing baselines while significantly increasing accuracy on an under-represented but nonetheless important class for FNC-1.