Automatically identifying fake news on the Internet is a challenging problem in deception detection. Online news is constantly modified as it propagates; for example, malicious users distort the original truth to fabricate fake news. This continuous evolution produces previously unseen fake news that can fool existing detection models. We present the Fake News Evolution (FNE) dataset, a new dataset that tracks the fake news evolution process. It comprises 950 paired samples, each consisting of articles that represent the three significant phases of the evolution process: the truth, the fake news, and the evolved fake news. We examine how the following features change during the evolution: disinformation techniques, text similarity, top-10 keywords, classification accuracy, parts of speech, and sentiment properties.