Researchers have been investigating automated fact-checking solutions on a variety of fronts. However, current approaches often overlook the fact that the volume of information released every day is escalating and that much of it overlaps. To accelerate fact-checking, we bridge this gap by grouping similar messages and summarizing them into aggregated claims. Specifically, we first clean a set of social media posts (e.g., tweets) and build a graph of all posts based on their semantics. Then, we apply two clustering methods to group the messages for subsequent claim summarization. We evaluate the summaries both quantitatively with ROUGE scores and qualitatively with human evaluation. We also generate a graph of the summaries to verify that there is no significant overlap among them. Our approach reduced 28,818 original messages to 700 summary claims, showing its potential to speed up the fact-checking process by organizing and selecting representative claims from massive, disorganized, and redundant messages.
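The clean, graph, and group steps described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' method: token-overlap (Jaccard) similarity stands in for the semantic similarity used to build the graph, connected components stand in for the two clustering methods (which the abstract does not name), and picking the most central post in each cluster stands in for claim summarization. The function names and the `threshold` parameter are hypothetical.

```python
import re
from itertools import combinations


def clean(post):
    """Lowercase, drop URLs and @mentions, and return the set of word tokens."""
    post = re.sub(r"https?://\S+|@\w+", " ", post.lower())
    return set(re.findall(r"[a-z0-9]+", post))


def jaccard(a, b):
    """Token-overlap similarity in [0, 1]; an assumed stand-in for semantic similarity."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_graph(token_sets, threshold):
    """Adjacency dict: an edge joins two posts whose similarity is >= threshold."""
    graph = {i: {} for i in range(len(token_sets))}
    for i, j in combinations(range(len(token_sets)), 2):
        sim = jaccard(token_sets[i], token_sets[j])
        if sim >= threshold:
            graph[i][j] = sim
            graph[j][i] = sim
    return graph


def connected_components(graph):
    """Group nodes by reachability using an iterative depth-first search."""
    seen, clusters = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(sorted(component))
    return clusters


def representative(cluster, graph):
    """Pick the post with the largest total similarity to its neighbors.

    Ties are broken in favor of the earliest post.
    """
    return max(cluster, key=lambda i: (sum(graph[i].values()), -i))


def group_posts(posts, threshold=0.5):
    """Return a list of (representative_post, member_posts) pairs, one per cluster."""
    token_sets = [clean(p) for p in posts]
    graph = build_graph(token_sets, threshold)
    clusters = connected_components(graph)
    return [(posts[representative(c, graph)], [posts[i] for i in c])
            for c in clusters]
```

Calling `group_posts` on a list of raw post strings returns one pair per cluster, so the number of pairs plays the role of the number of aggregated claims. In a realistic setting the Jaccard function would likely be replaced by similarity between sentence embeddings, and the representative post by a generated summary.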