Early rumor detection (ERD) on social media platforms is very challenging when only limited, incomplete and noisy information is available. Most existing methods work at the event level, which requires collecting posts relevant to a specific event, and rely only on user-generated content. They are therefore not suited to detecting rumor sources in the very early stages, before an event unfolds and becomes widespread. In this paper, we address the task of ERD at the message level. We present a novel hybrid neural network architecture that combines a task-specific character-based bidirectional language model with stacked Long Short-Term Memory (LSTM) networks to represent the textual content and social-temporal contexts of input source tweets, in order to model the propagation patterns of rumors in the early stages of their development. We apply multi-layered attention models to jointly learn attentive context embeddings over multiple context inputs. Our experiments employ a stringent leave-one-out cross-validation (LOO-CV) evaluation setup on seven publicly available real-life rumor event data sets. Our models achieve state-of-the-art (SoA) performance for detecting unseen rumors on large augmented data covering more than 12 events and 2,967 rumors. An ablation study is conducted to assess the relative contribution of each component of the proposed model.
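To make the LOO-CV setup concrete, the following is a minimal sketch of how such folds could be built. It assumes that each fold holds out all source tweets of one event and trains on the rest, which is what "detecting unseen rumors" suggests but the abstract does not spell out. The function name `leave_one_event_out` and the toy event labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Tuple


def leave_one_event_out(
    event_ids: List[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_event, train_indices, test_indices), one fold per event.

    event_ids[i] is the event label of the i-th source tweet.
    """
    # Group tweet indices by the event they belong to.
    by_event: Dict[str, List[int]] = defaultdict(list)
    for idx, event in enumerate(event_ids):
        by_event[event].append(idx)

    # Each event is held out once; all other events form the training set.
    for held_out in sorted(by_event):
        test_idx = by_event[held_out]
        train_idx = sorted(
            i for event, idxs in by_event.items() if event != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx


# Hypothetical toy data: one event label per source tweet.
events = ["event_a", "event_a", "event_b", "event_c", "event_b"]
folds = list(leave_one_event_out(events))
```

With seven events this procedure would produce seven folds, and no tweet from the held-out event ever appears in the training indices of its own fold.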