Rumors often spread in large numbers alongside breaking news and trending topics, seriously obscuring the truth. Most existing rumor detection methods focus on a single domain and therefore perform poorly in cross-domain scenarios because of domain shift. In this work, we propose an end-to-end instance-wise and prototype-wise contrastive learning model with a cross-attention mechanism for cross-domain rumor detection. The model not only performs cross-domain feature alignment but also forces target samples to align with the corresponding prototypes of a given source domain. Because labels in the target domain are unavailable, we produce pseudo labels with a clustering-based approach whose centers are carefully initialized from a batch of source-domain samples. Moreover, we apply a cross-attention mechanism to pairs of source and target samples with the same labels to learn domain-invariant representations. Because the samples in such a pair tend to express similar semantic patterns, especially in people's attitudes (e.g., supporting or denying) towards the same category of rumors, the discrepancy between the source and target domains is reduced. We conduct experiments on four groups of cross-domain datasets and show that our proposed model achieves state-of-the-art performance.
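To make the pseudo-labeling and prototype alignment steps concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that the clustering step is K-means-like (the abstract only says "clustering-based"), that a prototype is the mean feature vector of a source class, and that the prototype-wise contrastive objective takes an InfoNCE-style form. The function names and the `temperature` and `num_iters` parameters are hypothetical.

```python
import numpy as np


def class_prototypes(feats, labels, num_classes):
    """Mean feature vector per class.

    Assumption: every class appears at least once in the batch.
    """
    return np.stack([feats[labels == c].mean(axis=0) for c in range(num_classes)])


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def assign_to_centers(feats, centers):
    """Index of the nearest center (squared Euclidean distance) for each row."""
    dists = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def pseudo_label(target_feats, init_centers, num_iters=10):
    """K-means-style clustering of target features (assumed clustering method).

    Centers start at the source prototypes, so cluster index c inherits
    the meaning of source class c and can be used as a pseudo label.
    """
    centers = init_centers.astype(float).copy()
    for _ in range(num_iters):
        labels = assign_to_centers(target_feats, centers)
        for c in range(centers.shape[0]):
            members = target_feats[labels == c]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[c] = members.mean(axis=0)
    return assign_to_centers(target_feats, centers), centers


def prototype_contrastive_loss(target_feats, pseudo_labels, prototypes, temperature=0.1):
    """InfoNCE-style loss (assumed form): pull each target sample towards the
    source prototype of its pseudo label and away from the other prototypes."""
    z = l2_normalize(target_feats)
    p = l2_normalize(prototypes)
    logits = z @ p.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(z)), pseudo_labels].mean()
```

In a training loop, one would call `class_prototypes` on a labeled source batch, pass the result as `init_centers` to `pseudo_label` for the unlabeled target batch, and then add `prototype_contrastive_loss` to the overall objective. In the actual model the features would come from a learned encoder rather than fixed arrays.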