Despite recent progress in improving the performance of misinformation detection systems, classifying misinformation in an unseen domain remains an elusive challenge. A common approach to this problem is to introduce a domain critic and encourage domain-invariant input features. However, early misinformation often exhibits both conditional and label shifts relative to existing misinformation data (e.g., class imbalance in COVID-19 datasets), rendering such methods less effective for detecting early misinformation. In this paper, we propose a contrastive adaptation network for early misinformation detection (CANMD). Specifically, we leverage pseudo labeling to generate high-confidence target examples for joint training with source data. We additionally design a label correction component to estimate and correct the label shift (i.e., the difference in class priors) between the source and target domains. Moreover, a contrastive adaptation loss is integrated into the objective function to reduce the intra-class discrepancy and enlarge the inter-class discrepancy. As such, the adapted model learns corrected class priors and an invariant conditional distribution across both domains for improved estimation of the target data distribution. To demonstrate the effectiveness of the proposed CANMD, we study the case of COVID-19 early misinformation detection and perform extensive experiments using multiple real-world datasets. The results suggest that CANMD can effectively adapt misinformation detection systems to the unseen COVID-19 target domain, with significant improvements over state-of-the-art baselines.
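To make the interplay between label correction and pseudo labeling concrete, the following is a minimal sketch and not the CANMD implementation. It assumes that label shift is handled by reweighting predicted class probabilities with the ratio of target to source class priors, a standard prior-shift correction, and that pseudo labels are kept only above a fixed confidence threshold. The function names, the threshold, and the toy priors are all hypothetical.

```python
import numpy as np


def correct_label_shift(probs, source_prior, target_prior):
    """Reweight class probabilities by the prior ratio, then renormalize.

    Assumption: a generic prior-shift correction, not the paper's exact
    label correction component.
    """
    probs = np.asarray(probs, dtype=float)
    ratio = np.asarray(target_prior, dtype=float) / np.asarray(source_prior, dtype=float)
    weighted = probs * ratio  # broadcast the per-class ratio over all rows
    return weighted / weighted.sum(axis=1, keepdims=True)


def select_pseudo_labels(probs, threshold=0.9):
    """Return indices and labels of examples whose top probability meets the threshold."""
    probs = np.asarray(probs, dtype=float)
    confidence = probs.max(axis=1)
    labels = probs.argmax(axis=1)
    keep = np.flatnonzero(confidence >= threshold)
    return keep, labels[keep]


# Toy target-domain predictions from a source-trained classifier (hypothetical values).
target_probs = np.array([
    [0.95, 0.05],
    [0.60, 0.40],
    [0.10, 0.90],
])
source_prior = np.array([0.5, 0.5])  # balanced source data (assumed)
target_prior = np.array([0.2, 0.8])  # estimated target class prior (assumed)

corrected = correct_label_shift(target_probs, source_prior, target_prior)
indices, labels = select_pseudo_labels(corrected, threshold=0.9)
```

In this toy setting the first example looks confident before correction but drops below the threshold once the skewed target prior is taken into account, so only the third example is retained as a pseudo-labeled target example for joint training.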