Early Rumor Detection (EARD) aims to identify the earliest time point at which a claim can be accurately classified from a sequence of social media posts. This is especially challenging in data-scarce settings. While Large Language Models (LLMs) perform well on few-shot NLP tasks, they are poorly suited to time-series data and are computationally expensive in both training and inference. In this work, we propose a novel EARD framework that combines an autonomous agent with an LLM-based detection model: the agent acts as a reliable decision-maker for \textit{early time point determination}, while the LLM serves as a powerful \textit{rumor detector}. This approach offers the first solution for few-shot EARD, requiring only a lightweight agent to be trained while the LLM remains training-free. Extensive experiments on four real-world datasets show that our approach improves performance across LLMs and surpasses existing EARD methods in both accuracy and earliness.
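The division of labor described above, in which an agent decides when to stop and a detector classifies only at that point, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `early_detect`, `toy_agent`, and `toy_detector` are hypothetical, the keyword rules merely stand in for the trained lightweight agent and the training-free LLM, and none of it reflects the actual implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Decision:
    stop_index: int  # number of posts observed when the agent stopped
    label: str       # label returned by the detector at that point


def early_detect(
    posts: Sequence[str],
    should_stop: Callable[[List[str]], bool],
    detect: Callable[[List[str]], str],
) -> Decision:
    """Observe posts in order; call the detector once, when the agent stops.

    If the agent never stops, detection falls back to the full sequence.
    """
    if not posts:
        raise ValueError("at least one post is required")
    observed: List[str] = []
    for post in posts:
        observed.append(post)
        if should_stop(observed):
            break
    return Decision(stop_index=len(observed), label=detect(observed))


# Hypothetical cue words; a real agent would be learned, not hand-written.
SKEPTICAL_CUES = ("fake", "source?", "debunked")


def _has_cue(post: str) -> bool:
    return any(cue in post.lower() for cue in SKEPTICAL_CUES)


def toy_agent(observed: List[str]) -> bool:
    """Placeholder stopping rule: stop after two posts containing a cue."""
    return sum(_has_cue(p) for p in observed) >= 2


def toy_detector(observed: List[str]) -> str:
    """Placeholder for the LLM-based detector: keyword match only."""
    return "rumor" if any(_has_cue(p) for p in observed) else "non-rumor"


POSTS = [
    "Breaking: bridge collapsed downtown",
    "is this real? source?",
    "officials say this is fake",
    "confirmed fake, old photo",
    "more chatter",
]

if __name__ == "__main__":
    print(early_detect(POSTS, toy_agent, toy_detector))
    # Decision(stop_index=3, label='rumor')
```

In this sketch the detector is invoked once per claim rather than after every post, which is one way the expensive model could be kept out of the per-step loop; whether the proposed framework schedules LLM calls this way is not stated in the abstract.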