In the absence of an authoritative statement about a rumor, people may expose the truth behind it through their responses on social media. Most rumor detection methods aggregate information from all responses and have made great progress. However, because users have different backgrounds, responses differ in how relevant they are to uncovering the suspicious points hidden in a rumor claim, and methods that attend to all responding tweets dilute the effect of the critical ones. Moreover, for a multi-modal rumor claim, a user may focus on several words in the text or on an object in the image, so the different modalities should be considered when selecting relevant responses and verifying the claim. In this paper, we propose a novel multi-modal rumor detection model, termed Focal Reasoning Model (FoRM), which filters out irrelevant responses and then conducts fine-grained reasoning over the multi-modal claim and the corresponding responses. Concretely, FoRM has two main components: coarse-grained selection and fine-grained reasoning. The coarse-grained selection component leverages post-level features of the responses to verify the claim and learns a relevance score for each response. Based on these scores, the most relevant responses are retained as the critical ones for further reasoning. In the fine-grained reasoning component, we design a relation attention module that explores fine-grained relations, i.e., token-to-token and token-to-object relations, between the retained responses and the multi-modal claim to find valuable clues. Extensive experiments on two real-world datasets demonstrate that our proposed model outperforms all the baselines.
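The two-stage idea can be illustrated with a minimal sketch. It is not the FoRM implementation: the abstract says the relevance scores are learned, whereas this sketch substitutes plain cosine similarity between fixed vectors, and the relation attention is reduced to a softmax over dot products. All function names (`select_top_k`, `relation_attention`) and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_top_k(claim_vec, response_vecs, k):
    """Coarse-grained selection (assumed stand-in): score each response
    against the claim and keep the indices of the k highest-scoring ones."""
    scores = [cosine(claim_vec, r) for r in response_vecs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k], scores


def softmax(xs):
    """Numerically stable softmax over a non-empty list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def relation_attention(response_tokens, claim_units):
    """Fine-grained reasoning (assumed stand-in): for each response token,
    return attention weights over the claim units, which may be text-token
    vectors or image-object vectors of the same dimension."""
    return [
        softmax([sum(a * b for a, b in zip(tok, unit)) for unit in claim_units])
        for tok in response_tokens
    ]


# Toy post-level vectors: one claim, four responses.
claim = [1.0, 0.0]
responses = [[0.9, 0.1], [0.0, 1.0], [1.0, 0.0], [-1.0, 0.0]]
kept, scores = select_top_k(claim, responses, k=2)

# Toy token-level vectors for one retained response and the claim's
# text tokens plus one image object, all in the same 2-d space.
response_tokens = [[1.0, 0.0], [0.0, 1.0]]
claim_units = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights = relation_attention(response_tokens, claim_units)
```

In this sketch, only the responses in `kept` would be passed on to the attention step, which is the point of the coarse-to-fine design: the more expensive token-level comparison runs on a small, relevant subset instead of on every response.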