Remote sensing image change detection is a fundamental task in intelligent remote sensing interpretation. Its core objective is to identify changes within change regions of interest (CRoI). Current multimodal large models encode rich human semantic knowledge, which can be used to guide tasks such as remote sensing change detection. However, existing methods that use semantic guidance to detect a user's CRoI rely too heavily on explicit textual descriptions of the CRoI, so their performance fails almost completely when the CRoI is described only implicitly. This paper proposes ReasonCD, a multimodal reasoning change detection model that leverages the strong reasoning capabilities of pre-trained large language models to mine users' implicit task intents and then produces different change detection results according to those intents. Experiments on public datasets demonstrate that the model achieves excellent change detection performance, with an F1 score of 92.1% on the BCDD dataset. Furthermore, to validate its reasoning capability, this paper annotates a subset of reasoning data based on the SECOND dataset. Experimental results show that the model not only excels at basic reasoning-based change detection tasks but can also explain its reasoning process to aid human decision-making.