Semantic Change Detection (SCD) refers to the task of simultaneously extracting the changed areas and their semantic categories (before and after the changes) in Remote Sensing Images (RSIs). This is more informative than Binary Change Detection (BCD), since it enables detailed change analysis of the observed areas. Previous work has established triple-branch Convolutional Neural Network (CNN) architectures as the paradigm for SCD. However, it remains challenging to exploit semantic information with a limited number of change samples. In this work, we investigate jointly modeling the spatio-temporal dependencies to improve the accuracy of SCD. First, we propose a Semantic Change Transformer (SCanFormer) to explicitly model the 'from-to' semantic transitions between the bi-temporal RSIs. Then, we introduce a semantic learning scheme that leverages spatio-temporal constraints consistent with the SCD task to guide the learning of semantic changes. The resulting network (SCanNet) significantly outperforms the baseline method in terms of both the detection of critical semantic changes and the semantic consistency of the obtained bi-temporal results. It achieves state-of-the-art (SOTA) accuracy on two SCD benchmark datasets.
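As a rough illustration of the SCD task output described above (not of SCanNet or SCanFormer themselves), the following minimal sketch derives a binary change map and the 'from-to' transitions from a pair of bi-temporal semantic label maps. The class names, the toy label maps, and the helper functions `change_mask` and `from_to_transitions` are all hypothetical and serve only to make the task definition concrete.

```python
from collections import Counter

# Hypothetical class names, for illustration only.
CLASSES = ["water", "vegetation", "building"]


def change_mask(sem_t1, sem_t2):
    """Binary change map: 1 where the class label differs between the two dates."""
    return [[int(a != b) for a, b in zip(r1, r2)]
            for r1, r2 in zip(sem_t1, sem_t2)]


def from_to_transitions(sem_t1, sem_t2):
    """Count 'from-to' class transitions over the changed pixels only."""
    counts = Counter()
    for r1, r2 in zip(sem_t1, sem_t2):
        for a, b in zip(r1, r2):
            if a != b:
                counts[(CLASSES[a], CLASSES[b])] += 1
    return counts


# Toy 2x3 label maps (class indices) for the two acquisition dates.
sem_t1 = [[0, 1, 1],
          [1, 1, 2]]
sem_t2 = [[0, 2, 2],
          [1, 1, 2]]

mask = change_mask(sem_t1, sem_t2)
transitions = from_to_transitions(sem_t1, sem_t2)
```

In this toy setting the change map is fully determined by the two semantic maps, so the bi-temporal results are consistent by construction; a network that predicts the change map and the semantic maps in separate branches has no such guarantee.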