Image-text retrieval in remote sensing aims to provide flexible access to information for data analysis and applications. In recent years, state-of-the-art methods have adopted ``scale decoupling'' and ``semantic decoupling'' strategies to further enhance representation capability. However, these approaches disentangle either scale or semantics and do not combine the two ideas in a unified model, which severely limits the performance of cross-modal retrieval models. To address this issue, we propose a novel Scale-Semantic Joint Decoupling Network (SSJDN) for remote sensing image-text retrieval. Specifically, we design the Bidirectional Scale Decoupling (BSD) module, which uses Salience Feature Extraction (SFE) and Salience-Guided Suppression (SGS) units to adaptively extract potential features and suppress cumbersome features at other scales in a bidirectional pattern, yielding clues at different scales. In addition, we design the Label-supervised Semantic Decoupling (LSD) module, which leverages category semantic labels as prior knowledge to supervise images and texts in probing significant semantic-related information. Finally, we design a Semantic-guided Triple Loss (STL), which adaptively generates a constant to adjust the loss function, improving the probability of matching images and texts with the same semantics and shortening the convergence time of the retrieval model. In numerical experiments on four benchmark remote sensing datasets, the proposed SSJDN outperforms state-of-the-art approaches.
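To make the idea behind the STL concrete, the following is a minimal sketch of a bidirectional hinge-based triplet loss whose penalty is scaled by a label-dependent constant. It is an illustration under stated assumptions, not the formulation used in SSJDN: the abstract does not specify how the constant is generated, so the function name, the `margin` value, and the `same_class_weight` factor are all hypothetical.

```python
from typing import Sequence


def semantic_guided_triplet_loss(
    sim: Sequence[Sequence[float]],
    labels: Sequence[int],
    margin: float = 0.2,
    same_class_weight: float = 0.5,
) -> float:
    """Bidirectional hinge triplet loss with a label-dependent weight.

    sim[i][j] is the similarity between image i and text j, and the
    pair (i, i) is the ground-truth match. labels[i] is the category
    label of pair i.

    ASSUMPTION: same_class_weight is a hypothetical constant that
    shrinks the penalty for negatives sharing the anchor's category.
    The paper's actual adaptive constant may be defined differently.
    """
    n = len(sim)
    total = 0.0
    for i in range(n):
        pos = sim[i][i]  # similarity of the matching image-text pair
        for j in range(n):
            if i == j:
                continue
            # Negatives from the same category are penalised less,
            # so same-semantic images and texts are not pushed apart
            # as strongly as unrelated ones.
            w = same_class_weight if labels[i] == labels[j] else 1.0
            # Image i as anchor, text j as negative.
            total += w * max(0.0, margin - pos + sim[i][j])
            # Text i as anchor, image j as negative.
            total += w * max(0.0, margin - pos + sim[j][i])
    return total
```

With this weighting, a batch in which the hard negatives share the anchor's category yields a smaller loss than the same similarities with all-distinct labels, which is one plausible way to favour matches between images and texts of the same semantics.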