Integrating multi-modal data to improve medical image analysis has recently received considerable attention. However, because of the discrepancy between modalities, how to process data from multiple modalities with a single model remains an open issue. In this paper, we propose a novel scheme for improving pixel-level segmentation of unpaired multi-modal medical images. Unlike previous methods, which combine modality-specific and modality-shared modules to accommodate the appearance variance across modalities while extracting common semantic information, our method is based on a single Transformer with a carefully designed External Attention Module (EAM) that learns the structured semantic consistency (i.e., semantic class representations and their correlations) between modalities during training. In practice, this structured semantic consistency across modalities is achieved progressively by applying consistency regularization at the modality level and the image level, respectively. The proposed EAMs learn the semantic consistency for representations at different scales and can be discarded once the model is optimized. Therefore, during testing, we only need to maintain one Transformer for predictions on all modalities, which keeps the model simple and easy to use. To demonstrate the effectiveness of the proposed method, we conduct experiments on two medical image segmentation scenarios: (1) cardiac structure segmentation and (2) abdominal multi-organ segmentation. Extensive results show that the proposed method outperforms state-of-the-art methods by a wide margin, and even achieves competitive performance with extremely limited training samples (e.g., 1 or 3 annotated CT or MRI images) for one specific modality.
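To make the idea of structured semantic consistency (semantic class representations and their correlations) more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a class representation can be approximated by the mean feature vector of the pixels in that class, that correlations between classes can be measured with cosine similarity, and that consistency is penalized with a mean squared difference. The function names (`class_prototypes`, `class_correlation`, `consistency_loss`) are hypothetical and chosen only for illustration; the abstract does not specify how the EAM or the regularizers are actually computed.

```python
import math


def class_prototypes(features, labels, num_classes):
    """Mean feature vector per semantic class (assumed class representation).

    features: list of D-dimensional feature vectors, one per pixel.
    labels:   list of integer class ids, one per pixel.
    """
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for vec, c in zip(features, labels):
        counts[c] += 1
        for j, v in enumerate(vec):
            sums[c][j] += v
    # Classes absent from the batch keep an all-zero prototype.
    return [[s / counts[c] if counts[c] else 0.0 for s in sums[c]]
            for c in range(num_classes)]


def class_correlation(protos, eps=1e-8):
    """Cosine similarity between every pair of class prototypes (C x C)."""
    norms = [max(math.sqrt(sum(v * v for v in p)), eps) for p in protos]
    return [[sum(a * b for a, b in zip(p, q)) / (norms[i] * norms[k])
             for k, q in enumerate(protos)]
            for i, p in enumerate(protos)]


def mean_sq_diff(a, b):
    """Mean squared difference between two equally shaped nested lists."""
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def consistency_loss(feat_a, lab_a, feat_b, lab_b, num_classes):
    """Assumed penalty: prototype mismatch plus correlation mismatch
    between two modalities (e.g., CT and MRI)."""
    p_a = class_prototypes(feat_a, lab_a, num_classes)
    p_b = class_prototypes(feat_b, lab_b, num_classes)
    return (mean_sq_diff(p_a, p_b)
            + mean_sq_diff(class_correlation(p_a), class_correlation(p_b)))
```

Because the prototypes are computed separately for each modality, the two inputs do not need to come from the same patient or contain the same number of pixels, which matches the unpaired setting described in the abstract. In a real training pipeline a term of this kind would be written in a differentiable framework and added to the segmentation loss; plain Python is used here only to keep the sketch self-contained.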