Combining information from multi-view images is crucial for improving the performance and robustness of automated methods for disease diagnosis. However, because multi-view images are not aligned with one another, establishing correlations and fusing data across views remains largely an open problem. In this study, we present TransFusion, a Transformer-based architecture that merges divergent multi-view imaging information using convolutional layers and attention mechanisms. In particular, we propose the Divergent Fusion Attention (DiFA) module for rich cross-view context modeling and semantic dependency mining, which addresses the critical issue of capturing long-range correlations between unaligned data from different image views. We further propose Multi-Scale Attention (MSA) to collect global correspondences across multi-scale feature representations. We evaluate TransFusion on the Multi-Disease, Multi-View & Multi-Center Right Ventricular Segmentation in Cardiac MRI (M&Ms-2) challenge cohort. TransFusion demonstrates leading performance against state-of-the-art methods and opens up new perspectives for multi-view imaging integration towards robust medical image segmentation.
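The cross-view idea described in the abstract can be illustrated with a generic attention sketch. This is not the authors' DiFA module, whose details the abstract does not give; it is plain scaled dot-product attention in which tokens from one view attend over the concatenated tokens of all views, so no spatial alignment between views is required. The function `cross_view_attention`, the variable names `short_axis_tokens` and `long_axis_tokens`, the token counts, and the random projection matrices are all assumptions made for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_view_attention(target_tokens, other_views, w_q, w_k, w_v):
    """Generic cross-view attention (illustrative, not the paper's DiFA).

    target_tokens: n_t token vectors of width d from the view being refined.
    other_views:   list of token sets from other views; token counts may differ.
    Queries come from the target view; keys and values come from all views.
    """
    context = list(target_tokens)
    for view in other_views:
        context.extend(view)
    q = matmul(target_tokens, w_q)
    k = matmul(context, w_k)
    v = matmul(context, w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose keys
    scores = matmul(q, k_t)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned features or projection weights."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d = 8
short_axis_tokens = random_matrix(16, d, rng)  # e.g. a flattened 4x4 feature map
long_axis_tokens = random_matrix(25, d, rng)   # a different token count is fine
w_q, w_k, w_v = (random_matrix(d, d, rng) for _ in range(3))

fused, weights = cross_view_attention(
    short_axis_tokens, [long_axis_tokens], w_q, w_k, w_v
)
print(len(fused), len(fused[0]), len(weights[0]))
```

Each target token receives one attention weight per token in every view, which is how a mechanism of this kind can relate positions across unaligned images. A real model would learn the projections and would apply this at several feature scales.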