Objective: Transformers, introduced to remedy the limited receptive fields of CNNs, have attracted intense attention recently. However, the daunting computational complexity of global representation learning, together with rigid window partitioning, hinders their deployment in medical image segmentation. This work aims to address these two issues in transformers for better medical image segmentation. Methods: We propose a boundary-aware lightweight transformer (BATFormer) that builds cross-scale global interaction with lower computational complexity and generates windows flexibly under the guidance of entropy. Specifically, to fully exploit the strength of transformers in long-range dependency modeling, a cross-scale global transformer (CGT) module is introduced that jointly utilizes multiple small-scale feature maps to produce richer global features at lower computational complexity. Given the importance of shape modeling in medical image segmentation, a boundary-aware local transformer (BLT) module is constructed. Unlike the rigid window partitioning of vanilla transformers, which produces boundary distortion, BLT adopts an entropy-guided adaptive window partitioning scheme that both reduces computational complexity and preserves shape. Results: BATFormer achieves the best Dice scores of 92.84% (average), 91.97% (right ventricle), 90.26% (myocardium), and 96.30% (left ventricle) on the ACDC dataset, and the best Dice, IoU, and ACC of 90.76%, 84.64%, and 96.76%, respectively, on the ISIC 2018 dataset. More importantly, BATFormer requires the fewest model parameters and the lowest computational complexity among the state-of-the-art approaches compared. Conclusion and Significance: Our results demonstrate the necessity of developing customized transformers for efficient and accurate medical image segmentation.
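The abstract states that BLT partitions windows adaptively under the guidance of entropy, splitting more finely where predictions are uncertain (typically near boundaries). The paper's exact scheme is not detailed here, so the following is only a hypothetical illustration: a quadtree-style partition in NumPy where windows whose mean prediction entropy exceeds a threshold are recursively subdivided. The function names (`window_entropy`, `adaptive_windows`), the base window size, and the threshold are all assumptions for the sketch, not the authors' implementation.

```python
import numpy as np

def window_entropy(prob_map, y, x, size):
    # Mean Shannon entropy of the per-pixel class probabilities
    # inside the (y, x) window of side `size` (hypothetical helper).
    patch = prob_map[y:y + size, x:x + size, :]
    p = patch.reshape(-1, patch.shape[-1])
    eps = 1e-12  # avoid log(0)
    per_pixel = -(p * np.log(p + eps)).sum(axis=-1)
    return float(per_pixel.mean())

def adaptive_windows(prob_map, base=8, threshold=0.5):
    """Greedy quadtree-style partition (illustrative only): windows whose
    mean entropy exceeds `threshold` -- likely boundary regions -- are
    split into four sub-windows; confident windows stay coarse."""
    h, w, _ = prob_map.shape
    windows = []
    stack = [(y, x, base) for y in range(0, h, base)
                          for x in range(0, w, base)]
    while stack:
        y, x, size = stack.pop()
        if size > 2 and window_entropy(prob_map, y, x, size) > threshold:
            half = size // 2
            stack += [(y, x, half), (y, x + half, half),
                      (y + half, x, half), (y + half, x + half, half)]
        else:
            windows.append((y, x, size))
    return windows
```

On a toy 16x16 two-class probability map where the left half is confident and the right half is uniform (maximally uncertain), the left half keeps coarse 8x8 windows while the right half is subdivided down to 2x2, mirroring the intended behavior of entropy-guided partitioning: fewer tokens in homogeneous regions, finer windows along ambiguous boundaries.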