Convolutional neural networks (CNNs), the prevailing architecture for deep-learning-based medical image analysis, remain functionally limited by their intrinsic inductive biases and inadequate receptive fields. The transformer, introduced to address this issue, has drawn considerable attention in natural language processing and computer vision for its remarkable ability to capture long-range dependencies. However, most recent transformer-based methods for medical image segmentation directly apply vanilla transformers as an auxiliary module in CNN-based methods, which causes severe detail loss because of the rigid patch partitioning scheme in transformers. To address this problem, we propose C2FTrans, a novel multi-scale architecture that formulates medical image segmentation as a coarse-to-fine procedure. C2FTrans mainly consists of a cross-scale global transformer (CGT), which addresses local contextual similarity in CNNs, and a boundary-aware local transformer (BLT), which overcomes the boundary uncertainty introduced by rigid patch partitioning in transformers. Specifically, CGT builds global dependency across three different small-scale feature maps to obtain rich global semantic features at an acceptable computational cost, while BLT operates on large-scale feature maps and captures mid-range dependency by adaptively generating windows around boundaries under the guidance of entropy, which reduces computational complexity and minimizes detail loss. Extensive experimental results on three public datasets demonstrate the superior performance of C2FTrans over state-of-the-art CNN-based and transformer-based methods, with fewer parameters and lower FLOPs. We believe the design of C2FTrans can inspire future work on efficient and lightweight transformers for medical image segmentation. The source code of this paper is publicly available at https://github.com/xianlin7/C2FTrans.
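To make the idea of entropy-guided window generation more concrete, the following is a minimal NumPy sketch and not the C2FTrans implementation. It assumes a coarse per-pixel class-probability map is available, scores each pixel by its Shannon entropy, and centers fixed-size windows on the most uncertain pixels, which tend to lie near boundaries. The function names (`entropy_map`, `select_boundary_windows`), the window size, and the top-k selection rule are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def entropy_map(probs, eps=1e-8):
    """Per-pixel Shannon entropy of class probabilities with shape (C, H, W)."""
    return -(probs * np.log(probs + eps)).sum(axis=0)


def select_boundary_windows(probs, window=4, k=3):
    """Return top-left (row, col) corners of k windows centered on the
    highest-entropy pixels (hypothetical selection rule for illustration)."""
    ent = entropy_map(probs)
    h, w = ent.shape
    # Indices of the k most uncertain pixels, highest entropy first.
    top = np.argsort(ent.ravel())[::-1][:k]
    rows, cols = np.unravel_index(top, ent.shape)
    corners = []
    for r, c in zip(rows, cols):
        # Center the window on the pixel, then clip so it stays inside the map.
        r0 = int(np.clip(r - window // 2, 0, h - window))
        c0 = int(np.clip(c - window // 2, 0, w - window))
        corners.append((r0, c0))
    return corners


# Toy two-class prediction: confident on both sides, uncertain along column 4.
toy = np.empty((2, 8, 8))
toy[0, :, :4], toy[1, :, :4] = 0.99, 0.01
toy[0, :, 5:], toy[1, :, 5:] = 0.01, 0.99
toy[:, :, 4] = 0.5
toy_windows = select_boundary_windows(toy, window=4, k=3)
```

In this toy case every selected window covers the uncertain column, so local attention would be computed only where the coarse prediction is ambiguous rather than over a rigid grid of patches.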