Convolutional neural network (CNN) based methods have achieved great success in medical image segmentation, but their ability to learn global representations remains limited by the small effective receptive fields of convolution operations. Transformer-based methods can model long-range dependencies and thus capture global representations, yet they are weak at modelling local context. Integrating CNNs and Transformers to learn both local and global representations while exploiting multi-scale features is instrumental in further improving medical image segmentation. In this paper, we propose a hierarchical CNN and Transformer hybrid architecture, called ConvFormer, for medical image segmentation. ConvFormer is based on several simple yet effective designs. (1) The feed-forward module of the Deformable Transformer (DeTrans) is redesigned to introduce local information; we call the result Enhanced DeTrans. (2) A residual-shaped hybrid stem that combines convolutions with Enhanced DeTrans is developed to capture both local and global representations. (3) Our encoder applies the residual-shaped hybrid stem hierarchically to generate feature maps at different scales, and an additional Enhanced DeTrans encoder with residual connections takes these feature maps as input to exploit multi-scale features. Experiments on several datasets show that ConvFormer, trained from scratch, outperforms various CNN- or Transformer-based architectures, achieving state-of-the-art performance.
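The contrast between local and global receptive fields, and the idea of combining both branches with residual connections, can be illustrated with a toy sketch. This is not the ConvFormer implementation: it assumes a 1D sequence of scalar features, replaces deformable attention with plain softmax self-attention, and uses the hypothetical names `local_conv1d`, `global_self_attention`, and `hybrid_block` purely for illustration.

```python
import math


def local_conv1d(x, kernel):
    """Zero-padded 1D convolution: each output sees only len(kernel) neighbours."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(kernel[j] * padded[i + j] for j in range(k))
        for i in range(len(x))
    ]


def global_self_attention(x):
    """Scalar dot-product self-attention: each output mixes ALL positions."""
    out = []
    for q in x:
        scores = [q * key for key in x]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append(sum(w * v for w, v in zip(weights, x)) / total)
    return out


def hybrid_block(x, kernel):
    """Toy residual-shaped block (an assumption, not the paper's exact stem):
    add a local convolution branch, then add a global attention branch."""
    local = local_conv1d(x, kernel)
    h = [a + b for a, b in zip(x, local)]
    glob = global_self_attention(h)
    return [a + b for a, b in zip(h, glob)]


if __name__ == "__main__":
    # An impulse at position 0 of an 8-element sequence.
    x = [1.0] + [0.0] * 7
    kernel = [0.25, 0.5, 0.25]
    print("local :", local_conv1d(x, kernel))
    print("global:", global_self_attention(x))
    print("hybrid:", hybrid_block(x, kernel))
```

With an impulse at the first position, the convolution branch leaves every position beyond the kernel's reach at zero, whereas the attention branch propagates some of the signal to the far end of the sequence. The hybrid block keeps both effects, which is the intuition behind pairing convolutions with a Transformer module.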