The Transformer, which benefits from global (long-range) information modeling via self-attention mechanisms, has recently been successful in natural language processing and 2D image classification. However, both local and global features are crucial for dense prediction tasks, especially for 3D medical image segmentation. In this paper, we exploit the Transformer in 3D CNN for MRI brain tumor segmentation for the first time and propose a novel network named TransBTS based on the encoder-decoder structure. To capture the local 3D context information, the encoder first utilizes a 3D CNN to extract volumetric spatial feature maps. Meanwhile, the feature maps are elaborately reshaped into tokens that are fed into the Transformer for global feature modeling. The decoder leverages the features embedded by the Transformer and performs progressive upsampling to predict the detailed segmentation map. Experimental results on the BraTS 2019 dataset show that TransBTS outperforms state-of-the-art methods for brain tumor segmentation on 3D MRI scans. Code is available at https://github.com/Wenxuan-1119/TransBTS
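The pipeline above hinges on reshaping the encoder's volumetric feature maps into a token sequence for the Transformer, then folding the tokens back into a volume for the upsampling decoder. A minimal NumPy sketch of that reshaping step follows; all dimensions are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

# Assumed sizes for illustration: C channels over a downsampled
# D x H x W volume produced by the 3D CNN encoder.
C, D, H, W = 512, 16, 16, 16
feature_map = np.random.randn(C, D, H, W).astype(np.float32)

# Flatten the spatial dims: (C, D, H, W) -> (D*H*W, C),
# i.e. one token per voxel of the downsampled volume.
tokens = feature_map.reshape(C, D * H * W).T

# ... Transformer layers would model global context over `tokens` here ...

# Fold the tokens back into a volume so the decoder can
# progressively upsample toward the full-resolution segmentation map.
volume = tokens.T.reshape(C, D, H, W)

assert tokens.shape == (D * H * W, C)
assert np.array_equal(volume, feature_map)  # reshaping round-trips losslessly
```

The round-trip assertion at the end checks that the flatten/fold operations are exact inverses, which is what lets the decoder consume the Transformer's output as an ordinary 3D feature volume.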