Convolutional neural networks (CNNs) have recently achieved remarkable success in automatically identifying organs or lesions on 3D medical images. Meanwhile, vision transformer networks have exhibited exceptional performance in 2D image classification tasks. Compared with CNNs, transformer networks have a clear advantage in extracting long-range features owing to their self-attention mechanism. Therefore, in this paper we present a combined CNN-Transformer model called BiTr-Unet for brain tumor segmentation on multi-modal MRI scans. The proposed BiTr-Unet achieves good performance on the BraTS 2021 validation dataset, with mean Dice scores of 0.9076, 0.8392, and 0.8231 and mean Hausdorff distances of 4.5322, 13.4592, and 14.9963 for the whole tumor, tumor core, and enhancing tumor, respectively.
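The Dice scores quoted above are reported per tumor region. As a minimal sketch of how such a score can be computed from a predicted and a reference label map, the snippet below assumes the usual BraTS label convention (1 = necrotic/non-enhancing core, 2 = edema, 4 = enhancing tumor), which the abstract itself does not state. The names `REGIONS`, `dice_score`, and `region_dice` are hypothetical and are not taken from the BiTr-Unet code.

```python
import numpy as np

# Assumed BraTS-style label convention (not stated in the abstract):
# 1 = necrotic/non-enhancing core, 2 = edema, 4 = enhancing tumor.
REGIONS = {
    "whole_tumor": (1, 2, 4),
    "tumor_core": (1, 4),
    "enhancing_tumor": (4,),
}


def dice_score(pred_mask, true_mask):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred_mask = np.asarray(pred_mask, dtype=bool)
    true_mask = np.asarray(true_mask, dtype=bool)
    denom = pred_mask.sum() + true_mask.sum()
    if denom == 0:
        # Both masks empty: treated as perfect agreement here (a convention
        # chosen for this sketch; evaluation tools may handle it differently).
        return 1.0
    return float(2.0 * np.logical_and(pred_mask, true_mask).sum() / denom)


def region_dice(pred_labels, true_labels):
    """Dice score for each tumor region of one labeled volume."""
    pred_labels = np.asarray(pred_labels)
    true_labels = np.asarray(true_labels)
    return {
        name: dice_score(np.isin(pred_labels, labels), np.isin(true_labels, labels))
        for name, labels in REGIONS.items()
    }


# Toy 2x2 "volume": the prediction mislabels the enhancing voxel as edema.
true_labels = np.array([[0, 1], [2, 4]])
pred_labels = np.array([[0, 1], [2, 2]])
scores = region_dice(pred_labels, true_labels)
```

In this toy case the whole-tumor masks coincide, while the tumor-core and enhancing-tumor scores drop because the enhancing voxel was missed. A reported mean Dice score would be the average of such per-case values over the validation set.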