Convolutional neural networks (CNNs) have achieved remarkable success in automatically segmenting organs and lesions in 3D medical images. Recently, vision transformer networks have exhibited exceptional performance on 2D image classification tasks. Compared with CNNs, transformer networks offer the appealing advantage of capturing long-range dependencies through their self-attention mechanism. We therefore propose a combined CNN-Transformer model, called BiTr-Unet, with specific modifications for brain tumor segmentation on multi-modal MRI scans. Our BiTr-Unet achieves good performance on the BraTS2021 validation dataset, with median Dice scores of 0.9335, 0.9304, and 0.8899, and median Hausdorff distances of 2.8284, 2.2361, and 1.4142 for the whole tumor, tumor core, and enhancing tumor, respectively. On the BraTS2021 testing dataset, the corresponding results are 0.9257, 0.9350, and 0.8874 for the Dice score, and 3, 2.2361, and 1.4142 for the Hausdorff distance. The code is publicly available at https://github.com/JustaTinyDot/BiTr-Unet.
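The two reported metrics can be sketched as follows on binary segmentation masks. This is a generic illustration using NumPy and SciPy, not the official BraTS evaluation pipeline; the function names `dice_score` and `hausdorff` are ours, and the plain symmetric Hausdorff distance is computed here rather than any percentile variant.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def dice_score(pred, truth):
    # Dice = 2|A ∩ B| / (|A| + |B|) on boolean masks.
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return 2.0 * intersection / (pred.sum() + truth.sum())

def hausdorff(pred, truth):
    # Symmetric Hausdorff distance between the foreground voxel coordinates.
    p = np.argwhere(pred)
    t = np.argwhere(truth)
    return max(directed_hausdorff(p, t)[0], directed_hausdorff(t, p)[0])
```

Both metrics operate on the foreground voxels only, which is why Hausdorff values such as 1.4142 (√2) and 2.8284 (2√2) arise naturally from voxel-grid offsets.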