Computer-aided diagnosis has attracted growing interest in recent years, and methods based on convolutional neural networks have achieved good performance in medical image segmentation and classification. However, because of the inherent locality of the convolution operation, long-range spatial features are often not captured accurately. We therefore propose TransClaw U-Net, a network structure that combines the convolution operation with the transformer operation in the encoding part. The convolution part extracts shallow spatial features, which facilitates the recovery of image resolution after upsampling. The transformer part encodes the patches, and its self-attention mechanism captures global information across the patch sequence. The decoding part retains the bottom upsampling structure for better detail segmentation performance. Experimental results on the Synapse multi-organ segmentation dataset show that TransClaw U-Net outperforms the other network structures compared. Ablation experiments further demonstrate the generalization performance of TransClaw U-Net.
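To make the patch encoding and self-attention steps described in the abstract concrete, the following is a minimal sketch in plain Python. It is not the TransClaw U-Net implementation: the function names `split_into_patches` and `self_attention`, the toy 4x4 image, and the use of identity query/key/value projections are all assumptions made for illustration. A real transformer layer would use learned projections and multiple heads.

```python
import math


def split_into_patches(image, patch):
    """Split a 2D image (list of rows) into flattened, non-overlapping patches.

    Hypothetical helper: each patch becomes one token of length patch * patch.
    """
    height, width = len(image), len(image[0])
    tokens = []
    for r in range(0, height, patch):
        for c in range(0, width, patch):
            tokens.append(
                [image[r + i][c + j] for i in range(patch) for j in range(patch)]
            )
    return tokens


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the tokens themselves
    (identity projections), which keeps the sketch free of learned weights.
    Returns the attended tokens and the attention weight matrix.
    """
    dim = len(tokens[0])
    outputs, weights = [], []
    for query in tokens:
        # Similarity of this token to every token in the sequence.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        row = softmax(scores)
        weights.append(row)
        # Each output is a weighted mix of all tokens, so it can draw on
        # information from any position in the image.
        outputs.append(
            [sum(w * value[j] for w, value in zip(row, tokens)) for j in range(dim)]
        )
    return outputs, weights


# Toy 4x4 "image" split into four 2x2 patches.
image = [[(r * 4 + c) / 16 for c in range(4)] for r in range(4)]
tokens = split_into_patches(image, patch=2)
attended, attention = self_attention(tokens)
```

Every row of `attention` sums to 1 and spans all patches, which is the sense in which self-attention gives each patch access to global context, in contrast to a convolution kernel that only sees a local neighborhood.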