In the past few years, convolutional neural networks (CNNs), particularly U-Net, have been the prevailing technique in medical image processing. Specifically, the seminal U-Net and its variants have successfully addressed a wide variety of medical image segmentation tasks. However, these architectures are intrinsically limited: they fail to model long-range interactions and spatial dependencies, which leads to a severe performance drop when segmenting medical images with variable shapes and structures. Transformers, originally proposed for sequence-to-sequence prediction, have emerged as alternative architectures that precisely model global information through the self-attention mechanism. Although feasible in design, a pure Transformer used for image segmentation can suffer from limited localization capacity stemming from inadequate low-level features. Thus, a line of research strives to design robust variants of Transformer-based U-Net. In this paper, we propose TransNorm, a novel deep segmentation framework that integrates a Transformer module into both the encoder and the skip connections of the standard U-Net. We argue that a careful design of the skip connections can be crucial for accurate segmentation, as it assists feature fusion between the expanding and contracting paths. In this respect, we derive a Spatial Normalization mechanism from the Transformer module to adaptively recalibrate the skip-connection path. Extensive experiments across three typical medical image segmentation tasks demonstrate the effectiveness of TransNorm. The code and trained models are publicly available at https://github.com/rezazad68/transnorm.