Convolutional neural networks (CNNs) can recognize objects regardless of their position in an image because the convolution operation is translation-equivariant. Group-equivariant CNNs extend this equivariance to other transformations of the input. Handling objects and object parts of different scales remains challenging, since scale can vary for several reasons, such as the underlying object size or the resolution of the imaging modality. In this paper, we propose a scale-equivariant convolutional layer for three-dimensional data that guarantees scale-equivariance in 3D CNNs. Scale-equivariance removes the need to learn each possible scale separately, allowing the network to focus on higher-level learning goals, which leads to better results and better data efficiency. We first give an overview of the theoretical foundations of, and prior work on, scale-equivariant neural networks in the two-dimensional domain. We then transfer these concepts to three dimensions to construct the proposed layer. Using this layer, we build a scale-equivariant U-Net for medical image segmentation and compare it with a non-scale-equivariant baseline. Our experiments demonstrate the effectiveness of the proposed method in achieving scale-equivariance for 3D medical image analysis. We publish our code at https://github.com/wimmerth/scale-equivariant-3d-convnet for further research and application.
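As a rough illustration of the equivariance property the abstract relies on, the following toy sketch checks that a plain (circular) convolution commutes with translation but not with a naive change of scale. It is a one-dimensional, dependency-free example written for this purpose and is not the layer proposed in the paper; the helper names `circular_correlate`, `shift`, and `subsample`, the signal, and the kernel are all assumptions chosen for illustration.

```python
def circular_correlate(signal, kernel):
    """Cross-correlate `signal` with `kernel`, wrapping around at the borders."""
    n = len(signal)
    return [
        sum(kernel[j] * signal[(i + j) % n] for j in range(len(kernel)))
        for i in range(n)
    ]


def shift(signal, t):
    """Cyclically translate `signal` by `t` positions to the right."""
    n = len(signal)
    return [signal[(i - t) % n] for i in range(n)]


def subsample(signal, factor):
    """Naive downscaling: keep every `factor`-th sample (hypothetical stand-in for a scale change)."""
    return signal[::factor]


# Illustrative integer data, so the comparisons below are exact.
signal = [0, 1, 3, 2, 0, 0, 5, 1]
kernel = [1, -2, 1]

# Translation: filtering a shifted input equals shifting the filtered output.
shift_then_filter = circular_correlate(shift(signal, 3), kernel)
filter_then_shift = shift(circular_correlate(signal, kernel), 3)
translation_commutes = shift_then_filter == filter_then_shift

# Scale: the same fixed kernel does not commute with naive subsampling.
scale_then_filter = circular_correlate(subsample(signal, 2), kernel)
filter_then_scale = subsample(circular_correlate(signal, kernel), 2)
scale_commutes = scale_then_filter == filter_then_scale

print("commutes with translation:", translation_commutes)
print("commutes with subsampling:", scale_commutes)
```

The second check is only meant to show why equivariance to scale does not come for free with an ordinary convolution, which is the gap a dedicated scale-equivariant layer is intended to close.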