Understanding spatial attributes is central to effective 3D radiology image analysis, where crop-based learning is the de facto standard. Given an image patch, its core spatial properties (e.g., position and orientation) provide helpful priors on expected object sizes, appearances, and structures because anatomy is inherently consistent across subjects. Spatial correspondences, in particular, can effectively gauge semantic similarity between regions of different images, and extracting them approximately requires neither annotations nor excessive computational cost. However, recent 3D contrastive learning approaches either neglect correspondences or fail to capitalize on them fully. To this end, we propose an extensible 3D contrastive framework (Spade, for Spatial Debiasing) that leverages extracted correspondences to select more effective positive and negative samples for representation learning. Our method learns both globally invariant and locally equivariant representations with downstream segmentation in mind. We also propose separate selection strategies for the global and local scopes, each tailored to its respective representational requirements. Compared to recent state-of-the-art approaches, Spade shows notable improvements on three downstream segmentation tasks (CT Abdominal Organ, CT Heart, MR Heart).
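To make the idea of correspondence-guided sample selection concrete, the following is a minimal sketch and not Spade's actual selection rule, which the abstract does not specify. It assumes that each crop has already been mapped to an axis-aligned box in a shared, normalized anatomical coordinate frame, and it uses box overlap (IoU) as a stand-in for spatial correspondence. The function names `box_iou` and `select_pairs` and the thresholds `pos_thresh` and `neg_thresh` are hypothetical.

```python
def box_volume(box):
    """Volume of an axis-aligned 3D box given as (min_corner, max_corner)."""
    lo, hi = box
    vol = 1.0
    for a, b in zip(lo, hi):
        vol *= max(b - a, 0.0)
    return vol


def box_iou(a, b):
    """Intersection-over-union of two axis-aligned 3D boxes."""
    inter = 1.0
    for lo_a, hi_a, lo_b, hi_b in zip(a[0], a[1], b[0], b[1]):
        overlap = min(hi_a, hi_b) - max(lo_a, lo_b)
        if overlap <= 0:
            return 0.0  # disjoint along this axis, so no shared volume
        inter *= overlap
    union = box_volume(a) + box_volume(b) - inter
    return inter / union if union > 0 else 0.0


def select_pairs(anchor, candidates, pos_thresh=0.5, neg_thresh=0.1):
    """Split candidate crops into positives and negatives for one anchor.

    Assumption (illustrative only): high overlap in the shared frame means
    the crops show similar anatomy (positive); low overlap means they show
    different anatomy (negative). Candidates in between are ignored so that
    ambiguous crops are not pushed apart as false negatives.
    """
    positives, negatives = [], []
    for idx, cand in enumerate(candidates):
        iou = box_iou(anchor, cand)
        if iou >= pos_thresh:
            positives.append(idx)
        elif iou <= neg_thresh:
            negatives.append(idx)
    return positives, negatives


anchor = ((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
candidates = [
    ((0.0, 0.0, 0.0), (1.0, 1.0, 1.0)),  # same region in another image
    ((0.5, 0.0, 0.0), (1.5, 1.0, 1.0)),  # partial overlap, left unlabeled
    ((2.0, 2.0, 2.0), (3.0, 3.0, 3.0)),  # distant region
]
positives, negatives = select_pairs(anchor, candidates)
```

Leaving the middle band unlabeled is one simple way to avoid treating anatomically similar crops as negatives. A real pipeline would obtain the shared frame from some approximate registration or position estimate, which is outside the scope of this sketch.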