The success of deep learning in image segmentation is widely recognized to depend on large amounts of densely annotated training data, which are difficult to obtain because of the labor and expertise required, particularly for annotating 3D medical images. Although self-supervised learning (SSL) has shown great potential to address this issue, most SSL approaches focus only on image-level global consistency and ignore local consistency, which plays a pivotal role in capturing structural information for dense prediction tasks such as segmentation. In this paper, we propose a Prior-Guided Local (PGL) self-supervised model that learns region-wise local consistency in the latent feature space. Specifically, we use the spatial transformations that produce different augmented views of the same image as a prior to deduce the location relation between the two views, which is then used to align the feature maps of the same local region extracted from the two views. Next, we construct a local consistency loss to minimize the voxel-wise discrepancy between the aligned feature maps. Thus, our PGL model learns distinctive representations of local regions and is hence able to retain structural information, an ability that is conducive to downstream segmentation tasks. We conducted an extensive evaluation on four public computed tomography (CT) datasets that cover 11 kinds of major human organs and two tumors. The results indicate that using a pre-trained PGL model to initialize a downstream network leads to a substantial performance improvement over both random initialization and initialization with global consistency-based models. Code and pre-trained weights will be made available at: https://git.io/PGL.
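To make the alignment step concrete, the following is a minimal NumPy sketch of the idea, not the authors' implementation. It assumes that each view is a crop plus optional flips, that feature maps have the same spatial resolution as the input, and that the voxel-wise discrepancy is a squared distance between channel-normalized features; the abstract does not specify any of these details. All function names, including `toy_encoder`, are hypothetical stand-ins for illustration.

```python
import numpy as np

def make_view(volume, start, size, flip_axes=()):
    """Crop a sub-volume and optionally flip it.
    The (start, size, flip_axes) triple plays the role of the spatial prior."""
    sl = tuple(slice(s, s + n) for s, n in zip(start, size))
    view = volume[sl]
    if flip_axes:
        view = np.flip(view, axis=flip_axes)
    return view

def overlap(start_a, start_b, size):
    """Overlap of two equally sized crops, in original volume coordinates."""
    ov_start = tuple(max(a, b) for a, b in zip(start_a, start_b))
    ov_end = tuple(min(a, b) + n for a, b, n in zip(start_a, start_b, size))
    ov_size = tuple(e - s for s, e in zip(ov_start, ov_end))
    if min(ov_size) <= 0:
        raise ValueError("the two views do not overlap")
    return ov_start, ov_size

def align_to_overlap(feat, start, flip_axes, ov_start, ov_size):
    """Undo the flip on a (C, D, H, W) feature map, then cut out the overlap."""
    if flip_axes:
        # +1 skips the leading channel axis
        feat = np.flip(feat, axis=tuple(a + 1 for a in flip_axes))
    spatial = tuple(slice(o - s, o - s + n)
                    for o, s, n in zip(ov_start, start, ov_size))
    return feat[(slice(None),) + spatial]

def local_consistency_loss(fa, fb, eps=1e-8):
    """Assumed loss: mean voxel-wise squared distance of normalized features."""
    fa = fa / (np.linalg.norm(fa, axis=0, keepdims=True) + eps)
    fb = fb / (np.linalg.norm(fb, axis=0, keepdims=True) + eps)
    return float(np.mean(np.sum((fa - fb) ** 2, axis=0)))

def toy_encoder(view):
    """Hypothetical per-voxel encoder used only to exercise the alignment."""
    return np.stack([view, view ** 2, np.ones_like(view)])

rng = np.random.default_rng(0)
volume = rng.random((16, 16, 16))
size = (8, 8, 8)
start_a, flips_a = (0, 0, 0), ()
start_b, flips_b = (4, 2, 0), (2,)

feat_a = toy_encoder(make_view(volume, start_a, size, flips_a))
feat_b = toy_encoder(make_view(volume, start_b, size, flips_b))

ov_start, ov_size = overlap(start_a, start_b, size)
fa_aligned = align_to_overlap(feat_a, start_a, flips_a, ov_start, ov_size)
fb_aligned = align_to_overlap(feat_b, start_b, flips_b, ov_start, ov_size)

aligned_loss = local_consistency_loss(fa_aligned, fb_aligned)
# Comparing the same index range without using the prior mixes up regions.
naive_loss = local_consistency_loss(feat_a[:, :4, :6, :], feat_b[:, :4, :6, :])
print(aligned_loss, naive_loss)
```

Because the toy encoder acts on each voxel independently, the aligned features of the shared region match and the aligned loss vanishes, while the naive comparison does not. With a real convolutional encoder the aligned loss would be nonzero, and minimizing it is what would encourage locally consistent features.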