Radiomic representations can quantify properties of regions of interest in medical image data. Classically, they consist of pre-defined statistics of shape, texture, and other low-level image features. Alternatively, deep learning-based representations are derived via supervised learning, but they require expensive expert annotations and often suffer from overfitting and data imbalance. In this work, we address the challenge of learning representations of 3D medical images for effective quantification under data imbalance. We propose a \emph{self-supervised} representation learning framework that learns high-level features of 3D volumes as a complement to existing radiomics features. Specifically, we demonstrate how to learn image representations in a self-supervised fashion using a 3D Siamese network. More importantly, we deal with data imbalance by exploiting two unsupervised strategies: a) sample re-weighting, and b) balancing the composition of training batches. When combining our learned self-supervised features with traditional radiomics, we show significant improvement on brain tumor classification and lung cancer staging tasks covering MRI and CT imaging modalities.
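The second imbalance strategy above, balancing the composition of training batches, can be sketched in a few lines. This is a minimal illustration, not the paper's exact procedure: it assumes pseudo-labels are already available (e.g., from unsupervised clustering of radiomic features) and draws a round-robin quota from each pseudo-class so that every batch is approximately class-balanced. All names (`balanced_batches`, `pseudo_labels`) are illustrative.

```python
import random
from collections import defaultdict

def balanced_batches(sample_ids, pseudo_labels, batch_size, seed=0):
    """Compose training batches with roughly equal representation per
    pseudo-class. `pseudo_labels` would come from an unsupervised grouping
    (e.g., clustering radiomic features); this sketch only shows the
    batch-composition step, not the clustering itself."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for sid, lab in zip(sample_ids, pseudo_labels):
        groups[lab].append(sid)
    for ids in groups.values():
        rng.shuffle(ids)
    # Quota of samples drawn from each pseudo-class per batch.
    per_class = max(1, batch_size // len(groups))
    batches = []
    while any(groups.values()):
        batch = []
        for lab in list(groups):
            take, groups[lab] = groups[lab][:per_class], groups[lab][per_class:]
            batch.extend(take)
        if batch:
            batches.append(batch)
    return batches
```

With 6 samples split evenly across two pseudo-classes and `batch_size=2`, this yields three batches, each containing one sample from each class; the first unsupervised strategy, sample re-weighting, would instead keep all samples in every epoch but scale each sample's loss inversely to its pseudo-class frequency.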