Self-supervised learning methods for computer vision have demonstrated the effectiveness of pre-training feature representations, resulting in well-generalizing Deep Neural Networks even when annotated data are limited. However, representation learning techniques require a significant amount of training time, with most of it spent on precise hyper-parameter optimization and the selection of augmentation techniques. We hypothesized that if the annotated dataset has enough morphological diversity to capture the diversity of the general population, as is common in medical imaging due to conserved similarities of tissue morphology, then the variance error of the trained model is the dominant component of the Bias-Variance Trade-off. Therefore, we proposed the Variance Aware Training (VAT) method, which exploits this data property by introducing the variance error into the model loss function, thereby explicitly regularizing the model. Additionally, we provided a theoretical formulation and proof of the proposed method to aid in interpreting the approach. Our method requires selecting only one hyper-parameter and matches or improves the performance of state-of-the-art self-supervised methods while achieving an order-of-magnitude reduction in GPU training time. We validated VAT on three medical imaging datasets from diverse domains and for various learning objectives: a Magnetic Resonance Imaging (MRI) dataset for semantic segmentation of the heart (MICCAI 2017 ACDC challenge), a fundus photography dataset for ordinal regression of diabetic retinopathy progression (Kaggle 2019 APTOS Blindness Detection challenge), and classification of histopathologic scans of lymph node sections (PatchCamelyon dataset). Our code is available at https://github.com/DmitriiShubin/Variance-Aware-Training.
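To make the core idea concrete, below is a minimal PyTorch sketch of a variance-regularized loss in the spirit described above. The abstract only states that VAT adds the variance error to the loss function with a single hyper-parameter; the specific instantiation here (two augmented views of the same input, with their prediction disagreement as the variance penalty, weighted by an assumed hyper-parameter `lam`) is an illustrative assumption, not the authors' exact formulation — see the linked repository for the actual implementation.

```python
import torch
import torch.nn.functional as F

def variance_aware_loss(model, x_view1, x_view2, target, lam=0.1):
    """Illustrative variance-regularized training loss.

    Assumptions (not taken from the abstract): `x_view1` and `x_view2`
    are two augmentations of the same input batch, and `lam` is the
    single hyper-parameter weighting the variance penalty.
    """
    pred1 = model(x_view1)
    pred2 = model(x_view2)

    # Standard supervised task loss on one view
    # (e.g., cross-entropy for classification or segmentation).
    task_loss = F.cross_entropy(pred1, target)

    # Variance penalty: disagreement between predictions on two
    # augmented views of the same input, acting as an explicit
    # regularizer on the variance component of the model error.
    variance_penalty = F.mse_loss(pred1, pred2)

    return task_loss + lam * variance_penalty
```

Because the penalty enters the loss directly, the only quantity left to tune in this sketch is `lam`, which mirrors the single-hyper-parameter property claimed for VAT.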