Coronavirus disease 2019 (COVID-19) has spread rapidly around the world since it was first reported in December 2019, and thoracic computed tomography (CT) has become one of the main tools for its diagnosis. In recent years, deep learning-based approaches have shown impressive performance in a wide range of image recognition tasks. However, they usually require a large amount of annotated data for training. Inspired by ground-glass opacity (GGO), a common finding in the CT scans of COVID-19 patients, we propose in this paper a novel self-supervised pretraining method for COVID-19 diagnosis based on pseudo-lesion generation and restoration. We used Perlin noise, a mathematical model based on gradient noise, to generate lesion-like patterns, which were then randomly pasted onto the lung regions of normal CT images to produce pseudo COVID-19 images. The pairs of normal and pseudo COVID-19 images were then used to train a U-Net, an encoder-decoder architecture, for image restoration, a task that does not require any labelled data. The pretrained encoder was then fine-tuned on labelled data for the COVID-19 diagnosis task. Two public COVID-19 diagnosis datasets made up of CT images were employed for evaluation. Comprehensive experimental results demonstrated that the proposed self-supervised learning approach could extract better feature representations for COVID-19 diagnosis, and that the accuracy of the proposed method exceeded that of the supervised model pretrained on large-scale images by 6.57% and 3.03% on the SARS-CoV-2 dataset and the Jinan COVID-19 dataset, respectively.
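The pseudo-lesion generation step can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `perlin_noise` and `paste_pseudo_lesions`, the parameters `res`, `threshold`, `sharpness` and `intensity`, and the blending rule are all assumptions made for illustration, and the CT slice is assumed to be a 2D array normalized to [0, 1] with a binary lung mask already available.

```python
import numpy as np


def perlin_noise(shape, res, rng):
    """2D Perlin (gradient) noise of size `shape` with `res` lattice cells per axis."""
    h, w = shape
    ry, rx = res
    # One random unit gradient vector per lattice corner.
    angles = rng.uniform(0.0, 2.0 * np.pi, size=(ry + 1, rx + 1))
    grads = np.stack([np.sin(angles), np.cos(angles)], axis=-1)

    # Pixel positions expressed in lattice units.
    ys = np.arange(h) * ry / h
    xs = np.arange(w) * rx / w
    y0 = ys.astype(int)
    x0 = xs.astype(int)
    fy = (ys - y0)[:, None]  # fractional offset inside the cell, shape (h, 1)
    fx = (xs - x0)[None, :]  # shape (1, w)

    def corner(dy, dx):
        # Dot product of the corner gradient with the offset from that corner.
        g = grads[(y0 + dy)[:, None], (x0 + dx)[None, :]]
        return g[..., 0] * (fy - dy) + g[..., 1] * (fx - dx)

    def fade(t):
        # Perlin's smoothstep polynomial 6t^5 - 15t^4 + 10t^3.
        return t * t * t * (t * (t * 6.0 - 15.0) + 10.0)

    u, v = fade(fx), fade(fy)
    top = corner(0, 0) * (1.0 - u) + corner(0, 1) * u
    bottom = corner(1, 0) * (1.0 - u) + corner(1, 1) * u
    return top * (1.0 - v) + bottom * v


def paste_pseudo_lesions(ct, lung_mask, rng, res=(4, 4),
                         threshold=0.1, sharpness=4.0, intensity=0.4):
    """Blend a thresholded Perlin pattern into the lung region (hypothetical rule)."""
    noise = perlin_noise(ct.shape, res, rng)
    # Soft lesion map in [0, 1]: keep only the noise above the threshold.
    lesion = np.clip((noise - threshold) * sharpness, 0.0, 1.0) * lung_mask
    # Brighten lung tissue toward 1.0 to mimic a hazy opacity.
    pseudo = ct + intensity * lesion * (1.0 - ct)
    return pseudo, lesion
```

In this sketch, a pair `(pseudo, ct)` would serve as the input and the restoration target for the encoder-decoder network. The thresholding and blending choices only aim at a hazy, GGO-like appearance and would need tuning against real scans.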