COVID-19 frequently causes pneumonia, which can be diagnosed with imaging exams. Chest X-ray (CXR) is often useful because it is cheap, fast, widely available, and exposes patients to less radiation than alternatives such as computed tomography. Here, we demonstrate the impact of lung segmentation on COVID-19 identification from CXR images and evaluate which image regions most influenced the classification. Semantic segmentation was performed with a U-Net CNN architecture, and classification with three CNN architectures (VGG, ResNet, and Inception). Explainable Artificial Intelligence techniques were employed to estimate the impact of segmentation. We composed a three-class database: lung opacity (pneumonia), COVID-19, and normal. We assessed the impact of building a CXR image database from different sources, and how well COVID-19 identification generalizes from one source to another. The segmentation achieved a Jaccard distance of 0.034 and a Dice coefficient of 0.982. Classification using segmented images achieved an F1-Score of 0.88 in the multi-class setup and 0.83 for COVID-19 identification. In the cross-dataset scenario, we obtained an F1-Score of 0.74 and an area under the ROC curve of 0.9 for COVID-19 identification using segmented images. The experiments support the conclusion that, even after segmentation, underlying factors that differ between sources introduce a strong bias.
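
For readers unfamiliar with the two segmentation metrics reported above, the following is a minimal sketch of their standard definitions for binary masks. The abstract does not say how the metrics were computed or averaged, so this is an illustration only: the function names and the toy masks are hypothetical, and plain Python lists stand in for real image arrays.

```python
def _overlap_counts(pred, target):
    """Return (intersection, pred_size, target_size) for two binary masks.

    Masks are nested lists of 0/1 values (a stand-in for image arrays).
    """
    p = [bool(v) for row in pred for v in row]
    t = [bool(v) for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(a and b for a, b in zip(p, t))
    return intersection, sum(p), sum(t)


def dice_coefficient(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|); 1.0 means perfect overlap."""
    intersection, n_pred, n_target = _overlap_counts(pred, target)
    total = n_pred + n_target
    if total == 0:
        return 1.0  # assumption: two empty masks count as a perfect match
    return 2.0 * intersection / total


def jaccard_distance(pred, target):
    """Jaccard distance = 1 - |A ∩ B| / |A ∪ B|; 0.0 means perfect overlap."""
    intersection, n_pred, n_target = _overlap_counts(pred, target)
    union = n_pred + n_target - intersection
    if union == 0:
        return 0.0  # assumption: two empty masks have zero distance
    return 1.0 - intersection / union


# Toy 4x4 masks (hypothetical data): each covers 6 pixels, 4 of them shared.
target_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
predicted_mask = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]

dice = dice_coefficient(predicted_mask, target_mask)
jaccard = jaccard_distance(predicted_mask, target_mask)
print(f"Dice coefficient: {dice:.3f}")
print(f"Jaccard distance: {jaccard:.3f}")
```

For a single pair of masks the two metrics are tied by the identity Jaccard distance = 1 - D / (2 - D), where D is the Dice coefficient. Plugging in D = 0.982 gives about 0.035, which is close to the reported 0.034; a small gap of this kind could arise if the metrics are averaged per image, though the abstract does not specify this.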