Due to the COVID-19 global pandemic, computer-assisted diagnosis of medical images has gained much attention, and robust methods for the semantic segmentation of Computed Tomography (CT) images have become highly desirable. In this work, we present a deeper analysis of how data augmentation techniques improve segmentation performance on this problem. We evaluate 20 traditional augmentation techniques on five public datasets, each with six different probabilities of applying the technique to an image. We also assess a different training methodology in which the training subsets are combined into a single larger set. All networks were evaluated with a 5-fold cross-validation strategy, resulting in over 4,600 experiments. We also propose a novel data augmentation technique based on Generative Adversarial Networks (GANs) to create new healthy and unhealthy lung CT images, and we evaluate four variations of our approach with the same six probabilities used for the traditional methods. Our findings show that GAN-based techniques and spatial-level transformations are the most promising for improving the learning of deep models on this problem, with StarGANv2 + F at a probability of 0.3 achieving the highest F-score on the Ricord1a dataset in the unified training strategy. Our code is publicly available at https://github.com/VRI-UFPR/DACov2022.
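
To illustrate what applying an augmentation "with a given probability" can mean for a segmentation task, the following is a minimal sketch and not the implementation from the repository above. It assumes a horizontal flip as the example spatial-level transformation; the function name `augment_with_probability` and its parameters are hypothetical. The key point is that a spatial transformation must be applied to the CT slice and its segmentation mask together, and only with probability `p`.

```python
import numpy as np


def augment_with_probability(image, mask, p, rng):
    """Flip image and mask horizontally, together, with probability p.

    Hypothetical helper for illustration only; the flip stands in for
    any spatial-level transformation.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    # rng.random() is uniform on [0, 1), so the branch is taken with probability p.
    if rng.random() < p:
        # Flip along the width axis; the mask must follow the image.
        return np.flip(image, axis=1).copy(), np.flip(mask, axis=1).copy()
    return image, mask


# Toy example: a 3x4 "slice" and a binary mask derived from it.
rng = np.random.default_rng(0)
image = np.arange(12, dtype=np.float32).reshape(3, 4)
mask = (image > 5).astype(np.uint8)
aug_image, aug_mask = augment_with_probability(image, mask, p=0.3, rng=rng)
```

With `p=0.0` the sample is always returned unchanged and with `p=1.0` it is always transformed, so intermediate values control how often the network sees augmented data during training.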