Diversity in data is critical for the successful training of deep learning models. Leveraging a recurrent generative adversarial network, we propose CT-SGAN, a model that generates large-scale 3D synthetic CT-scan volumes ($\geq 224\times224\times224$) when trained on a small dataset of chest CT-scans. CT-SGAN offers an attractive solution to two major challenges facing machine learning in medical imaging: the small amount of i.i.d. training data typically available, and the restrictions on sharing patient data, which prevent larger and more diverse datasets from being obtained rapidly. We evaluate the fidelity of the generated images qualitatively and quantitatively using various metrics, including Fr\'echet Inception Distance and Inception Score. We further show that CT-SGAN can significantly improve lung nodule detection accuracy by pre-training a classifier on a vast amount of synthetic data.