Lung cancer is the leading cause of cancer death worldwide, and a good prognosis depends on early diagnosis. Unfortunately, screening programs for the early diagnosis of lung cancer are uncommon. This is in part because at-risk groups are often located in rural areas far from medical facilities. Reaching these populations would require a scaled approach that combines mobility, low cost, speed, accuracy, and privacy. We can resolve these issues by combining the chest X-ray imaging modality with a federated deep-learning approach, provided that the federated model is trained on homogeneous data to ensure that no single data source can adversely bias the model at any point in time. In this study we show that an image pre-processing pipeline that homogenizes and debiases chest X-ray images can improve both internal classification and external generalization, paving the way for a low-cost and accessible deep learning-based clinical system for lung cancer screening. An evolutionary pruning mechanism is used to train a nodule detection deep learning model on the most informative images from a publicly available lung nodule X-ray dataset. Histogram equalization is used to remove systematic differences in image brightness and contrast. Model training is performed using all combinations of lung field segmentation, close cropping, and rib suppression operators. We show that this pre-processing pipeline yields deep learning models that generalize successfully to an independent lung nodule dataset, and we use ablation studies to assess the contribution of each operator in the pipeline. By stripping chest X-ray images of known confounding variables through lung field segmentation, and by suppressing signal noise from bone structure, we can train a highly accurate deep learning lung nodule detection algorithm that achieves a generalization accuracy of 89% on nodule samples in unseen data.
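As a rough illustration of two steps named in the abstract, the sketch below shows global histogram equalization of an 8-bit grayscale image and an enumeration of the operator combinations for an ablation study. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `equalize_histogram` and `operator_subsets` are hypothetical, the abstract does not say which equalization variant was used, and including the empty subset as a no-operator baseline is an assumption.

```python
from itertools import combinations

import numpy as np

# Operator names taken from the abstract; the identifiers are illustrative only.
OPERATORS = ("lung_field_segmentation", "close_cropping", "rib_suppression")


def equalize_histogram(image: np.ndarray) -> np.ndarray:
    """Global histogram equalization for an 8-bit grayscale image (assumed variant)."""
    hist = np.bincount(image.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]  # cumulative count at the darkest populated level
    total = cdf[-1]
    if total == cdf_min:
        # Constant image: nothing to spread, return it unchanged.
        return image.copy()
    # Map each gray level through the normalized cumulative distribution.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[image]


def operator_subsets(operators=OPERATORS):
    """All subsets of the operators, including the empty baseline (an assumption)."""
    return [
        subset
        for size in range(len(operators) + 1)
        for subset in combinations(operators, size)
    ]


if __name__ == "__main__":
    # Synthetic low-contrast image standing in for a chest X-ray.
    rng = np.random.default_rng(0)
    xray = rng.integers(90, 140, size=(64, 64), dtype=np.uint8)
    equalized = equalize_histogram(xray)
    print(xray.min(), xray.max(), "->", equalized.min(), equalized.max())
    for subset in operator_subsets():
        print(subset or ("baseline",))
```

Each subset would correspond to one training configuration, so the contribution of an operator can be judged by comparing configurations that differ only in whether it is present.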