Lung cancer is the leading cause of cancer death and morbidity worldwide. Many studies have shown machine learning models to be effective at detecting lung nodules in chest X-ray images. However, these techniques have yet to be embraced by the medical community because of practical, ethical, and regulatory constraints stemming from the black-box nature of deep learning models. Additionally, most lung nodules visible on chest X-ray are benign; therefore, the narrow task of computer vision-based lung nodule detection cannot be equated with automated lung cancer detection. Addressing both concerns, this study introduces a novel hybrid deep learning and decision tree-based computer vision model that presents lung cancer malignancy predictions as interpretable decision trees. The deep learning component is trained on a large, publicly available dataset of pathological biomarkers associated with lung cancer. The trained models are then used to infer biomarker scores for chest X-ray images from two independent datasets for which malignancy metadata is available. We mine multivariate predictive models by fitting shallow decision trees to the malignancy-stratified datasets and examine a range of metrics to determine the best model. Our best decision tree model achieves a sensitivity of 86.7% and a specificity of 80.0%, with a positive predictive value of 92.9%. Decision trees mined using this method may be considered a starting point for refinement into clinically useful multivariate lung cancer malignancy models, to be implemented as a workflow augmentation tool that improves the efficiency of human radiologists.
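
As an illustration of the final stage described above, the following minimal sketch shows how a shallow decision tree over inferred biomarker scores yields a readable malignancy rule, and how sensitivity, specificity, and positive predictive value follow from the resulting predictions. The biomarker names (`mass`, `nodule`), the thresholds, and the tree itself are hypothetical and are not taken from the study. The confusion counts in the usage example (13 true positives, 2 false negatives, 4 true negatives, 1 false positive) are likewise an assumption: they are one set of counts consistent with the reported 86.7%, 80.0%, and 92.9%, not figures stated in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    sensitivity: float  # TP / (TP + FN)
    specificity: float  # TN / (TN + FP)
    ppv: float          # TP / (TP + FP)


def _ratio(numerator, denominator):
    # Return NaN rather than raising when a class is absent.
    return numerator / denominator if denominator else float("nan")


def screening_metrics(y_true, y_pred):
    """Compute metrics for binary labels, where 1 = malignant, 0 = benign."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return Metrics(_ratio(tp, tp + fn), _ratio(tn, tn + fp), _ratio(tp, tp + fp))


def shallow_tree_predict(scores):
    """Hypothetical depth-2 tree over biomarker scores in [0, 1].

    The feature names and thresholds are illustrative assumptions only.
    """
    if scores["mass"] >= 0.5:
        return 1  # high mass score: predict malignant
    if scores["nodule"] >= 0.7:
        return 1  # low mass score but very high nodule score
    return 0      # otherwise predict benign


# Hypothetical labels: 15 malignant and 5 benign cases.
y_true = [1] * 15 + [0] * 5
# Hypothetical predictions: 13 TP, 2 FN, then 4 TN, 1 FP.
y_pred = [1] * 13 + [0] * 2 + [0] * 4 + [1] * 1

metrics = screening_metrics(y_true, y_pred)
summary = tuple(
    round(100 * value, 1)
    for value in (metrics.sensitivity, metrics.specificity, metrics.ppv)
)
print(summary)
```

Because each prediction is the outcome of at most two threshold comparisons, a reader can trace exactly which biomarker scores led to a malignant or benign call, which is the interpretability property the abstract emphasizes.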