Tuberculosis (TB) is a top-10 cause of death worldwide. Though the WHO recommends chest radiographs (CXRs) for TB screening, the limited availability of CXR interpretation is a barrier. We trained a deep learning system (DLS) to detect active pulmonary TB using CXRs from 9 countries across Africa, Asia, and Europe, and utilized large-scale CXR pretraining, attention pooling, and noisy student semi-supervised learning. We evaluated the DLS on (1) a combined test set spanning China, India, US, and Zambia, and (2) an independent mining population in South Africa. Given WHO targets of 90% sensitivity and 70% specificity, the DLS's operating point was prespecified to favor sensitivity over specificity. On the combined test set, the DLS's ROC curve was above all 9 India-based radiologists, with an AUC of 0.90 (95% CI 0.87-0.92). The DLS's sensitivity (88%) was higher than that of the India-based radiologists (75% mean sensitivity; p<0.001 for superiority), and its specificity (79%) was non-inferior to that of the radiologists (84% mean specificity; p=0.004). Similar trends were observed within HIV-positive and sputum smear-positive subgroups, and in the South Africa test set. We found that 5 US-based radiologists (practicing where TB is not endemic) were more sensitive and less specific than the India-based radiologists (practicing where TB is endemic). The DLS also remained non-inferior to the US-based radiologists. In simulations, using the DLS as a prioritization tool for confirmatory testing reduced the cost per positive case detected by 40-80% compared to using confirmatory testing alone. To conclude, our DLS generalized to 5 countries, and merits prospective evaluation to assist cost-effective screening efforts in radiologist-limited settings. Operating point flexibility may permit customization of the DLS to account for site-specific factors such as TB prevalence, demographics, clinical resources, and customary practice patterns.
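The cost comparison mentioned above can be illustrated with a minimal sketch. It is not the simulation used in the study: the function names, the prevalence, and both unit costs below are hypothetical, and the confirmatory test is assumed to be perfectly accurate. Only the DLS sensitivity (88%) and specificity (79%) are taken from the reported results.

```python
def cost_per_positive_confirmatory_only(prevalence: float, confirm_cost: float) -> float:
    """Everyone receives the confirmatory test (assumed perfectly accurate)."""
    # Cost per person screened is confirm_cost; positives found per person = prevalence.
    return confirm_cost / prevalence


def cost_per_positive_with_triage(
    prevalence: float,
    sensitivity: float,
    specificity: float,
    cxr_cost: float,
    confirm_cost: float,
) -> float:
    """Everyone receives a CXR read by the triage tool; only flagged cases are confirmed."""
    true_positive_rate = prevalence * sensitivity
    false_positive_rate = (1.0 - prevalence) * (1.0 - specificity)
    flagged = true_positive_rate + false_positive_rate
    # Expected cost per person screened: one CXR plus confirmation for flagged cases.
    cost_per_person = cxr_cost + flagged * confirm_cost
    # Only true positives that were flagged are detected.
    return cost_per_person / true_positive_rate


# Hypothetical inputs (assumptions for illustration, not values from the study).
PREVALENCE = 0.05
CXR_COST = 1.0
CONFIRM_COST = 10.0

# Operating point reported for the DLS on the combined test set.
DLS_SENSITIVITY = 0.88
DLS_SPECIFICITY = 0.79

alone = cost_per_positive_confirmatory_only(PREVALENCE, CONFIRM_COST)
triage = cost_per_positive_with_triage(
    PREVALENCE, DLS_SENSITIVITY, DLS_SPECIFICITY, CXR_COST, CONFIRM_COST
)
reduction = 1.0 - triage / alone

print(f"confirmatory only: {alone:.2f} per positive case")
print(f"with triage:       {triage:.2f} per positive case")
print(f"relative reduction: {reduction:.0%}")
```

In this simplified model, the saving depends strongly on prevalence and on the ratio of the two unit costs, which is one reason a site might choose a different operating point. The model also ignores the cases missed by the triage step, which a fuller analysis would need to account for.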