Computer-Aided Diagnosis (CAD) systems for chest radiographs based on artificial intelligence (AI) have recently shown great potential as a second opinion for radiologists. The performance of such systems, however, has mostly been evaluated retrospectively on fixed datasets and can therefore be far from the performance observed in clinical practice. In this work, we demonstrate a mechanism for validating VinDr-CXR, an AI-based system for detecting abnormalities on chest X-ray scans, at the Phu Tho General Hospital, a provincial hospital in the North of Vietnam. The AI system was trained on a fixed annotated dataset from other sources and then integrated directly into the hospital's Picture Archiving and Communication System (PACS). Its performance was measured prospectively by matching and comparing the AI results with the radiology reports of 6,285 chest X-ray examinations extracted from the Hospital Information System (HIS) over the last two months of 2020. The normal/abnormal status of each radiology report was determined by a set of rules and served as the ground truth. The system achieved an F1 score (the harmonic mean of recall and precision) of 0.653 (95% CI 0.635, 0.671) for detecting any abnormality on chest X-rays. Despite a significant drop from the in-lab performance, this result establishes a high level of confidence in applying such a system in real-life situations.
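As an illustration of the reported metric, the sketch below computes an F1 score from binary normal/abnormal labels and estimates a 95% confidence interval with a percentile bootstrap. The abstract does not state how the interval was obtained, so the bootstrap is an assumption made here for illustration, and the function names `f1_score` and `bootstrap_ci` are hypothetical rather than part of VinDr-CXR. Labels are assumed to be encoded as 1 for abnormal and 0 for normal, with `y_true` derived from the rule-based reading of the radiology reports and `y_pred` from the AI output.

```python
import random


def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive (abnormal) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def bootstrap_ci(y_true, y_pred, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for F1 (an assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_resamples):
        # Resample examinations with replacement, keeping label pairs together.
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(f1_score([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    scores.sort()
    lower = scores[int((alpha / 2) * n_resamples)]
    upper = scores[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


if __name__ == "__main__":
    # Toy labels for demonstration only; these are not the study's data.
    y_true = [1, 1, 0, 0, 1, 0, 1, 0]
    y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
    print(f1_score(y_true, y_pred), bootstrap_ci(y_true, y_pred))
```

Resampling whole examinations keeps each AI result paired with its report label, which is why the indices rather than the two lists are resampled.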