The early detection of glaucoma is essential in preventing visual impairment. Artificial intelligence (AI) can be used to analyze color fundus photographs (CFPs) in a cost-effective manner, making glaucoma screening more accessible. While AI models for glaucoma screening from CFPs have shown promising results in laboratory settings, their performance decreases significantly in real-world scenarios due to the presence of out-of-distribution and low-quality images. To address this issue, we propose the Artificial Intelligence for Robust Glaucoma Screening (AIROGS) challenge. This challenge includes a large dataset of around 113,000 images from about 60,000 patients and 500 different screening centers, and encourages the development of algorithms that are robust to ungradable and unexpected input data. In this paper, we evaluated solutions from 14 teams and found that the best teams performed similarly to a set of 20 expert ophthalmologists and optometrists. The highest-scoring team achieved an area under the receiver operating characteristic curve of 0.99 (95% CI: 0.98-0.99) for detecting ungradable images on the fly. Additionally, many of the algorithms showed robust performance when tested on three other publicly available datasets. These results demonstrate the feasibility of robust AI-enabled glaucoma screening.
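For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how an area under the ROC curve and a 95% confidence interval can be computed for a binary "ungradable" detector. It is illustrative only: the function names `roc_auc` and `bootstrap_ci`, the toy labels and scores, and the choice of a percentile bootstrap are assumptions made for this example, and the abstract does not state how the challenge derived its interval.

```python
import random


def roc_auc(labels, scores):
    """AUC as the probability that a random positive scores higher than a
    random negative, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method, chosen
    here for simplicity). Resamples containing a single class are redrawn."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined without both classes
        aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Hypothetical toy data: 1 = ungradable image, 0 = gradable image.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]

auc = roc_auc(labels, scores)
lo, hi = bootstrap_ci(labels, scores)
print(f"AUC = {auc:.4f} (95% CI: {lo:.2f}-{hi:.2f})")
```

The pairwise loop is quadratic in the number of images, which is fine for a toy example but would be replaced by a rank-based computation on a dataset of the size described above.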