Automatic diabetic retinopathy (DR) grading based on fundus photography has been widely explored to support routine screening and early treatment. Existing research generally focuses on single-field fundus images, which offer a limited field of view for precise eye examinations. In clinical applications, ophthalmologists adopt two-field fundus photography as the dominant tool, where the information from the two fields (i.e., macula-centric and optic disc-centric) is highly correlated and complementary, and supports comprehensive decisions. However, automatic DR grading based on two-field fundus photography remains a challenging task due to the lack of publicly available datasets and effective fusion strategies. In this work, we first construct a new benchmark dataset (DRTiD) for DR grading, consisting of 3,100 two-field fundus images. To the best of our knowledge, it is the largest public DR dataset with diverse and high-quality two-field images. Then, we propose a novel DR grading approach, namely Cross-Field Transformer (CrossFiT), to capture the correspondence between the two fields as well as the long-range spatial correlations within each field. Considering the inherent two-field geometric constraints, we define aligned position embeddings to preserve consistent relative positions in the fundus. In addition, we perform masked cross-field attention during interaction to filter out noisy relations between fields. Extensive experiments on our DRTiD dataset and the public DeepDRiD dataset demonstrate the effectiveness of our CrossFiT network. The new dataset and the source code of CrossFiT will be publicly available at https://github.com/FDU-VTS/DRTiD.
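To make the idea of masked cross-field attention concrete, the following is a minimal single-head sketch in NumPy. It is not the CrossFiT implementation: the abstract does not specify how the mask is built, so the distance-based rule in `distance_mask`, the `radius` parameter, and all function and variable names here are assumptions chosen purely for illustration. The sketch only shows the general mechanism, in which tokens of one field attend to tokens of the other field and disallowed pairs receive zero attention weight.

```python
import numpy as np


def distance_mask(pos_q, pos_kv, radius):
    """Hypothetical mask rule (an assumption, not taken from the paper).

    pos_q:  (n_q, 2) token positions of the querying field
    pos_kv: (n_kv, 2) token positions of the other field, assumed to be
            expressed in the same aligned coordinate frame
    Returns an (n_q, n_kv) boolean array, True where attention is allowed.
    """
    diff = pos_q[:, None, :] - pos_kv[None, :, :]
    return np.linalg.norm(diff, axis=-1) <= radius


def masked_cross_attention(q_tokens, kv_tokens, mask):
    """Single-head cross-attention from one field's tokens to the other's.

    q_tokens:  (n_q, d) features of the querying field
    kv_tokens: (n_kv, d) features of the other field (used as keys and values)
    mask:      (n_q, n_kv) boolean, True where attention is allowed
    Returns (output, weights) with shapes (n_q, d) and (n_q, n_kv).
    """
    d = q_tokens.shape[-1]
    scores = q_tokens @ kv_tokens.T / np.sqrt(d)
    # Disallowed pairs get -inf so that exp() maps them to exactly zero.
    scores = np.where(mask, scores, -np.inf)
    # Numerically stable softmax; fully masked rows fall back to a max of 0.
    row_max = scores.max(axis=-1, keepdims=True)
    row_max = np.where(np.isfinite(row_max), row_max, 0.0)
    weights = np.exp(scores - row_max)
    denom = weights.sum(axis=-1, keepdims=True)
    # Rows with no allowed key keep all-zero weights instead of producing NaN.
    weights = np.divide(weights, denom, out=np.zeros_like(weights), where=denom > 0)
    return weights @ kv_tokens, weights
```

In a two-field setting one would typically call `masked_cross_attention` twice, once with the macula-centric tokens as queries and once with the optic disc-centric tokens as queries. Learned query, key, and value projections and multiple heads are omitted here to keep the sketch short.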