Localizing anatomical landmarks is an important task in medical image analysis. However, the landmarks to be localized often lack prominent visual features. Their locations are elusive and easily confused with the background, so precise localization depends heavily on the context formed by the surrounding areas. In addition, the required precision is usually higher than in segmentation and object detection tasks. Localization therefore poses challenges distinct from those of segmentation or detection. In this paper, we propose a zoom-in attentive network (ZIAN) for anatomical landmark localization in ocular images. First, a coarse-to-fine, or "zoom-in", strategy is used to learn contextualized features at different scales. Then, an attentive fusion module aggregates the multi-scale features; it consists of 1) a co-attention network with a multiple regions-of-interest (ROIs) scheme that learns complementary features from the multiple ROIs, and 2) an attention-based fusion module that integrates the multi-ROI features and the non-ROI features. We evaluated ZIAN on two open challenge tasks: fovea localization in fundus images and scleral spur localization in AS-OCT images. Experiments show that ZIAN achieves promising performance and outperforms state-of-the-art localization methods. The source code and trained models of ZIAN are available at https://github.com/leixiaofeng-astar/OMIA9-ZIAN.
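To make the zoom-in and attentive-fusion ideas concrete, the following is a minimal NumPy sketch, not the ZIAN implementation. It assumes a coarse landmark estimate is already available, crops progressively smaller ROIs around it, and fuses per-ROI feature vectors with softmax attention weights. The function names (`crop_roi`, `attention_fuse`), the ROI sizes, the toy mean/std features, and the dot-product scoring against the mean feature are all illustrative assumptions; the actual network learns its features and attention, and its code is in the repository linked above.

```python
import numpy as np


def crop_roi(image, center, size):
    """Crop a size x size window around center (row, col), clamped to the image bounds.

    Hypothetical helper; assumes size <= image height and width.
    """
    h, w = image.shape[:2]
    half = size // 2
    top = int(min(max(center[0] - half, 0), h - size))
    left = int(min(max(center[1] - half, 0), w - size))
    return image[top:top + size, left:left + size], (top, left)


def attention_fuse(features):
    """Fuse feature vectors using softmax weights.

    Scores are scaled dot products between each feature vector and the mean
    feature vector (an assumed stand-in for a learned attention module).
    """
    feats = np.stack(features)                       # shape (n_rois, dim)
    query = feats.mean(axis=0)                       # shape (dim,)
    scores = feats @ query / np.sqrt(feats.shape[1])
    weights = np.exp(scores - scores.max())          # numerically stable softmax
    weights /= weights.sum()
    return weights @ feats, weights


rng = np.random.default_rng(0)
image = rng.random((256, 256))        # stand-in for a grayscale ocular image
coarse_center = (120, 200)            # assumed coarse landmark estimate (row, col)
roi_sizes = (128, 64, 32)             # assumed zoom levels, coarse to fine

# "Zoom in": crop nested ROIs of decreasing size around the coarse estimate.
rois = [crop_roi(image, coarse_center, s)[0] for s in roi_sizes]

# Toy per-ROI features (mean and standard deviation of intensity).
features = [np.array([r.mean(), r.std()]) for r in rois]

fused, weights = attention_fuse(features)
```

Here each smaller ROI carries finer local detail while the larger ones retain surrounding context; the attention weights decide how much each scale contributes to the fused representation.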