Semantic segmentation in cataract surgery has a wide range of applications that contribute to better surgical outcomes and reduced clinical risk. However, each relevant structure in these surgeries poses different segmentation challenges, which makes designing a single network for all of them difficult. This paper proposes a semantic segmentation network, termed DeepPyramid, that addresses these challenges with three novel components: (1) a Pyramid View Fusion module, which provides a varying-angle global view of the region surrounding each pixel position in the input convolutional feature map; (2) a Deformable Pyramid Reception module, which enables a wide deformable receptive field that can adapt to geometric transformations of the object of interest; and (3) a dedicated Pyramid Loss that adaptively supervises multi-scale semantic feature maps. We show that, combined, these modules effectively boost semantic segmentation performance, especially for objects that exhibit transparency, deformability, scale variation, and blunt edges. We demonstrate that our approach performs at a state-of-the-art level and outperforms a number of existing methods by a large margin (3.66% overall improvement in intersection over union compared to the best rival approach).
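For readers unfamiliar with the reported metric, the following is a minimal sketch of how intersection over union (IoU) can be computed for label maps. It is an illustration only and not the evaluation code used for DeepPyramid; the function names `iou` and `mean_iou`, the toy label maps, and the class labels are all hypothetical.

```python
from typing import List, Sequence


def iou(pred: Sequence[int], target: Sequence[int], cls: int) -> float:
    """IoU of one class over two flattened label maps (illustrative helper)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of pixels")
    # Pixels labelled `cls` in both maps.
    inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
    # Pixels labelled `cls` in at least one map.
    union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
    # A class absent from both maps has no defined IoU.
    return inter / union if union else float("nan")


def mean_iou(pred: Sequence[int], target: Sequence[int], classes: Sequence[int]) -> float:
    """Average IoU over the classes that occur in at least one of the maps."""
    scores: List[float] = [iou(pred, target, c) for c in classes]
    valid = [s for s in scores if s == s]  # NaN is the only value not equal to itself
    return sum(valid) / len(valid) if valid else float("nan")


# Toy 2x4 label maps flattened row by row (assumed labels: 0 = background, 1 = object).
target = [0, 0, 1, 1,
          0, 1, 1, 1]
pred = [0, 1, 1, 1,
        0, 0, 1, 1]

object_iou = iou(pred, target, cls=1)
overall = mean_iou(pred, target, classes=[0, 1])
```

In this toy case the object class overlaps on 4 pixels out of a union of 6, so small boundary errors already lower the score noticeably, which is one reason the metric is sensitive to blunt or poorly defined edges.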