Most existing object detection methods produce poor glass detection results because transparent glass shares its appearance with whatever objects lie behind it in an image. Unlike conventional deep learning-based approaches that simply use the object boundary as auxiliary supervision, we exploit label decoupling to decompose the original labeled ground-truth (GT) map into an interior-diffusion map and a boundary-diffusion map. Used together with the GT map, the two newly generated maps counteract the imbalanced distribution of the object boundary, leading to improved glass detection quality. We make three key contributions to transparent glass detection: (1) We propose a three-stream neural network (called GlassNet for short) to fully absorb the beneficial features in the three maps. (2) We design a multi-scale interactive dilation module to explore a wider range of contextual information. (3) We develop an attention-based boundary-aware feature Mosaic module to integrate multi-modal information. Extensive experiments on the benchmark dataset show clear improvements of our method over state-of-the-art methods in terms of both overall glass detection accuracy and boundary clarity.
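The abstract does not state how the GT map is split, so the following is only a minimal sketch of one plausible form of label decoupling, assuming a distance-transform-based decomposition: the interior-diffusion map is taken as the normalized distance of each glass pixel to the nearest background pixel, and the boundary-diffusion map as the remainder of the GT map. The function names `distance_to_background` and `decouple`, the city-block distance, and the normalization are illustrative assumptions, not the authors' implementation.

```python
from collections import deque


def distance_to_background(mask):
    """City-block distance from every pixel to the nearest background (0) pixel.

    Assumption: `mask` is a non-empty list of equal-length rows of 0/1 values
    and contains at least one background pixel.
    """
    h, w = len(mask), len(mask[0])
    dist = [[0 if mask[y][x] == 0 else None for x in range(w)] for y in range(h)]
    queue = deque((y, x) for y in range(h) for x in range(w) if mask[y][x] == 0)
    if not queue:
        raise ValueError("mask must contain at least one background pixel")
    # Multi-source breadth-first search outward from all background pixels.
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and dist[ny][nx] is None:
                dist[ny][nx] = dist[y][x] + 1
                queue.append((ny, nx))
    return dist


def decouple(mask):
    """Split a binary GT map into (interior, boundary) maps that sum to the GT.

    Hypothetical decomposition: interior = normalized distance transform,
    boundary = GT - interior, so boundary values peak at the object edge.
    """
    dist = distance_to_background(mask)
    peak = max(max(row) for row in dist)
    if peak == 0:  # no foreground at all: both maps are empty
        zeros = [[0.0 for _ in row] for row in mask]
        return zeros, [row[:] for row in zeros]
    interior = [[d / peak for d in row] for row in dist]
    boundary = [
        [mask[y][x] - interior[y][x] for x in range(len(mask[0]))]
        for y in range(len(mask))
    ]
    return interior, boundary
```

Under these assumptions, the interior map is largest deep inside the glass region and fades toward its edge, while the boundary map does the opposite, so supervising three streams with the GT map and the two derived maps spreads the learning signal across interior and boundary pixels rather than concentrating it on a thin edge.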