Detecting the marking characters on industrial metal parts remains challenging due to the low visual contrast, uneven illumination, corroded character structures, and cluttered backgrounds of metal part images. Because of these factors, the bounding boxes generated by most existing methods localize low-contrast text areas inaccurately. In this paper, we propose a refined feature-attentive network (RFN) to solve this inaccurate localization problem. Specifically, we design a parallel feature integration mechanism that constructs an adaptive feature representation from multi-resolution features, enhancing the perception of multi-scale text at each scale-specific level and yielding a high-quality attention map. An attentive refinement network then uses the attention map to rectify the location deviation of candidate boxes. In addition, a re-scoring mechanism is designed to select the text boxes with the best rectified locations. Moreover, we construct two industrial scene text datasets comprising a total of 102,156 images and 1,948,809 text instances with various character structures and metal parts. Extensive experiments on our dataset and four public datasets demonstrate that the proposed method achieves state-of-the-art performance.