Segmentation-based methods have recently drawn extensive attention in scene text detection because their pixel-level descriptions make them well suited to text instances of arbitrary shapes and extreme aspect ratios. However, the vast majority of existing segmentation-based approaches are limited by their complex post-processing algorithms and by the scale robustness of their segmentation models: the post-processing is not only isolated from model optimization but also time-consuming, and scale robustness is usually strengthened by directly fusing multi-scale feature maps. In this paper, we propose a Differentiable Binarization (DB) module that integrates the binarization process, one of the most important steps in post-processing, into a segmentation network. Optimized along with the proposed DB module, the segmentation network produces more accurate results, which improves text detection accuracy while keeping the pipeline simple. Furthermore, we propose an efficient Adaptive Scale Fusion (ASF) module that improves scale robustness by fusing features of different scales adaptively. By incorporating the proposed DB and ASF modules into the segmentation network, our scene text detector consistently achieves state-of-the-art results, in terms of both detection accuracy and speed, on five standard benchmarks.
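To illustrate why binarization can be folded into training, the following minimal sketch contrasts a hard threshold with a smooth approximation. The abstract does not give the exact formulation, so this assumes a steep sigmoid over the difference between a probability value `p` and a threshold value `t`; the function names and the amplification factor `k = 50.0` are illustrative assumptions, not a description of the authors' implementation.

```python
import math


def hard_binarize(p, t):
    """Conventional step binarization: its gradient is zero almost everywhere."""
    return 1.0 if p >= t else 0.0


def soft_binarize(p, t, k=50.0):
    """Assumed smooth surrogate: a steep sigmoid of (p - t).

    k is a hypothetical amplification factor; larger values bring the
    curve closer to the hard step while keeping it differentiable.
    """
    return 1.0 / (1.0 + math.exp(-k * (p - t)))


def soft_binarize_grad_p(p, t, k=50.0):
    """Derivative of soft_binarize with respect to p: k * b * (1 - b)."""
    b = soft_binarize(p, t, k)
    return k * b * (1.0 - b)


def binarize_map(prob_map, thresh_map, k=50.0):
    """Apply the soft binarization element-wise to two equally sized 2D lists."""
    return [
        [soft_binarize(p, t, k) for p, t in zip(p_row, t_row)]
        for p_row, t_row in zip(prob_map, thresh_map)
    ]


if __name__ == "__main__":
    prob = [[0.9, 0.2], [0.55, 0.45]]    # toy probability map
    thresh = [[0.3, 0.3], [0.5, 0.5]]    # toy per-pixel threshold map
    for row in binarize_map(prob, thresh):
        print([round(v, 3) for v in row])
```

Because the surrogate has a non-zero gradient near the threshold, a loss computed on the binarized map can propagate back to both the probability and the threshold, which is the property that lets the binarization step be optimized together with the segmentation network.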