Arbitrary shape text detection is challenging because scene text is highly complex and varied. In this work, we propose a novel adaptive boundary proposal network for arbitrary shape text detection, which learns to directly produce accurate boundaries for arbitrary shape text without any post-processing. Our method consists mainly of a boundary proposal model and an innovative adaptive boundary deformation model. The boundary proposal model, built from multi-layer dilated convolutions, produces prior information (a classification map, a distance field, and a direction field) and coarse boundary proposals. The adaptive boundary deformation model is an encoder-decoder network whose encoder consists mainly of a Graph Convolutional Network (GCN) and a Recurrent Neural Network (RNN). Guided by the prior information from the boundary proposal model, it iteratively deforms each boundary proposal to obtain the shape of the text instance. Our method can therefore generate accurate text boundaries directly and efficiently, without complex post-processing. Extensive experiments on publicly available datasets demonstrate the state-of-the-art performance of our method.
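The iterative deformation step can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`ring_adjacency`, `sample_priors`, `deform`), the nearest-pixel sampling, the single graph-convolution layer, and the random untrained weights are all assumptions made for illustration, and the RNN branch of the encoder is omitted entirely. The sketch only shows the control flow the abstract describes: sample the prior maps at the current boundary points, aggregate features over the closed boundary graph, predict per-vertex offsets, and repeat.

```python
import numpy as np


def ring_adjacency(n):
    """Row-normalized adjacency of a closed polygon with n >= 3 vertices.

    Each vertex is linked to itself and to its two neighbours on the boundary.
    """
    adj = np.eye(n)
    idx = np.arange(n)
    adj[idx, (idx + 1) % n] = 1.0
    adj[idx, (idx - 1) % n] = 1.0
    return adj / adj.sum(axis=1, keepdims=True)


def sample_priors(prior, pts):
    """Nearest-pixel lookup of a (C, H, W) prior stack at (x, y) points.

    Returns an (N, C) feature matrix. Nearest-pixel sampling is an assumption.
    """
    _, height, width = prior.shape
    x = np.clip(np.rint(pts[:, 0]).astype(int), 0, width - 1)
    y = np.clip(np.rint(pts[:, 1]).astype(int), 0, height - 1)
    return prior[:, y, x].T


def deform(pts, prior, w_gcn, w_out, n_iter=3):
    """Refine boundary points for n_iter steps (hypothetical simplified model)."""
    adj = ring_adjacency(len(pts))
    for _ in range(n_iter):
        # Per-vertex features: sampled priors concatenated with coordinates.
        feats = np.concatenate([sample_priors(prior, pts), pts], axis=1)
        # One graph-convolution layer over the boundary ring, then ReLU.
        hidden = np.maximum(adj @ feats @ w_gcn, 0.0)
        # Linear decoder predicts a 2-D offset for every vertex.
        pts = pts + hidden @ w_out
    return pts


# Usage with random, untrained weights (illustrative only).
rng = np.random.default_rng(0)
prior = rng.random((4, 32, 32))  # e.g. classification, distance, direction (x, y)
proposal = np.array([[8.0, 8.0], [24.0, 8.0], [24.0, 24.0], [8.0, 24.0]])
w_gcn = rng.normal(scale=0.1, size=(4 + 2, 16))
w_out = rng.normal(scale=0.1, size=(16, 2))
refined = deform(proposal, prior, w_gcn, w_out)
print(refined.shape)  # (4, 2)
```

In a trained model the weights would be learned so that the predicted offsets move each vertex toward the true text boundary; here they are random, so only the shapes and the iteration structure are meaningful.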