Arbitrary-shaped text detection has recently attracted increasing interest and has developed rapidly with the popularity of deep learning algorithms. Nevertheless, existing approaches often produce inaccurate detection results, mainly because of their relatively weak ability to utilize context information and their inappropriate choice of offset references. This paper presents a novel text instance expression that integrates both foreground and background information into the pipeline and naturally uses the pixels near text boundaries as the starting points of the offsets. In addition, a corresponding post-processing algorithm is designed to sequentially combine the four prediction results and reconstruct the text instance accurately. We evaluate our method on several challenging scene text benchmarks, including both curved and multi-oriented text datasets. Experimental results demonstrate that the proposed approach achieves performance superior to or competitive with other state-of-the-art methods, e.g., an F-score of 83.4% on Total-Text and 82.4% on MSRA-TD500.
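For readers unfamiliar with the reported metric, the sketch below shows how an F-score is commonly computed, assuming it is the usual harmonic mean of precision and recall. The function name `f_score` and the input values are hypothetical illustrations, not figures or code from the paper, and the benchmark-specific matching rules that decide which detections count as correct are not modeled here.

```python
def f_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (the F1 score).

    Assumption: this is the standard F-score definition; the matching
    protocol that yields precision and recall is outside this sketch.
    """
    if precision + recall == 0:
        # No correct detections at all: define the score as zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical precision/recall values, chosen only for illustration.
example = f_score(0.85, 0.80)
print(f"F-score: {example:.3f}")
```

Because the harmonic mean penalizes imbalance, a detector with high precision but low recall (or the reverse) scores lower than one with both values at a moderate level.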