In this paper, we propose a novel scene text detection method named TextMountain. The key idea of TextMountain is to make full use of border-center information. Unlike previous works, which treat center-border as a binary classification problem, we predict the text center-border probability (TCBP) and the text center-direction (TCD). The TCBP resembles a mountain whose top is the text center and whose foot is the text border. The mountaintop separates text instances, which is hard to achieve with a semantic segmentation map, and the rising direction of the mountain plans a path to the top for each pixel at the mountain foot during the grouping stage. The TCD helps the TCBP to be learned better. Our labeling rules do not become ambiguous under angle transformation, so the proposed method is robust to multi-oriented text and also handles curved text well. In the inference stage, each pixel at the mountain foot searches for a path to the mountaintop; this search can be completed efficiently in parallel, which makes our method efficient compared with others. Experiments on the MLT, ICDAR2015, RCTW-17 and SCUT-CTW1500 datasets demonstrate that the proposed method achieves better or comparable performance in terms of both accuracy and efficiency. Notably, our method achieves an F-measure of 76.85% on MLT, outperforming previous methods by a large margin. Code will be made available.
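The grouping idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `group_pixels`, the thresholds `top_thresh` and `foot_thresh`, and the use of 4-connectivity are assumptions made for illustration. The sketch also climbs sequentially along the steepest ascent of the TCBP map, whereas the method described here uses the predicted rising direction and runs the search in parallel.

```python
from collections import deque

NEIGHBORS = ((-1, 0), (1, 0), (0, -1), (0, 1))  # 4-connectivity (an assumption)


def group_pixels(tcbp, top_thresh=0.6, foot_thresh=0.2):
    """Assign text pixels to mountaintops by climbing a TCBP map.

    tcbp is a 2D list of probabilities in [0, 1]. Both thresholds are
    hypothetical values chosen for this sketch, not taken from the paper.
    Returns a 2D list of instance labels (0 means unassigned).
    """
    h, w = len(tcbp), len(tcbp[0])
    labels = [[0] * w for _ in range(h)]

    def around(y, x):
        # Yield the in-bounds neighbors of (y, x).
        for dy, dx in NEIGHBORS:
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                yield ny, nx

    # Step 1: label each connected mountaintop region (breadth-first search).
    count = 0
    for y in range(h):
        for x in range(w):
            if tcbp[y][x] < top_thresh or labels[y][x]:
                continue
            count += 1
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in around(cy, cx):
                    if tcbp[ny][nx] >= top_thresh and not labels[ny][nx]:
                        labels[ny][nx] = count
                        queue.append((ny, nx))

    # Step 2: each mountain-foot pixel climbs to its highest neighbor
    # until it reaches a pixel that already carries a label.
    for y in range(h):
        for x in range(w):
            if labels[y][x] or tcbp[y][x] < foot_thresh:
                continue
            path, cy, cx = [], y, x
            while not labels[cy][cx]:
                path.append((cy, cx))
                ny, nx = max(around(cy, cx),
                             key=lambda p: tcbp[p[0]][p[1]],
                             default=(cy, cx))
                if tcbp[ny][nx] <= tcbp[cy][cx]:
                    break  # local peak below top_thresh: leave unassigned
                cy, cx = ny, nx
            for py, px in path:
                labels[py][px] = labels[cy][cx]  # stays 0 if the climb failed
    return labels
```

For a single-row map with two peaks, `group_pixels([[0.3, 0.7, 0.3, 0.1, 0.3, 0.8, 0.4]])` returns `[[1, 1, 1, 0, 2, 2, 2]]`: the low-probability pixel between the two mountains stays unassigned, and the remaining pixels join the peak they climb to.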