Arbitrary-shape text detection is challenging because text instances vary widely in size and aspect ratio, appear in arbitrary orientations and shapes, and are often inaccurately annotated. Because pixel-level prediction adapts naturally to varied shapes, segmentation-based methods have attracted considerable attention recently. However, accurate pixel-level annotation of text is difficult to obtain, and existing scene text detection datasets provide only coarse-grained boundary annotations. Consequently, annotated regions almost always contain many mislabeled text pixels or background pixels, which degrades the performance of segmentation-based text detection methods. Generally speaking, whether a pixel belongs to text is closely related to its distance to the nearest annotation boundary. Based on this observation, we propose a robust segmentation-based detection method that uses probability maps to detect text instances accurately. Specifically, we adopt a Sigmoid Alpha Function (SAF) to convert the distances between a boundary and its interior pixels into a probability map. However, a single probability map cannot cover complex probability distributions well, because coarse-grained text boundary annotations are uncertain. Therefore, we adopt a group of probability maps computed by a series of Sigmoid Alpha Functions to describe the possible probability distributions. In addition, we propose an iterative model that learns to predict and assimilate probability maps, providing enough information to reconstruct text instances. Finally, simple region growth algorithms aggregate the probability maps into complete text instances. Experimental results demonstrate that our method achieves state-of-the-art detection accuracy on several benchmarks.
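To make the idea of a group of distance-based probability maps concrete, the following is a minimal sketch and not the formulation used in the paper. The abstract does not give the SAF formula, so the function `sigmoid_alpha` below is an assumed, normalized sigmoid chosen only for illustration; the helper names `boundary_distance` and `probability_maps`, the brute-force distance computation, and the alpha values are likewise hypothetical.

```python
import math


def sigmoid_alpha(d, alpha):
    """Assumed SAF-like mapping (NOT the paper's definition).

    Maps a normalized distance d in [0, 1] to a probability in [0, 1],
    with 0 at the boundary and 1 at the deepest interior pixel.
    Larger alpha makes the probability rise faster near the boundary.
    """
    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    return (2.0 * sigmoid(alpha * d) - 1.0) / (2.0 * sigmoid(alpha) - 1.0)


def boundary_distance(mask):
    """Brute-force Euclidean distance from each text pixel to the nearest
    non-text pixel. Pixels just outside the image count as non-text."""
    h, w = len(mask), len(mask[0])
    outside = [
        (r, c)
        for r in range(-1, h + 1)
        for c in range(-1, w + 1)
        if not (0 <= r < h and 0 <= c < w) or not mask[r][c]
    ]
    dist = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if mask[r][c]:
                dist[r][c] = min(math.hypot(r - br, c - bc) for br, bc in outside)
    return dist


def probability_maps(mask, alphas):
    """Build one probability map per alpha from the same distance map."""
    dist = boundary_distance(mask)
    d_max = max(max(row) for row in dist) or 1.0  # avoid division by zero
    return [
        [[sigmoid_alpha(d / d_max, a) if d > 0 else 0.0 for d in row] for row in dist]
        for a in alphas
    ]


# A 5x5 image whose coarse annotation covers the central 3x3 block.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
maps = probability_maps(mask, alphas=[1.0, 8.0])
for alpha, pmap in zip([1.0, 8.0], maps):
    print(f"alpha={alpha}: corner of region={pmap[1][1]:.3f}, center={pmap[2][2]:.3f}")
```

In this sketch a small alpha assigns low confidence to pixels near the annotated boundary, while a large alpha treats nearly the whole annotated region as text; using several alphas together is one way to hedge over how trustworthy the coarse boundary is.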