Segmentation-based methods are widely used for scene text detection due to their superiority in describing arbitrary-shaped text instances. However, two major problems still exist: 1) current label generation techniques are mostly empirical and lack theoretical support, discouraging elaborate label design; 2) as a result, most methods rely heavily on text kernel segmentation, which is unstable and requires deliberate tuning. To address these challenges, we propose a human cognition-inspired framework, termed Conceptual Text Region Network (CTRNet). The framework utilizes Conceptual Text Regions (CTRs), a class of cognition-based tools with good mathematical properties that allow for sophisticated label design. Another component of CTRNet is an inference pipeline that, with the help of CTRs, completely omits the need for text kernel segmentation. Compared with previous segmentation-based methods, our approach is not only more interpretable but also more accurate. Experimental results show that CTRNet achieves state-of-the-art performance on the CTW1500, Total-Text, MSRA-TD500, and ICDAR 2015 benchmarks, yielding performance gains of up to 2.0%. Notably, to the best of our knowledge, CTRNet is among the first detection models to achieve F-measures higher than 85.0% on all four of these benchmarks, with remarkable consistency and stability.