Over the past few years, the field of scene text detection has progressed so rapidly that modern text detectors can locate text in a variety of challenging scenarios. However, they may still fall short when handling text instances with extreme aspect ratios or widely varying scales. To tackle these difficulties, we propose a new algorithm for scene text detection that introduces a set of strategies to significantly improve the quality of text localization. Specifically, a Text Feature Alignment Module (TFAM) dynamically adjusts the receptive fields of features based on initial raw detections; a Position-Aware Non-Maximum Suppression (PA-NMS) module selectively concentrates on reliable raw detections and excludes unreliable ones; and an Instance-wise IoU loss balances training across text instances of different scales. An extensive ablation study demonstrates the effectiveness of the proposed strategies. The resulting text detection system, which integrates these strategies into the leading scene text detector EAST, achieves state-of-the-art or competitive performance on various standard text detection benchmarks while maintaining a fast running speed.
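The abstract does not give the formula for the Instance-wise IoU loss, so the following is only an illustrative sketch of one plausible reading: per-pixel IoU losses are first averaged within each text instance and then averaged across instances, so that a large instance with many pixels does not dominate a small one. The EAST-style box encoding (top, right, bottom, left distances from a pixel), the `-log(IoU)` form, and all function names here are assumptions, not the authors' definitions.

```python
import math
from collections import defaultdict


def pixel_iou(pred, gt):
    """IoU of two axis-aligned boxes seen from the same pixel.

    Each box is (top, right, bottom, left): non-negative distances from
    the pixel to the four box edges (an assumed EAST-style encoding).
    """
    p_t, p_r, p_b, p_l = pred
    g_t, g_r, g_b, g_l = gt
    inter_h = min(p_t, g_t) + min(p_b, g_b)
    inter_w = min(p_r, g_r) + min(p_l, g_l)
    inter = inter_h * inter_w
    area_pred = (p_t + p_b) * (p_r + p_l)
    area_gt = (g_t + g_b) * (g_r + g_l)
    union = area_pred + area_gt - inter
    return inter / union if union > 0 else 0.0


def pixelwise_iou_loss(samples, eps=1e-6):
    """Baseline: mean of -log(IoU) over all pixels, ignoring instances."""
    if not samples:
        return 0.0
    losses = [-math.log(pixel_iou(p, g) + eps) for _, p, g in samples]
    return sum(losses) / len(losses)


def instance_wise_iou_loss(samples, eps=1e-6):
    """Hypothetical instance-balanced variant.

    samples: list of (instance_id, pred_box, gt_box), one entry per pixel.
    The loss is averaged inside each instance, then across instances,
    so every instance contributes equally regardless of its pixel count.
    """
    if not samples:
        return 0.0
    per_instance = defaultdict(list)
    for inst_id, pred, gt in samples:
        per_instance[inst_id].append(-math.log(pixel_iou(pred, gt) + eps))
    means = [sum(v) / len(v) for v in per_instance.values()]
    return sum(means) / len(means)


# Toy data: a large, well-localized instance (9 pixels) and a small,
# poorly localized one (1 pixel with IoU 0.5).
gt_box = (1.0, 1.0, 1.0, 1.0)
samples = [(0, gt_box, gt_box)] * 9 + [(1, (1.0, 0.0, 1.0, 1.0), gt_box)]
balanced = instance_wise_iou_loss(samples)
unbalanced = pixelwise_iou_loss(samples)
```

In this toy case the small instance accounts for half of the balanced loss but only a tenth of the pixel-averaged one, which is the kind of scale imbalance the abstract says the loss is meant to address.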