Existing real-time text detectors reconstruct text contours directly from shrink-masks, which simplifies the framework and makes inference fast. However, this strong dependence on the predicted shrink-masks makes detection results unstable. Moreover, discriminating shrink-masks is a pixel-wise prediction task, and supervising the network with shrink-masks alone loses much of the semantic context, which leads to falsely detected shrink-masks. To address these problems, we construct an efficient text detection network, Adaptive Shrink-Mask for Text Detection (ASMTD), which improves accuracy through its training-time design and reduces the complexity of the inference process. First, the Adaptive Shrink-Mask (ASM) is proposed to represent texts by shrink-masks and independent adaptive offsets. It weakens the coupling between texts and shrink-masks, which improves the robustness of detection results. Second, the Super-pixel Window (SPW) is designed to supervise the network. It uses the surroundings of each pixel to improve the reliability of the predicted shrink-masks, and it is applied only during training and is absent at test time. Finally, a lightweight feature merging branch is constructed to reduce the computational cost. Experiments on multiple benchmarks show that our method outperforms existing state-of-the-art (SOTA) methods in both detection accuracy and speed.
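To make the idea of representing a text by a shrink-mask plus an independent offset concrete, the following is a minimal sketch and not the ASMTD implementation. The abstract gives no formulas, so the sketch assumes the offset rule d = A(1 - r^2)/L (A: polygon area, L: perimeter, r: shrink ratio) that earlier shrink-mask detectors commonly use. It also restricts itself to axis-aligned rectangles so that no polygon-clipping library is needed. All function names, the rectangle format, and the ratio 0.4 are hypothetical choices made for illustration.

```python
import math


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of the edge lengths of a closed polygon."""
    n = len(points)
    return sum(
        math.hypot(points[(i + 1) % n][0] - points[i][0],
                   points[(i + 1) % n][1] - points[i][1])
        for i in range(n)
    )


def shrink_offset(points, ratio=0.4):
    """Assumed offset rule d = A * (1 - r^2) / L (not taken from the abstract)."""
    return polygon_area(points) * (1.0 - ratio ** 2) / polygon_perimeter(points)


def offset_rectangle(rect, d):
    """Move every side of (x0, y0, x1, y1) outward by d; negative d shrinks."""
    x0, y0, x1, y1 = rect
    out = (x0 - d, y0 - d, x1 + d, y1 + d)
    if out[0] >= out[2] or out[1] >= out[3]:
        raise ValueError("offset collapses the rectangle")
    return out


def rect_to_points(rect):
    """Corner list of an axis-aligned rectangle, in order around the boundary."""
    x0, y0, x1, y1 = rect
    return [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]


if __name__ == "__main__":
    text_box = (0.0, 0.0, 100.0, 20.0)            # hypothetical text instance
    d = shrink_offset(rect_to_points(text_box))   # offset stored alongside the mask
    shrink_mask = offset_rectangle(text_box, -d)  # region the network would predict
    contour = offset_rectangle(shrink_mask, d)    # contour rebuilt from mask + offset
    print("offset:", d)
    print("shrink-mask:", shrink_mask)
    print("reconstructed contour:", contour)
```

In this sketch the offset is carried as a separate value and is not re-derived from the geometry of the shrink-mask, which is one plausible reading of "independent adaptive offsets". How ASMTD actually predicts and applies its offsets is not specified in the abstract.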