Because scene text detection has many practical applications, it is crucial to understand how robust text detection models are to a wide range of image corruptions. To explore this problem systematically, we propose two datasets for evaluating scene text detection models: ICDAR2015-C (IC15-C) and CTW1500-C (CTW-C). Our study extends the investigation of the performance and robustness of region proposal-based, regression-based and segmentation-based scene text detection frameworks. Furthermore, we perform a robustness analysis of six key components: pre-training data, backbone, feature fusion module, multi-scale predictions, representation of text instances and loss function. Finally, we present a simple yet effective data-based method that destroys the smoothness of text regions by merging background and foreground, which can significantly increase the robustness of different text detection networks. We hope that this study will provide valid data points as well as experience for future research. The benchmark, code and data will be made available at \url{https://github.com/wushilian/robust-scene-text-detection-benchmark}.
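The abstract does not say how background and foreground are merged. As a rough illustration only, one plausible reading is to alpha-blend content from elsewhere in the image into the annotated text region, so that the text area is no longer visually smooth. The function name `blend_background_into_text`, the `alpha` parameter, and the random-shift strategy below are all assumptions made for this sketch, not the authors' implementation.

```python
import numpy as np


def blend_background_into_text(image, text_mask, alpha=0.5, rng=None):
    """Hypothetical augmentation sketch (not the paper's exact method).

    Alpha-blends a randomly shifted copy of the image into the pixels
    marked by `text_mask`, leaving all other pixels unchanged.

    image:     H x W or H x W x C array
    text_mask: H x W boolean array, True inside text regions
    alpha:     blend weight of the shifted content, in [0, 1]
    """
    rng = np.random.default_rng() if rng is None else rng
    image = np.asarray(image, dtype=np.float32)
    mask = np.asarray(text_mask, dtype=bool)
    h, w = mask.shape

    # Assumption: a circular shift is a cheap way to bring content from
    # other locations under the text mask. The shifted content may itself
    # contain text; a stricter version would sample background-only patches.
    dy = int(rng.integers(0, h))
    dx = int(rng.integers(0, w))
    shifted = np.roll(image, shift=(dy, dx), axis=(0, 1))

    out = image.copy()
    out[mask] = (1.0 - alpha) * image[mask] + alpha * shifted[mask]
    return out
```

In a training pipeline, such a function would typically be applied with some probability per image, with `text_mask` rasterized from the ground-truth text polygons; both choices are left open here because the abstract does not specify them.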