Leveraging machine learning for less developed languages: Progress on Urdu text detection

Text detection in natural scene images has applications in autonomous driving and in navigation aids for elderly and blind people. However, research on Urdu text detection is usually hindered by a lack of data resources. We have developed a dataset of scene images containing Urdu text, and we present machine learning methods for detecting Urdu text in these images. We extract candidate text regions using a channel-enhanced Maximally Stable Extremal Regions (MSER) method. First, we separate text from noise based on the geometric properties of the regions. Next, we use a support vector machine (SVM) to discard non-text regions early. To remove the remaining non-text regions, we extract histogram of oriented gradients (HoG) features and train a second SVM classifier. This improves overall text region detection performance on the scene images. To support research on Urdu text, we aim to make the data freely available for research use. We also aim to highlight the challenges and the research gaps in Urdu text detection.
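The first filtering stage described above, separating text from noise by geometric properties, can be illustrated with a minimal sketch. The abstract does not say which properties or thresholds are used, so the choice of bounding-box area, aspect ratio, and extent (the fraction of the box filled by the region), the `Region` class, the `looks_like_text` function, and every threshold value are assumptions made purely for illustration. The MSER extraction and the two SVM stages are not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical candidate region: bounding-box size plus blob pixel count."""
    width: int
    height: int
    pixel_count: int

    @property
    def aspect_ratio(self) -> float:
        return self.width / self.height

    @property
    def extent(self) -> float:
        # Fraction of the bounding box that the blob actually fills.
        return self.pixel_count / (self.width * self.height)


def looks_like_text(region: Region,
                    min_area: int = 30,
                    max_aspect: float = 8.0,
                    min_extent: float = 0.2,
                    max_extent: float = 0.9) -> bool:
    """Cheap geometric test; all thresholds are illustrative assumptions."""
    if region.width <= 0 or region.height <= 0:
        return False
    # Very small boxes are treated as noise specks.
    if region.width * region.height < min_area:
        return False
    # Extremely elongated boxes are treated as lines or edges, not glyphs.
    if not (1.0 / max_aspect <= region.aspect_ratio <= max_aspect):
        return False
    # Nearly empty or completely filled boxes are unlikely to be strokes.
    return min_extent <= region.extent <= max_extent


candidates = [
    Region(width=40, height=20, pixel_count=360),   # plausible glyph cluster
    Region(width=3, height=3, pixel_count=9),       # speck of noise: too small
    Region(width=200, height=5, pixel_count=900),   # long thin line: extreme aspect ratio
    Region(width=50, height=50, pixel_count=2500),  # solid block: fills its whole box
]
kept = [r for r in candidates if looks_like_text(r)]
```

With these assumed thresholds only the first candidate survives. In a pipeline like the one described, the surviving regions would then be passed to the learned classifiers, which handle the cases that simple geometry cannot separate.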


