中文本识别基准:数据集、基线和实证研究 (Benchmarking Chinese Text Recognition: Datasets, Baselines, and an Empirical Study) - 专知论文

会员服务 ·

0

CTR · 数据集 · Performer · Extensibility · 基准 ·

2022 年 11 月 25 日

Benchmarking Chinese Text Recognition: Datasets, Baselines, and an Empirical Study

翻译：中文本识别基准:数据集、基线和实证研究

Haiyang Yu,Jingye Chen,Bin Li,Jianqi Ma,Mengnan Guan,Xixi Xu,Xiaocong Wang,Shaobo Qu,Xiangyang Xue

from arxiv, Code is available at https://github.com/FudanVI/benchmarking-chinese-text-recognition

The flourishing blossom of deep learning has witnessed the rapid development of text recognition in recent years. However, the existing text recognition methods are mainly proposed for English texts. As another widely-spoken language, Chinese text recognition (CTR) in all ways has extensive application markets. Based on our observations, we attribute the scarce attention on CTR to the lack of reasonable dataset construction standards, unified evaluation protocols, and results of the existing baselines. To fill this gap, we manually collect CTR datasets from publicly available competitions, projects, and papers. According to application scenarios, we divide the collected datasets into four categories including scene, web, document, and handwriting datasets. Besides, we standardize the evaluation protocols in CTR. With unified evaluation protocols, we evaluate a series of representative text recognition methods on the collected datasets to provide baselines. The experimental results indicate that the performance of baselines on CTR datasets is not as good as that on English datasets due to the characteristics of Chinese texts that are quite different from the Latin alphabet. Moreover, we observe that by introducing radical-level supervision as an auxiliary task, the performance of baselines can be further boosted. The code and datasets are made publicly available at https://github.com/FudanVI/benchmarking-chinese-text-recognition

翻译：近些年来,深层学习的蓬勃发展,见证了文本识别的迅速发展;然而,现有的文本识别方法主要针对英文文本提出。作为另一种广泛语言,中国文本识别(CTR)在所有方面都有广泛的应用市场。根据我们的观察,我们对CTR的很少关注归因于缺乏合理的数据集构建标准、统一的评估协议和现有基线的结果。为了填补这一空白,我们从公开提供的竞赛、项目和文件中手工收集CTR数据集。根据应用设想,我们将收集的数据集分为四类,包括场景、网络、文档和笔迹数据集。此外,我们还将CTR的评价协议标准化。通过统一的评估协议,我们评估所收集的数据集上的一系列具有代表性的文本识别方法,以提供基线。实验结果表明,CTR数据集基线的性能不如英文数据集的好,因为中文文本的特点与拉丁字母大不相同。此外,我们发现,通过引入激进级别的监督作为辅助任务,可将基准/文本数据集的性能进一步提升。

1

相关内容

CTR

ICLR 2021杰出论文奖出炉，8篇论文上榜！

专知会员服务

26+阅读 · 2021年4月2日

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

ABCG2胞间转移瞬时保护肿瘤细胞逃避化疗杀伤的作用机理研究

国家自然科学基金

0+阅读 · 2015年12月31日

GI介导干旱胁迫响应和干旱逃逸的分子机理

国家自然科学基金

0+阅读 · 2014年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

地基InSAR高边坡三维变形提取方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

Tip60在oxLDL诱导的血管平滑肌细胞自噬及增殖中的作用机制

国家自然科学基金

0+阅读 · 2013年12月31日

基于DR/ICT的BGA封装器件内部缺陷检测技术

国家自然科学基金

0+阅读 · 2012年12月31日

基于TDLAS技术的温度场和浓度场“CT”模型重建方法研究

国家自然科学基金

1+阅读 · 2012年12月31日

发光二极管LED非相干宽带腔增强吸收光谱技术对大气HONO的定量方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

过渡金属掺杂AlN单晶生长与物性研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

A Study of Slang Representation Methods

Arxiv

0+阅读 · 2023年1月27日

A Comparative Study of Pretrained Language Models for Long Clinical Text

Arxiv

0+阅读 · 2023年1月27日

LoRaLay: A Multilingual and Multimodal Dataset for Long Range and Layout-Aware Summarization

LoRaLay: A Multilingual and Multimodal Dataset for Long Range and Layout-Aware Summarization

Arxiv

1+阅读 · 2023年1月26日

MS-Shift: An Analysis of MS MARCO Distribution Shifts on Neural Retrieval

Arxiv

0+阅读 · 2023年1月25日

Cross-lingual Argument Mining in the Medical Domain

Arxiv

1+阅读 · 2023年1月25日

An Experimental Study on Pretraining Transformers from Scratch for IR

Arxiv

0+阅读 · 2023年1月25日

An Overview on Machine Translation Evaluation

An Overview on Machine Translation Evaluation

Arxiv

14+阅读 · 2022年2月22日

A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP

Arxiv

12+阅读 · 2021年8月30日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

Incorporating Dictionaries into Deep Neural Networks for the Chinese Clinical Named Entity Recognition

Arxiv

12+阅读 · 2018年4月13日

VIP会员

文章信息

相关主题

相关VIP内容

ICLR 2021杰出论文奖出炉，8篇论文上榜！

专知会员服务

26+阅读 · 2021年4月2日

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

热门VIP内容

开通专知VIP会员享更多权益服务

【CMU博士论文】数据驱动决策中的激励、信息与不确定性

DGP双粒度提示框架：图增强大模型助力欺诈检测

【ICCV2025】ESSENTIAL：用于视频类增量学习的情景记忆与语义记忆整合

唯快不破：大型语言模型高效架构综述

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

A Study of Slang Representation Methods

Arxiv

0+阅读 · 2023年1月27日

A Comparative Study of Pretrained Language Models for Long Clinical Text

Arxiv

0+阅读 · 2023年1月27日

LoRaLay: A Multilingual and Multimodal Dataset for Long Range and Layout-Aware Summarization

LoRaLay: A Multilingual and Multimodal Dataset for Long Range and Layout-Aware Summarization

Arxiv

1+阅读 · 2023年1月26日

MS-Shift: An Analysis of MS MARCO Distribution Shifts on Neural Retrieval

Arxiv

0+阅读 · 2023年1月25日

Cross-lingual Argument Mining in the Medical Domain

Arxiv

1+阅读 · 2023年1月25日

An Experimental Study on Pretraining Transformers from Scratch for IR

Arxiv

0+阅读 · 2023年1月25日

An Overview on Machine Translation Evaluation

An Overview on Machine Translation Evaluation

Arxiv

14+阅读 · 2022年2月22日

A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP

Arxiv

12+阅读 · 2021年8月30日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

Incorporating Dictionaries into Deep Neural Networks for the Chinese Clinical Named Entity Recognition

Arxiv

12+阅读 · 2018年4月13日

相关基金

ABCG2胞间转移瞬时保护肿瘤细胞逃避化疗杀伤的作用机理研究

国家自然科学基金

0+阅读 · 2015年12月31日

GI介导干旱胁迫响应和干旱逃逸的分子机理

国家自然科学基金

0+阅读 · 2014年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

地基InSAR高边坡三维变形提取方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

Tip60在oxLDL诱导的血管平滑肌细胞自噬及增殖中的作用机制

国家自然科学基金

0+阅读 · 2013年12月31日

基于DR/ICT的BGA封装器件内部缺陷检测技术

国家自然科学基金

0+阅读 · 2012年12月31日

基于TDLAS技术的温度场和浓度场“CT”模型重建方法研究

国家自然科学基金

1+阅读 · 2012年12月31日

发光二极管LED非相干宽带腔增强吸收光谱技术对大气HONO的定量方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

过渡金属掺杂AlN单晶生长与物性研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员