最佳框框:通过强化学习调整附加说明的圆环框,促进端到端的场景文字识别 (Optimal Boxes: Boosting End-to-End Scene Text Recognition by Adjusting Annotated Bounding Boxes via Reinforcement Learning) - 专知论文

会员服务 ·

0

边界框 · F分数 · OCR · 端到端 · 优化器 ·

2022 年 7 月 26 日

Optimal Boxes: Boosting End-to-End Scene Text Recognition by Adjusting Annotated Bounding Boxes via Reinforcement Learning

翻译：最佳框框:通过强化学习调整附加说明的圆环框,促进端到端的场景文字识别

Jingqun Tang,Wenming Qian,Luchuan Song,Xiena Dong,Lan Li,Xiang Bai

from arxiv, Accepted by ECCV 2022

Text detection and recognition are essential components of a modern OCR system. Most OCR approaches attempt to obtain accurate bounding boxes of text at the detection stage, which is used as the input of the text recognition stage. We observe that when using tight text bounding boxes as input, a text recognizer frequently fails to achieve optimal performance due to the inconsistency between bounding boxes and deep representations of text recognition. In this paper, we propose Box Adjuster, a reinforcement learning-based method for adjusting the shape of each text bounding box to make it more compatible with text recognition models. Additionally, when dealing with cross-domain problems such as synthetic-to-real, the proposed method significantly reduces mismatches in domain distribution between the source and target domains. Experiments demonstrate that the performance of end-to-end text recognition systems can be improved when using the adjusted bounding boxes as the ground truths for training. Specifically, on several benchmark datasets for scene text understanding, the proposed method outperforms state-of-the-art text spotters by an average of 2.0% F-Score on end-to-end text recognition tasks and 4.6% F-Score on domain adaptation tasks.

翻译：文本检测和识别是现代 OCR 系统的基本组成部分。多数 OCR 方法试图在检测阶段获得精确的文本捆绑框,用作文本识别阶段的输入。我们观察到,当使用紧的文本捆绑框作为输入时,一个文本识别器往往无法取得最佳性能,因为捆绑框与对文本识别的深度表达不一致。在本文中,我们提议了一个基于强化学习的调整每个文本捆绑盒形状的方法,以使之与文本识别模型更加兼容。此外,在处理合成到现实等跨领域问题时,拟议方法极大地减少了源与目标领域之间域分布不匹配的情况。实验表明,当使用调整后的绑框作为地面真实性培训时,端到端文本识别系统的性能是可以改进的。具体地说,关于现场文本理解的若干基准数据集,拟议方法以终端到端文本识别任务的2.0% F-Sricol 和域适应任务的4.6% F-Sric 的F-Sric 显示器,以平均2.0% F-Sric 的F-S-Sric recol 任务为模式, 。

0

相关内容

边界框

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

【ICIG2021】Latest News & Announcements of the Industry Talk2

【ICIG2021】Latest News & Announcements of the Industry Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年7月29日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

双光子荧光成像脑组织中半胱天蛋白酶活性

国家自然科学基金

0+阅读 · 2015年12月31日

牛磺酸抑制AS肉鸡右心肥大过程中calpains介导细胞凋亡作用的研究

国家自然科学基金

0+阅读 · 2015年12月31日

P-糖蛋白介导的大黄素对有机磷农药在草鱼体内相互作用研究

国家自然科学基金

0+阅读 · 2013年12月31日

蛋白激酶LIMK1活性在小鼠卵母细胞染色体分离过程中的作用和分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

基于肺内AM、IM迁移的归肺经中药升降浮沉药性辨识方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

泛素蛋白酶体通路基因SNPs与晚期食管鳞癌紫杉醇敏感性

国家自然科学基金

0+阅读 · 2012年12月31日

LIMK1：罗格列酮抑制人胃癌细胞增殖、迁移及侵袭的作用靶点

国家自然科学基金

0+阅读 · 2012年12月31日

5，7-二羟基黄酮类化合物抑制柑橘青霉病菌作用靶点及分子机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于Decorin基因甲基化调控的非小细胞肺癌转移的分子机制

国家自然科学基金

0+阅读 · 2011年12月31日

离子液体作介质色谱法测定水中痕量有机胺和有机酚的研究

国家自然科学基金

0+阅读 · 2009年12月31日

MSVIPER: Improved Policy Distillation for Reinforcement-Learning-Based Robot Navigation

MSVIPER: Improved Policy Distillation for Reinforcement-Learning-Based Robot Navigation

Arxiv

0+阅读 · 2022年9月19日

A Transferable and Automatic Tuning of Deep Reinforcement Learning for Cost Effective Phishing Detection

A Transferable and Automatic Tuning of Deep Reinforcement Learning for Cost Effective Phishing Detection

Arxiv

0+阅读 · 2022年9月19日

Self-Supervised Texture Image Anomaly Detection By Fusing Normalizing Flow and Dictionary Learning

Arxiv

0+阅读 · 2022年9月19日

ActiveNeRF: Learning where to See with Uncertainty Estimation

Arxiv

0+阅读 · 2022年9月18日

Detection of Interacting Variables for Generalized Linear Models via Neural Networks

Detection of Interacting Variables for Generalized Linear Models via Neural Networks

Arxiv

0+阅读 · 2022年9月16日

Multi-Task Learning for Visual Scene Understanding

Arxiv

29+阅读 · 2022年3月28日

Contrastive Spatio-Temporal Pretext Learning for Self-supervised Video Representation

Arxiv

11+阅读 · 2021年12月16日

Image-to-Image Retrieval by Learning Similarity between Scene Graphs

Arxiv

21+阅读 · 2020年12月29日

Scene Text Detection and Recognition: The Deep Learning Era

Scene Text Detection and Recognition: The Deep Learning Era

Arxiv

27+阅读 · 2019年9月5日

Rotation-Sensitive Regression for Oriented Scene Text Detection

Arxiv

13+阅读 · 2018年3月14日

VIP会员

文章信息

相关主题

相关VIP内容

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《关于俄乌战争的系列文章》2025最新70页

《军事行动中的人机AI编队本体模型》

更智能的人工智能实现更快速的电磁辐射控制（EMCON）

《俄罗斯常规军队能力现状及重建》2025最新124页

相关资讯

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

【ICIG2021】Latest News & Announcements of the Industry Talk2

【ICIG2021】Latest News & Announcements of the Industry Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年7月29日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

相关论文

MSVIPER: Improved Policy Distillation for Reinforcement-Learning-Based Robot Navigation

MSVIPER: Improved Policy Distillation for Reinforcement-Learning-Based Robot Navigation

Arxiv

0+阅读 · 2022年9月19日

A Transferable and Automatic Tuning of Deep Reinforcement Learning for Cost Effective Phishing Detection

A Transferable and Automatic Tuning of Deep Reinforcement Learning for Cost Effective Phishing Detection

Arxiv

0+阅读 · 2022年9月19日

Self-Supervised Texture Image Anomaly Detection By Fusing Normalizing Flow and Dictionary Learning

Arxiv

0+阅读 · 2022年9月19日

ActiveNeRF: Learning where to See with Uncertainty Estimation

Arxiv

0+阅读 · 2022年9月18日

Detection of Interacting Variables for Generalized Linear Models via Neural Networks

Detection of Interacting Variables for Generalized Linear Models via Neural Networks

Arxiv

0+阅读 · 2022年9月16日

Multi-Task Learning for Visual Scene Understanding

Arxiv

29+阅读 · 2022年3月28日

Contrastive Spatio-Temporal Pretext Learning for Self-supervised Video Representation

Arxiv

11+阅读 · 2021年12月16日

Image-to-Image Retrieval by Learning Similarity between Scene Graphs

Arxiv

21+阅读 · 2020年12月29日

Scene Text Detection and Recognition: The Deep Learning Era

Scene Text Detection and Recognition: The Deep Learning Era

Arxiv

27+阅读 · 2019年9月5日

Rotation-Sensitive Regression for Oriented Scene Text Detection

Arxiv

13+阅读 · 2018年3月14日

相关基金

双光子荧光成像脑组织中半胱天蛋白酶活性

国家自然科学基金

0+阅读 · 2015年12月31日

牛磺酸抑制AS肉鸡右心肥大过程中calpains介导细胞凋亡作用的研究

国家自然科学基金

0+阅读 · 2015年12月31日

P-糖蛋白介导的大黄素对有机磷农药在草鱼体内相互作用研究

国家自然科学基金

0+阅读 · 2013年12月31日

蛋白激酶LIMK1活性在小鼠卵母细胞染色体分离过程中的作用和分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

基于肺内AM、IM迁移的归肺经中药升降浮沉药性辨识方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

泛素蛋白酶体通路基因SNPs与晚期食管鳞癌紫杉醇敏感性

国家自然科学基金

0+阅读 · 2012年12月31日

LIMK1：罗格列酮抑制人胃癌细胞增殖、迁移及侵袭的作用靶点

国家自然科学基金

0+阅读 · 2012年12月31日

5，7-二羟基黄酮类化合物抑制柑橘青霉病菌作用靶点及分子机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于Decorin基因甲基化调控的非小细胞肺癌转移的分子机制

国家自然科学基金

0+阅读 · 2011年12月31日

离子液体作介质色谱法测定水中痕量有机胺和有机酚的研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员