Self-supervised Implicit Glyph Attention for Text Recognition - 专知 (Zhuanzhi) Papers

16 August 2022

Self-supervised Implicit Glyph Attention for Text Recognition

Tongkun Guan, Chaochen Gu, Jingzheng Tu, Xue Yang, Qi Feng, Yudi Zhao, Wei Shen

The attention mechanism has become the de facto module in scene text recognition (STR) methods because it can extract character-level representations. These methods fall into two groups, implicit attention based and supervised attention based, depending on how the attention is computed: implicit attention is learned from sequence-level text annotations, whereas supervised attention is learned from character-level bounding box annotations. Because implicit attention may extract coarse or even incorrect spatial regions as character attention, it is prone to alignment drift. Supervised attention can alleviate this issue, but it is category-specific: it requires laborious extra character-level bounding box annotations and becomes memory-intensive when the number of character categories is large. To address these issues, we propose a novel attention mechanism for STR, self-supervised implicit glyph attention (SIGA). SIGA delineates the glyph structures of text images through joint self-supervised text segmentation and implicit attention alignment, which serve as supervision to improve attention correctness without extra character-level annotations. Experimental results demonstrate that SIGA performs consistently and significantly better than previous attention-based STR methods in terms of both attention correctness and final recognition performance, on publicly available context benchmarks and on our contributed contextless benchmarks.
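As a rough illustration of what "implicit attention" means in the abstract, the sketch below computes one attention map over the spatial positions of a feature map and uses it to pool a character-level feature (a "glimpse"). This is a generic attention step, not SIGA itself, whose details the abstract does not give. The function names `softmax` and `attend`, the dot-product scoring, and the toy data are all assumptions made for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(features, query):
    """Return (attention weights, glimpse) for one decoding step.

    features: list of N spatial feature vectors, each a list of C floats
    query:    decoder state for the current character, a list of C floats
    Scoring is a plain dot product (an assumption for this sketch).
    """
    scores = [sum(f_c * q_c for f_c, q_c in zip(f, query)) for f in features]
    weights = softmax(scores)
    channels = len(features[0])
    # The glimpse is the attention-weighted sum of the spatial features.
    glimpse = [sum(w * f[c] for w, f in zip(weights, features))
               for c in range(channels)]
    return weights, glimpse

# Toy input: 4 spatial positions with 2 channels each (hypothetical values).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
query = [2.0, 0.0]
weights, glimpse = attend(features, query)
```

Nothing in this step constrains where the weights land, since they are trained only through the sequence-level recognition loss. That is why, as the abstract notes, the attended region can be coarse or drift away from the actual character.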
