DAN: a Segmentation-free Document Attention Network for Handwritten Document Recognition - Zhuanzhi Papers


Document Recognition · MoDELS · Networking · Attention Mechanism · Annotation

March 23, 2022

DAN: a Segmentation-free Document Attention Network for Handwritten Document Recognition


Denis Coquenet, Clément Chatelain, Thierry Paquet

Unconstrained handwritten document recognition is a challenging computer vision task. It is traditionally handled by a two-step approach combining line segmentation followed by text line recognition. For the first time, we propose an end-to-end segmentation-free architecture for the task of handwritten document recognition: the Document Attention Network. In addition to text recognition, the model is trained to label text parts using begin and end tags in an XML-like fashion. This model is made up of an FCN encoder for feature extraction and a stack of transformer decoder layers for a recurrent token-by-token prediction process. It takes whole text documents as input and sequentially outputs characters, as well as logical layout tokens. Contrary to the existing segmentation-based approaches, the model is trained without using any segmentation label. We achieve competitive results on the READ dataset at page level, as well as double-page level, with a character error rate (CER) of 3.53% and 3.69%, respectively. We also provide results for the RIMES dataset at page level, reaching a CER of 4.54%. We provide all source code and pre-trained model weights at https://github.com/FactoDeepLearning/DAN.
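The results above are reported as character error rates. As a rough illustration of that metric, the sketch below computes CER as the total character-level edit distance divided by the total number of reference characters. The function names `levenshtein` and `cer` are hypothetical, and the normalization over the whole set of pages is an assumption; the authors' own evaluation code may treat layout tokens or whitespace differently.

```python
from typing import Sequence


def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions and
    substitutions needed to turn `hyp` into `ref`."""
    # prev[j] holds the distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Character error rate over a set of (reference, hypothesis) pairs.

    Assumption: edits are summed over all pairs and divided by the
    total reference length, rather than averaging per-page rates.
    """
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(levenshtein(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars
```

For example, `cer(["hello", "world"], ["hello", "word"])` counts one missing character against ten reference characters.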

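The abstract describes an output sequence that mixes characters with begin and end layout tags in an XML-like fashion. The sketch below shows one way such a flat token sequence could be folded back into a nested structure using a stack. The tag names (`<page>`, `<body>`) and the function `parse_layout` are hypothetical and chosen only for illustration; they are not taken from the paper or its repository.

```python
def parse_layout(tokens):
    """Group a flat sequence of character and layout tokens into a tree.

    Assumption: begin tags look like "<name>" and end tags like "</name>";
    every other token is treated as text.
    """
    root = {"tag": "document", "children": []}
    stack = [root]
    for tok in tokens:
        is_end = tok.startswith("</") and tok.endswith(">") and len(tok) > 3
        is_begin = (not tok.startswith("</") and tok.startswith("<")
                    and tok.endswith(">") and len(tok) > 2)
        if is_end:
            name = tok[2:-1]
            # An end tag must close the most recently opened element.
            if len(stack) == 1 or stack[-1]["tag"] != name:
                raise ValueError(f"unexpected end tag {tok!r}")
            stack.pop()
        elif is_begin:
            node = {"tag": tok[1:-1], "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        else:
            # Merge consecutive characters into a single text run.
            children = stack[-1]["children"]
            if children and isinstance(children[-1], str):
                children[-1] += tok
            else:
                children.append(tok)
    if len(stack) != 1:
        raise ValueError(f"unclosed tag <{stack[-1]['tag']}>")
    return root


tree = parse_layout(["<page>", "<body>", "H", "i", "</body>", "</page>"])
print(tree)
```

Raising an error on unbalanced tags is a design choice for this sketch; a recognizer's raw predictions may well be unbalanced, in which case a more lenient repair strategy would be needed.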


