DAN: a Segmentation-free Document Attention Network for Handwritten Document Recognition - 专知 Papers


Document recognition · MoDELS · Networking · Attention mechanism · Annotation

April 7, 2022

DAN: a Segmentation-free Document Attention Network for Handwritten Document Recognition


Denis Coquenet, Clément Chatelain, Thierry Paquet

Unconstrained handwritten text recognition is a challenging computer vision task. It is traditionally handled by a two-step approach combining line segmentation followed by text line recognition. For the first time, we propose an end-to-end segmentation-free architecture for the task of handwritten document recognition: the Document Attention Network. In addition to text recognition, the model is trained to label text parts using begin and end tags in an XML-like fashion. This model is made up of an FCN encoder for feature extraction and a stack of transformer decoder layers for a recurrent token-by-token prediction process. It takes whole text documents as input and sequentially outputs characters, as well as logical layout tokens. Contrary to the existing segmentation-based approaches, the model is trained without using any segmentation label. We achieve competitive results on the READ 2016 dataset at page level, as well as double-page level, with a CER of 3.53% and 3.69%, respectively. We also provide results for the RIMES 2009 dataset at page level, reaching a CER of 4.54%. We provide all source code and pre-trained model weights at https://github.com/FactoDeepLearning/DAN.
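The abstract reports results as a character error rate (CER). As a minimal sketch, assuming the common definition (total character-level edit distance divided by the total number of reference characters), the metric can be computed as below. The abstract does not state the exact normalization or whether layout tokens are counted, so the function names `levenshtein` and `cer` and the corpus-level aggregation are illustrative assumptions, not the authors' evaluation code.

```python
from typing import Sequence


def levenshtein(ref: str, hyp: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Assumed corpus-level CER: total edits / total reference characters."""
    total_edits = sum(levenshtein(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    return total_edits / total_chars


if __name__ == "__main__":
    refs = ["hello", "world"]
    hyps = ["hello", "word"]
    print(f"CER = {cer(refs, hyps):.2%}")
```

Under this definition, a CER of 3.53% would correspond to roughly 35 character edits per 1,000 reference characters.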


