终端到终端文档识别和对下沉降的理解 (End-to-end Document Recognition and Understanding with Dessurt) - 专知论文

会员服务 ·

0

文档识别 · 可理解性 · 端到端 · MoDELS · Performer ·

2022 年 6 月 3 日

End-to-end Document Recognition and Understanding with Dessurt

翻译：终端到终端文档识别和对下沉降的理解

Brian Davis,Bryan Morse,Bryan Price,Chris Tensmeyer,Curtis Wigington,Vlad Morariu

We introduce Dessurt, a relatively simple document understanding transformer capable of being fine-tuned on a greater variety of document tasks than prior methods. It receives a document image and task string as input and generates arbitrary text autoregressively as output. Because Dessurt is an end-to-end architecture that performs text recognition in addition to the document understanding, it does not require an external recognition model as prior methods do. Dessurt is a more flexible model than prior methods and is able to handle a variety of document domains and tasks. We show that this model is effective at 9 different dataset-task combinations.

翻译：我们引入了一个相对简单的文件理解变压器Dessurt, 这个变压器能够比以前的方法更精确地调整更多的文档任务。它作为输入接收一个文档图像和任务字符串, 并生成任意的文字自动递增为输出。因为 Dessurt 是一个端对端结构, 除了文件理解外, 还可以进行文字识别, 不需要像以前的方法那样使用外部识别模式。 Dessurt 比以前的方法更灵活, 并且能够处理各种文档域和任务。我们显示, 这个模式在9个不同的数据设置- 任务组合中有效。

0

相关内容

文档识别

文档识别主要应用于学习工作等一些关于文档处理的办公领域，可以快速高效利用OCR技术对文案文档、证书、票据、病历、说明书、简历、合同等各类纸质文档进行识别，另外可以通过云端技术将识别后的内容以及图像上传到服务器进行备份储存，并具备方便的检索功能，可以使用户简单方便的找到备份的内容。

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

非均质量子器件Schr？dinger-Poisson系统多尺度分析与算法研究

国家自然科学基金

0+阅读 · 2014年12月31日

重金属污染土壤原位钝化修复的稳定性及其影响机理

国家自然科学基金

0+阅读 · 2014年12月31日

mTOR功能性单倍体通过ERS-IRE1/α-JNK通路调控乳腺癌细胞药物敏感性的机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

Doublesex基因在对虾性别决定和分化中的功能研究

国家自然科学基金

0+阅读 · 2013年12月31日

杂多化合物-离子液体双催化剂的柴油深度氧化脱硫协同作用的研究

国家自然科学基金

0+阅读 · 2012年12月31日

心肾综合征中p38MAPK通路相关机制和干预治疗的多模态MR研究

国家自然科学基金

0+阅读 · 2012年12月31日

PPARγ调控PI3K/Akt在胰岛素抵抗中的作用

国家自然科学基金

0+阅读 · 2011年12月31日

动脉粥样硬化中PPARγ19978;调c-Ski的机制及作用研究

国家自然科学基金

0+阅读 · 2010年12月31日

以表面活性离子液体为模版制备介孔二氧化硅材料

国家自然科学基金

0+阅读 · 2009年12月31日

离子液体室温电解制备金属钛的基础研究

国家自然科学基金

0+阅读 · 2008年12月31日

ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding

Arxiv

0+阅读 · 2022年7月19日

LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking

Arxiv

0+阅读 · 2022年7月19日

Recur, Attend or Convolve? On Whether Temporal Modeling Matters for Cross-Domain Robustness in Action Recognition

Arxiv

0+阅读 · 2022年7月15日

Sequence-aware multimodal page classification of Brazilian legal documents

Arxiv

0+阅读 · 2022年7月15日

Emotion Recognition in Conversation using Probabilistic Soft Logic

Arxiv

0+阅读 · 2022年7月14日

A Survey on Deep Learning for Named Entity Recognition

A Survey on Deep Learning for Named Entity Recognition

Arxiv

26+阅读 · 2020年3月13日

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

Arxiv

12+阅读 · 2020年2月19日

Hierarchical Graph Pooling with Structure Learning

Arxiv

13+阅读 · 2019年11月14日

Label-aware Double Transfer Learning for Cross-Specialty Medical Named Entity Recognition

Arxiv

10+阅读 · 2018年4月28日

Zero-shot Recognition via Semantic Embeddings and Knowledge Graphs

Arxiv

18+阅读 · 2018年4月8日

VIP会员

文章信息

相关主题

相关VIP内容

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【NeurIPS2025】VideoLucy：用于长视频理解的深度记忆回溯机制

不确定环境下无人机与无人地面车辆编队的地下勘探规划算法 | 122页

【NTU博士论文】端到端鲁棒自动语音识别的最新进展

用于强化学习的扩散模型：基础、分类与发展

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding

Arxiv

0+阅读 · 2022年7月19日

LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking

Arxiv

0+阅读 · 2022年7月19日

Recur, Attend or Convolve? On Whether Temporal Modeling Matters for Cross-Domain Robustness in Action Recognition

Arxiv

0+阅读 · 2022年7月15日

Sequence-aware multimodal page classification of Brazilian legal documents

Arxiv

0+阅读 · 2022年7月15日

Emotion Recognition in Conversation using Probabilistic Soft Logic

Arxiv

0+阅读 · 2022年7月14日

A Survey on Deep Learning for Named Entity Recognition

A Survey on Deep Learning for Named Entity Recognition

Arxiv

26+阅读 · 2020年3月13日

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

Arxiv

12+阅读 · 2020年2月19日

Hierarchical Graph Pooling with Structure Learning

Arxiv

13+阅读 · 2019年11月14日

Label-aware Double Transfer Learning for Cross-Specialty Medical Named Entity Recognition

Arxiv

10+阅读 · 2018年4月28日

Zero-shot Recognition via Semantic Embeddings and Knowledge Graphs

Arxiv

18+阅读 · 2018年4月8日

相关基金

非均质量子器件Schr？dinger-Poisson系统多尺度分析与算法研究

国家自然科学基金

0+阅读 · 2014年12月31日

重金属污染土壤原位钝化修复的稳定性及其影响机理

国家自然科学基金

0+阅读 · 2014年12月31日

mTOR功能性单倍体通过ERS-IRE1/α-JNK通路调控乳腺癌细胞药物敏感性的机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

Doublesex基因在对虾性别决定和分化中的功能研究

国家自然科学基金

0+阅读 · 2013年12月31日

杂多化合物-离子液体双催化剂的柴油深度氧化脱硫协同作用的研究

国家自然科学基金

0+阅读 · 2012年12月31日

心肾综合征中p38MAPK通路相关机制和干预治疗的多模态MR研究

国家自然科学基金

0+阅读 · 2012年12月31日

PPARγ调控PI3K/Akt在胰岛素抵抗中的作用

国家自然科学基金

0+阅读 · 2011年12月31日

动脉粥样硬化中PPARγ19978;调c-Ski的机制及作用研究

国家自然科学基金

0+阅读 · 2010年12月31日

以表面活性离子液体为模版制备介孔二氧化硅材料

国家自然科学基金

0+阅读 · 2009年12月31日

离子液体室温电解制备金属钛的基础研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员