了解何处和知道什么:统一字块文件理解前培训 (Knowing Where and What: Unified Word Block Pretraining for Document Understanding) - 专知论文

会员服务 ·

0

Learning · 块 · 可理解性 · Extensibility · 可辨认的 ·

2022 年 7 月 29 日

Knowing Where and What: Unified Word Block Pretraining for Document Understanding

翻译：了解何处和知道什么:统一字块文件理解前培训

Song Tao,Zijian Wang,Tiantian Fan,Canjie Luo,Can Huang

from arxiv, incomplete experiments

Due to the complex layouts of documents, it is challenging to extract information for documents. Most previous studies develop multimodal pre-trained models in a self-supervised way. In this paper, we focus on the embedding learning of word blocks containing text and layout information, and propose UTel, a language model with Unified TExt and Layout pre-training. Specifically, we propose two pre-training tasks: Surrounding Word Prediction (SWP) for the layout learning, and Contrastive learning of Word Embeddings (CWE) for identifying different word blocks. Moreover, we replace the commonly used 1D position embedding with a 1D clipped relative position embedding. In this way, the joint training of Masked Layout-Language Modeling (MLLM) and two newly proposed tasks enables the interaction between semantic and spatial features in a unified way. Additionally, the proposed UTel can process arbitrary-length sequences by removing the 1D position embedding, while maintaining competitive performance. Extensive experimental results show UTel learns better joint representations and achieves superior performance than previous methods on various downstream tasks, though requiring no image modality. Code is available at \url{https://github.com/taosong2019/UTel}.

翻译：由于文件的复杂布局,为文件提取信息具有挑战性。大多数以往的研究都以自我监督的方式开发多式预培训模型。在本文中,我们侧重于嵌入含有文本和布局信息的字块学习,并提议使用UTel(一种语言模型,配有统一 TExt 和布局预培训)。具体地说,我们提议了两项培训前任务:用于布局学习的环绕字形预测(SWP)和用于查找不同字形块的单词嵌入(CWE)对比学习。此外,我们用1D剪接相对位置嵌入的通用1D位置替换了通常使用的 1D 位置。通过这种方式,对MDLLLM(MLM) 和两个新提议的任务进行联合培训,使得语义和空间特征之间能够以统一的方式互动。此外,拟议的UTel(S)可以通过删除1D 位置嵌入,同时保持竞争性的性能。广度实验结果显示UTel在各种下游任务上比以往的方法学习更好的联合表现并取得优异性业绩,但不需要图像模式。 aml/com 。

0

相关内容

Learning

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

对比学习简述

专知会员服务

90+阅读 · 2021年6月29日

【论文翻译】NLP注意力机制综述论文翻译，Attention, please! A Critical Review of Neural Attention Models in Natural Language Processing

【论文翻译】NLP注意力机制综述论文翻译，Attention, please! A Critical Review of Neural Attention Models in Natural Language Processing

专知会员服务

96+阅读 · 2020年4月18日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

2019年自然语言处理NLP亮点总结，29页pdf，NLP Year in Review — 2019 NLP highlights for the year 2019.

2019年自然语言处理NLP亮点总结，29页pdf，NLP Year in Review — 2019 NLP highlights for the year 2019.

专知会员服务

69+阅读 · 2020年1月2日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

中国图象图形学学会CSIG

0+阅读 · 2021年11月10日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

NeuroD2调控人恶性神经胶质瘤细胞向神经元分化的作用研究

国家自然科学基金

0+阅读 · 2015年12月31日

N-乙酰葡萄糖胺增强TRAIL诱导的非小细胞肺癌凋亡的机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

Kronheimer-Nakajima quiver 模空间与有理曲面

国家自然科学基金

1+阅读 · 2013年12月31日

Mumford-Shah型图像分割问题研究

国家自然科学基金

0+阅读 · 2013年12月31日

地下水中痕量卤素形态分析研究

国家自然科学基金

0+阅读 · 2012年12月31日

Persephin在急性肾损伤中的作用研究

国家自然科学基金

0+阅读 · 2012年12月31日

TRAIL协同IER3调节NF-κB信号通路介导肝癌细胞凋亡的相关机制研究

国家自然科学基金

1+阅读 · 2012年12月31日

原位组装碳纳米管/石墨烯有序结构及其二元协同增强增韧纳米复合材料的机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

不同抗盐机理红树植物叶片富集和释放汞的机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

转录因子CUTL1在恶性黑素瘤发生发展中的作用及机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

Free Lunch for Surgical Video Understanding by Distilling Self-Supervisions

Arxiv

0+阅读 · 2022年9月26日

VAESim: A probabilistic approach for self-supervised prototype discovery

Arxiv

0+阅读 · 2022年9月25日

Self-Supervised Masked Convolutional Transformer Block for Anomaly Detection

Arxiv

1+阅读 · 2022年9月25日

Unifying Vision-and-Language Tasks via Text Generation

Arxiv

10+阅读 · 2021年2月4日

Overcoming Catastrophic Forgetting in Graph Neural Networks

Arxiv

14+阅读 · 2020年12月10日

PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization

Arxiv

17+阅读 · 2020年6月2日

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

Arxiv

12+阅读 · 2020年2月19日

X-BERT: eXtreme Multi-label Text Classification with BERT

X-BERT: eXtreme Multi-label Text Classification with BERT

Arxiv

12+阅读 · 2019年7月4日

XLNet: Generalized Autoregressive Pretraining for Language Understanding

Arxiv

14+阅读 · 2019年6月19日

Pre-Training with Whole Word Masking for Chinese BERT

Arxiv

11+阅读 · 2019年6月19日

VIP会员

文章信息

相关主题

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

对比学习简述

专知会员服务

90+阅读 · 2021年6月29日

【论文翻译】NLP注意力机制综述论文翻译，Attention, please! A Critical Review of Neural Attention Models in Natural Language Processing

【论文翻译】NLP注意力机制综述论文翻译，Attention, please! A Critical Review of Neural Attention Models in Natural Language Processing

专知会员服务

96+阅读 · 2020年4月18日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

2019年自然语言处理NLP亮点总结，29页pdf，NLP Year in Review — 2019 NLP highlights for the year 2019.

2019年自然语言处理NLP亮点总结，29页pdf，NLP Year in Review — 2019 NLP highlights for the year 2019.

专知会员服务

69+阅读 · 2020年1月2日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

热门VIP内容

开通专知VIP会员享更多权益服务

《2024年度美国防部作战测试与评估报告》500页

《面相未来作战空中系统中有人-无人编组的AI驱动协作模式选择》含slides

无人机编队飞行：复杂环境中作战的策略、挑战与应用

《探索军事背景下共享大语言模型：AI助手与智能体部署中可扩展性与效率的早期洞察》（含44页slides）

相关资讯

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

中国图象图形学学会CSIG

0+阅读 · 2021年11月10日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

相关论文

Free Lunch for Surgical Video Understanding by Distilling Self-Supervisions

Arxiv

0+阅读 · 2022年9月26日

VAESim: A probabilistic approach for self-supervised prototype discovery

Arxiv

0+阅读 · 2022年9月25日

Self-Supervised Masked Convolutional Transformer Block for Anomaly Detection

Arxiv

1+阅读 · 2022年9月25日

Unifying Vision-and-Language Tasks via Text Generation

Arxiv

10+阅读 · 2021年2月4日

Overcoming Catastrophic Forgetting in Graph Neural Networks

Arxiv

14+阅读 · 2020年12月10日

PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization

Arxiv

17+阅读 · 2020年6月2日

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

Arxiv

12+阅读 · 2020年2月19日

X-BERT: eXtreme Multi-label Text Classification with BERT

X-BERT: eXtreme Multi-label Text Classification with BERT

Arxiv

12+阅读 · 2019年7月4日

XLNet: Generalized Autoregressive Pretraining for Language Understanding

Arxiv

14+阅读 · 2019年6月19日

Pre-Training with Whole Word Masking for Chinese BERT

Arxiv

11+阅读 · 2019年6月19日

相关基金

NeuroD2调控人恶性神经胶质瘤细胞向神经元分化的作用研究

国家自然科学基金

0+阅读 · 2015年12月31日

N-乙酰葡萄糖胺增强TRAIL诱导的非小细胞肺癌凋亡的机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

Kronheimer-Nakajima quiver 模空间与有理曲面

国家自然科学基金

1+阅读 · 2013年12月31日

Mumford-Shah型图像分割问题研究

国家自然科学基金

0+阅读 · 2013年12月31日

地下水中痕量卤素形态分析研究

国家自然科学基金

0+阅读 · 2012年12月31日

Persephin在急性肾损伤中的作用研究

国家自然科学基金

0+阅读 · 2012年12月31日

TRAIL协同IER3调节NF-κB信号通路介导肝癌细胞凋亡的相关机制研究

国家自然科学基金

1+阅读 · 2012年12月31日

原位组装碳纳米管/石墨烯有序结构及其二元协同增强增韧纳米复合材料的机理研究

国家自然科学基金

0+阅读 · 2012年12月31日

不同抗盐机理红树植物叶片富集和释放汞的机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

转录因子CUTL1在恶性黑素瘤发生发展中的作用及机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员