ERNIE-VIL:通过景点图增加知识的视觉语言代表 (ERNIE-ViL: Knowledge Enhanced Vision-Language Representations Through Scene Graph) - 专知论文

会员服务 ·

0

图 · Vision · Performer · state-of-the-art · 学成 ·

2021 年 3 月 19 日

ERNIE-ViL: Knowledge Enhanced Vision-Language Representations Through Scene Graph

翻译：ERNIE-VIL:通过景点图增加知识的视觉语言代表

Fei Yu,Jiji Tang,Weichong Yin,Yu Sun,Hao Tian,Hua Wu,Haifeng Wang

from arxiv, Paper has been published in the AAAI2021 conference

We propose a knowledge-enhanced approach, ERNIE-ViL, which incorporates structured knowledge obtained from scene graphs to learn joint representations of vision-language. ERNIE-ViL tries to build the detailed semantic connections (objects, attributes of objects and relationships between objects) across vision and language, which are essential to vision-language cross-modal tasks. Utilizing scene graphs of visual scenes, ERNIE-ViL constructs Scene Graph Prediction tasks, i.e., Object Prediction, Attribute Prediction and Relationship Prediction tasks in the pre-training phase. Specifically, these prediction tasks are implemented by predicting nodes of different types in the scene graph parsed from the sentence. Thus, ERNIE-ViL can learn the joint representations characterizing the alignments of the detailed semantics across vision and language. After pre-training on large scale image-text aligned datasets, we validate the effectiveness of ERNIE-ViL on 5 cross-modal downstream tasks. ERNIE-ViL achieves state-of-the-art performances on all these tasks and ranks the first place on the VCR leaderboard with an absolute improvement of 3.7%.

翻译：我们建议一种知识强化方法,即ERNIE-VIL,它包含从场景图中获得的结构化知识,以学习视觉语言的联合表述。ERNIE-VIL试图在视觉和语言之间建立详细的语义联系(对象、物体属性和物体之间的关系),这是视觉和语言跨模式任务必不可少的。利用视觉场景图,ERNIE-VIL构建视觉图像预测任务,即目标预测、属性预测和关系预测任务,在培训前阶段。具体来说,这些预测任务是通过预测场景图中不同类型的节点来执行的。因此,ERNIE-VIL可以学习关于视觉和语言之间详细语义一致性的联合表述。在对大规模图像-文字对齐数据集进行预先培训之后,我们验证ENIE-VIE-VIL在5个跨模式下游任务上的有效性。ERNIE-VIL在所有这些任务上实现了艺术状态,并在VCRA的首选位上将VCR的绝对位列列。

1

相关内容

【AAAI2021】知识增强的视觉-语言预训练技术 ERNIE-ViL

【AAAI2021】知识增强的视觉-语言预训练技术 ERNIE-ViL

专知会员服务

26+阅读 · 2021年1月29日

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

320+阅读 · 2020年11月26日

KG-BERT：基于BERT的知识图谱补全，KG-BERT: BERT for Knowledge Graph Completion

KG-BERT：基于BERT的知识图谱补全，KG-BERT: BERT for Knowledge Graph Completion

专知会员服务

195+阅读 · 2020年5月31日

语言视觉预训练语言模型揭密，Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

语言视觉预训练语言模型揭密，Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

专知会员服务

36+阅读 · 2020年5月20日

【知识迁移视觉识别综述论文】Knowledge Transfer in Vision Recognition: A Survey

【知识迁移视觉识别综述论文】Knowledge Transfer in Vision Recognition: A Survey

专知会员服务

30+阅读 · 2020年4月19日

【AAAI2020】知识图谱表示，获取和应用的综述 25页PDF A Survey on Knowledge Graphs: Representation, Acquisition and Applications

【AAAI2020】知识图谱表示，获取和应用的综述 25页PDF A Survey on Knowledge Graphs: Representation, Acquisition and Applications

专知会员服务

95+阅读 · 2020年3月29日

【ACL2019】基于学习注意力机制的知识图谱中关系预测的嵌入 Learning Attention-based Embeddings for Relation Prediction in Knowledge Graphs

【ACL2019】基于学习注意力机制的知识图谱中关系预测的嵌入 Learning Attention-based Embeddings for Relation Prediction in Knowledge Graphs

专知会员服务

122+阅读 · 2020年3月29日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

AAAI2020必读的10篇「知识图谱（Knowledge Graph）」相关论文和代码

AAAI2020必读的10篇「知识图谱（Knowledge Graph）」相关论文和代码

专知会员服务

146+阅读 · 2020年1月10日

【表示学习(Representation Learning)】8篇 NeurIPS 2019论文选读

专知会员服务

54+阅读 · 2019年12月22日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知

133+阅读 · 2020年3月18日

17篇必看[知识图谱Knowledge Graphs] 论文@AAAI2020

17篇必看[知识图谱Knowledge Graphs] 论文@AAAI2020

专知

82+阅读 · 2020年2月13日

【Github】nlp-tutorial：TensorFlow 和 PyTorch 实现各种NLP模型

【Github】nlp-tutorial：TensorFlow 和 PyTorch 实现各种NLP模型

AINLP

14+阅读 · 2019年9月4日

CVPR 2019 | 重磅！34篇 CVPR2019 论文实现代码

CVPR 2019 | 重磅！34篇 CVPR2019 论文实现代码

AI研习社

11+阅读 · 2019年6月21日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

论文浅尝 |「知识表示学习」专题论文推荐

论文浅尝 |「知识表示学习」专题论文推荐

开放知识图谱

13+阅读 · 2018年2月12日

Retrieving Complex Tables with Multi-Granular Graph Representation Learning

Arxiv

0+阅读 · 2021年5月4日

CSKG: The CommonSense Knowledge Graph

CSKG: The CommonSense Knowledge Graph

Arxiv

18+阅读 · 2020年12月21日

AutoETER: Automated Entity Type Representation for Knowledge Graph Embedding

Arxiv

5+阅读 · 2020年10月6日

K-BERT: Enabling Language Representation with Knowledge Graph

K-BERT: Enabling Language Representation with Knowledge Graph

Arxiv

19+阅读 · 2019年9月17日

ERNIE: Enhanced Language Representation with Informative Entities

Arxiv

5+阅读 · 2019年5月17日

SCEF: A Support-Confidence-aware Embedding Framework for Knowledge Graph Refinement

SCEF: A Support-Confidence-aware Embedding Framework for Knowledge Graph Refinement

Arxiv

7+阅读 · 2019年2月18日

Attentive Relational Networks for Mapping Images to Scene Graphs

Arxiv

3+阅读 · 2018年11月26日

Representation Learning for Visual-Relational Knowledge Graphs

Arxiv

9+阅读 · 2018年3月31日

Scene Graph Parsing as Dependency Parsing

Arxiv

6+阅读 · 2018年3月25日

Incorporating Literals into Knowledge Graph Embeddings

Arxiv

4+阅读 · 2018年2月3日

VIP会员

文章信息

相关主题

state-of-the-art

相关VIP内容

【AAAI2021】知识增强的视觉-语言预训练技术 ERNIE-ViL

【AAAI2021】知识增强的视觉-语言预训练技术 ERNIE-ViL

专知会员服务

26+阅读 · 2021年1月29日

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

320+阅读 · 2020年11月26日

KG-BERT：基于BERT的知识图谱补全，KG-BERT: BERT for Knowledge Graph Completion

KG-BERT：基于BERT的知识图谱补全，KG-BERT: BERT for Knowledge Graph Completion

专知会员服务

195+阅读 · 2020年5月31日

语言视觉预训练语言模型揭密，Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

语言视觉预训练语言模型揭密，Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

专知会员服务

36+阅读 · 2020年5月20日

【知识迁移视觉识别综述论文】Knowledge Transfer in Vision Recognition: A Survey

【知识迁移视觉识别综述论文】Knowledge Transfer in Vision Recognition: A Survey

专知会员服务

30+阅读 · 2020年4月19日

【AAAI2020】知识图谱表示，获取和应用的综述 25页PDF A Survey on Knowledge Graphs: Representation, Acquisition and Applications

【AAAI2020】知识图谱表示，获取和应用的综述 25页PDF A Survey on Knowledge Graphs: Representation, Acquisition and Applications

专知会员服务

95+阅读 · 2020年3月29日

【ACL2019】基于学习注意力机制的知识图谱中关系预测的嵌入 Learning Attention-based Embeddings for Relation Prediction in Knowledge Graphs

【ACL2019】基于学习注意力机制的知识图谱中关系预测的嵌入 Learning Attention-based Embeddings for Relation Prediction in Knowledge Graphs

专知会员服务

122+阅读 · 2020年3月29日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

AAAI2020必读的10篇「知识图谱（Knowledge Graph）」相关论文和代码

AAAI2020必读的10篇「知识图谱（Knowledge Graph）」相关论文和代码

专知会员服务

146+阅读 · 2020年1月10日

【表示学习(Representation Learning)】8篇 NeurIPS 2019论文选读

专知会员服务

54+阅读 · 2019年12月22日

热门VIP内容

开通专知VIP会员享更多权益服务

《巡飞弹药（爆炸性无人机）威胁态势分析》最新24页报告

《军用后勤无人机：破解战场运输挑战的创新方案》

人工智能战争：以色列、伊朗与新型AI战争形态

《俄乌战争：现代战争未来的启示与经验》

相关资讯

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知

133+阅读 · 2020年3月18日

17篇必看[知识图谱Knowledge Graphs] 论文@AAAI2020

17篇必看[知识图谱Knowledge Graphs] 论文@AAAI2020

专知

82+阅读 · 2020年2月13日

【Github】nlp-tutorial：TensorFlow 和 PyTorch 实现各种NLP模型

【Github】nlp-tutorial：TensorFlow 和 PyTorch 实现各种NLP模型

AINLP

14+阅读 · 2019年9月4日

CVPR 2019 | 重磅！34篇 CVPR2019 论文实现代码

CVPR 2019 | 重磅！34篇 CVPR2019 论文实现代码

AI研习社

11+阅读 · 2019年6月21日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

论文浅尝 |「知识表示学习」专题论文推荐

论文浅尝 |「知识表示学习」专题论文推荐

开放知识图谱

13+阅读 · 2018年2月12日

相关论文

Retrieving Complex Tables with Multi-Granular Graph Representation Learning

Arxiv

0+阅读 · 2021年5月4日

CSKG: The CommonSense Knowledge Graph

CSKG: The CommonSense Knowledge Graph

Arxiv

18+阅读 · 2020年12月21日

AutoETER: Automated Entity Type Representation for Knowledge Graph Embedding

Arxiv

5+阅读 · 2020年10月6日

K-BERT: Enabling Language Representation with Knowledge Graph

K-BERT: Enabling Language Representation with Knowledge Graph

Arxiv

19+阅读 · 2019年9月17日

ERNIE: Enhanced Language Representation with Informative Entities

Arxiv

5+阅读 · 2019年5月17日

SCEF: A Support-Confidence-aware Embedding Framework for Knowledge Graph Refinement

SCEF: A Support-Confidence-aware Embedding Framework for Knowledge Graph Refinement

Arxiv

7+阅读 · 2019年2月18日

Attentive Relational Networks for Mapping Images to Scene Graphs

Arxiv

3+阅读 · 2018年11月26日

Representation Learning for Visual-Relational Knowledge Graphs

Arxiv

9+阅读 · 2018年3月31日

Scene Graph Parsing as Dependency Parsing

Arxiv

6+阅读 · 2018年3月25日

Incorporating Literals into Knowledge Graph Embeddings

Arxiv

4+阅读 · 2018年2月3日

微信扫码咨询专知VIP会员