关系变换器网络 (Relation Transformer Network) - 专知论文

会员服务 ·

0

INTERACT · 变换 · Networking · 位置嵌入 · 图 ·

2021 年 7 月 20 日

Relation Transformer Network

翻译：关系变换器网络

Rajat Koner,Suprosanna Shit,Volker Tresp

The extraction of a scene graph with objects as nodes and mutual relationships as edges is the basis for a deep understanding of image content. Despite recent advances, such as message passing and joint classification, the detection of visual relationships remains a challenging task due to sub-optimal exploration of the mutual interaction among the visual objects. In this work, we propose a novel transformer formulation for scene graph generation and relation prediction. We leverage the encoder-decoder architecture of the transformer for rich feature embedding of nodes and edges. Specifically, we model the node-to-node interaction with the self-attention of the transformer encoder and the edge-to-node interaction with the cross-attention of the transformer decoder. Further, we introduce a novel positional embedding suitable to handle edges in the decoder. Finally, our relation prediction module classifies the directed relation from the learned node and edge embedding. We name this architecture as Relation Transformer Network (RTN). On the Visual Genome and GQA dataset, we have achieved an overall mean of 4.85% and 3.1% point improvement in comparison with state-of-the-art methods. Our experiments show that Relation Transformer can efficiently model context across various datasets with small, medium, and large-scale relation classification.

翻译：提取带有节点和边缘等对象的图像图形,是深入理解图像内容的基础。尽管最近取得了一些进步,例如信息传递和联合分类等,但发现视觉关系仍然是一项艰巨的任务,因为对视觉对象之间的相互作用进行亚最佳的探索。在这项工作中,我们建议为屏幕图形生成和关系预测提供新型变压器配方。我们利用变压器变压器的编码器解码器结构嵌入节点和边缘的丰富特征嵌入。具体地说,我们模拟与变压器编码器自我保存的节点到节点互动和与变压器解码器交叉保护的边缘到节点互动。此外,我们引入了一种新的定位嵌入适合处理解码器边缘的配置。最后,我们的关系预测模块将所学节点和边缘嵌入的导出关系归为Relationer 变压器网络(RTN)。在视觉基因组和GQA数据集方面,我们实现了4.85%和3.1%的边际互动,我们实现了整个变压模型的总体平均值,与大变压器的中位对比,我们可以展示各种变压的中标。

0

相关内容

INTERACT

IFIP TC13 Conference on Human-Computer Interaction是人机交互领域的研究者和实践者展示其工作的重要平台。多年来，这些会议吸引了来自几个国家和文化的研究人员。官网链接：http://interact2019.org/

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

321+阅读 · 2020年11月26日

Transformer模型-深度学习自然语言处理，17页ppt

Transformer模型-深度学习自然语言处理，17页ppt

专知会员服务

107+阅读 · 2020年8月30日

【ICLR 2019】双曲注意力网络，Hyperbolic Attention Network

【ICLR 2019】双曲注意力网络，Hyperbolic Attention Network

专知会员服务

84+阅读 · 2020年6月21日

Transformer文本分类代码

Transformer文本分类代码

专知会员服务

118+阅读 · 2020年2月3日

【论文推荐】基于元学习的小样本链接预测：FEW SHOT LINK PREDICTION VIA META LEARNING

【论文推荐】基于元学习的小样本链接预测：FEW SHOT LINK PREDICTION VIA META LEARNING

专知会员服务

57+阅读 · 2019年12月23日

【图机器学习论文】综述：网络表示学习（Network Representation Learning: A Survey）

【图机器学习论文】综述：网络表示学习（Network Representation Learning: A Survey）

专知会员服务

91+阅读 · 2019年12月16日

【NeurIPS2019】图变换网络：Graph Transformer Network

【NeurIPS2019】图变换网络：Graph Transformer Network

专知会员服务

112+阅读 · 2019年11月25日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

Transformer中的相对位置编码

Transformer中的相对位置编码

AINLP

5+阅读 · 2020年11月28日

Graph Neural Network（GNN）最全资源整理分享

Graph Neural Network（GNN）最全资源整理分享

深度学习与NLP

339+阅读 · 2019年7月9日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

AINLP

40+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

语义分割 | context relation

语义分割 | context relation

极市平台

8+阅读 · 2019年2月9日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

SBAT: Video Captioning with Sparse Boundary-Aware Transformer

Arxiv

4+阅读 · 2020年7月23日

Generalized Multi-Relational Graph Convolution Network

Arxiv

10+阅读 · 2020年6月12日

MHSAN: Multi-Head Self-Attention Network for Visual Semantic Embedding

MHSAN: Multi-Head Self-Attention Network for Visual Semantic Embedding

Arxiv

4+阅读 · 2020年1月11日

Text Generation from Knowledge Graphs with Graph Transformers

Arxiv

3+阅读 · 2019年5月18日

Convolutional Self-Attention Network

Arxiv

6+阅读 · 2019年4月8日

Relation-aware Graph Attention Network for Visual Question Answering

Arxiv

4+阅读 · 2019年3月29日

Neural Speech Synthesis with Transformer Network

Neural Speech Synthesis with Transformer Network

Arxiv

5+阅读 · 2019年1月30日

Attentive Relational Networks for Mapping Images to Scene Graphs

Arxiv

3+阅读 · 2018年11月26日

Self-Attention Recurrent Network for Saliency Detection

Self-Attention Recurrent Network for Saliency Detection

Arxiv

5+阅读 · 2018年8月5日

R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering

Arxiv

7+阅读 · 2018年5月24日

VIP会员

文章信息

相关主题

相关VIP内容

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

321+阅读 · 2020年11月26日

Transformer模型-深度学习自然语言处理，17页ppt

Transformer模型-深度学习自然语言处理，17页ppt

专知会员服务

107+阅读 · 2020年8月30日

【ICLR 2019】双曲注意力网络，Hyperbolic Attention Network

【ICLR 2019】双曲注意力网络，Hyperbolic Attention Network

专知会员服务

84+阅读 · 2020年6月21日

Transformer文本分类代码

Transformer文本分类代码

专知会员服务

118+阅读 · 2020年2月3日

【论文推荐】基于元学习的小样本链接预测：FEW SHOT LINK PREDICTION VIA META LEARNING

【论文推荐】基于元学习的小样本链接预测：FEW SHOT LINK PREDICTION VIA META LEARNING

专知会员服务

57+阅读 · 2019年12月23日

【图机器学习论文】综述：网络表示学习（Network Representation Learning: A Survey）

【图机器学习论文】综述：网络表示学习（Network Representation Learning: A Survey）

专知会员服务

91+阅读 · 2019年12月16日

【NeurIPS2019】图变换网络：Graph Transformer Network

【NeurIPS2019】图变换网络：Graph Transformer Network

专知会员服务

112+阅读 · 2019年11月25日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

热门VIP内容

开通专知VIP会员享更多权益服务

《物联网（IoT）中的无人机通信高效控制》135页

《在GNSS信号降级环境中利用共识实现无人机集群稳健协调》

中程单向攻击无人机的战略意义：俄乌战争启示

《面向无人机集群的避障动态传感器覆盖算法》最新38页

相关资讯

Transformer中的相对位置编码

Transformer中的相对位置编码

AINLP

5+阅读 · 2020年11月28日

Graph Neural Network（GNN）最全资源整理分享

Graph Neural Network（GNN）最全资源整理分享

深度学习与NLP

339+阅读 · 2019年7月9日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

BERT/注意力机制/Transformer/迁移学习NLP资源大列表：awesome-bert-nlp

AINLP

40+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

语义分割 | context relation

语义分割 | context relation

极市平台

8+阅读 · 2019年2月9日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

相关论文

SBAT: Video Captioning with Sparse Boundary-Aware Transformer

Arxiv

4+阅读 · 2020年7月23日

Generalized Multi-Relational Graph Convolution Network

Arxiv

10+阅读 · 2020年6月12日

MHSAN: Multi-Head Self-Attention Network for Visual Semantic Embedding

MHSAN: Multi-Head Self-Attention Network for Visual Semantic Embedding

Arxiv

4+阅读 · 2020年1月11日

Text Generation from Knowledge Graphs with Graph Transformers

Arxiv

3+阅读 · 2019年5月18日

Convolutional Self-Attention Network

Arxiv

6+阅读 · 2019年4月8日

Relation-aware Graph Attention Network for Visual Question Answering

Arxiv

4+阅读 · 2019年3月29日

Neural Speech Synthesis with Transformer Network

Neural Speech Synthesis with Transformer Network

Arxiv

5+阅读 · 2019年1月30日

Attentive Relational Networks for Mapping Images to Scene Graphs

Arxiv

3+阅读 · 2018年11月26日

Self-Attention Recurrent Network for Saliency Detection

Self-Attention Recurrent Network for Saliency Detection

Arxiv

5+阅读 · 2018年8月5日

R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering

Arxiv

7+阅读 · 2018年5月24日

微信扫码咨询专知VIP会员