场景GATE: TExt 可视解答的场景- 地块共控网络 (SceneGATE: Scene-Graph based co-Attention networks for TExt visual question answering) - 专知论文

会员服务 ·

0

Attention · Networking · Integration · 图 · 视觉问答 ·

2022 年 12 月 16 日

SceneGATE: Scene-Graph based co-Attention networks for TExt visual question answering

翻译：场景GATE: TExt 可视解答的场景- 地块共控网络

Siwen Luo,Feiqi Cao,Felipe Nunez,Zean Wen,Josiah Poon,Caren Han

Most TextVQA approaches focus on the integration of objects, scene texts and question words by a simple transformer encoder. But this fails to capture the semantic relations between different modalities. The paper proposes a Scene Graph based co-Attention Network (SceneGATE) for TextVQA, which reveals the semantic relations among the objects, Optical Character Recognition (OCR) tokens and the question words. It is achieved by a TextVQA-based scene graph that discovers the underlying semantics of an image. We created a guided-attention module to capture the intra-modal interplay between the language and the vision as a guidance for inter-modal interactions. To make explicit teaching of the relations between the two modalities, we proposed and integrated two attention modules, namely a scene graph-based semantic relation-aware attention and a positional relation-aware attention. We conducted extensive experiments on two benchmark datasets, Text-VQA and ST-VQA. It is shown that our SceneGATE method outperformed existing ones because of the scene graph and its attention modules.

翻译：多数 TextVQA 方法都侧重于由简单的变压器编码器整合对象、现场文本和问题词。但是这未能捕捉不同模式之间的语义关系。本文建议为 TextVQA 建立一个基于 SceenGATE 的场景图形共同关注网络( SceeneGATE ), 以显示对象、光学字符识别符号和问题单词之间的语义关系。基于 TextVQA 的场景图形可以发现图像的基本语义。我们创建了一个引导注意模块, 以捕捉语言和愿景之间的模式内相互作用, 作为模式间互动的指南。为了明确教授两种模式之间的关系, 我们提议并整合了两个关注模块, 即基于图像图形的语义关系识别符号和定位关系识别单词。我们对两个基准数据集( Text- VQA 和ST- VQA) 进行了广泛的实验。我们的 SceneGATE 方法由于图像图及其关注模块而超越了现有模式。

0

相关内容

Attention

最新《知识图谱复杂问答》综述论文，A Survey on Complex Question Answering over Knowledge Base: Recent Advances and Challenges

最新《知识图谱复杂问答》综述论文，A Survey on Complex Question Answering over Knowledge Base: Recent Advances and Challenges

专知会员服务

73+阅读 · 2020年7月28日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

【NLP| 推荐文章】基于文本和知识库的语义搜索（Semantic search on text and knowledge bases）

专知会员服务

46+阅读 · 2019年11月24日

【NLP| 推荐文章】知识图谱问答系统的神经网络方法介绍（Introduction to Neural Network based Approaches for Question Answering over Knowledge Graphs）

专知会员服务

59+阅读 · 2019年11月24日

【CIKM2019 Tutorial】Recent Developments of Deep Heterogeneous Information Network Analysis（深度异构信息网络分析的最新进展），附157页PDF免费下载

【CIKM2019 Tutorial】Recent Developments of Deep Heterogeneous Information Network Analysis（深度异构信息网络分析的最新进展），附157页PDF免费下载

专知会员服务

29+阅读 · 2019年11月3日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

征稿 | International Joint Conference on Knowledge Graphs (IJCKG)

征稿 | International Joint Conference on Knowledge Graphs (IJCKG)

开放知识图谱

2+阅读 · 2022年5月20日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

中国图象图形学学会CSIG

0+阅读 · 2021年11月10日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

开放知识图谱

0+阅读 · 2021年9月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

专知

66+阅读 · 2018年1月31日

ABCB6基因在眼组织缺损中的功能研究

国家自然科学基金

0+阅读 · 2014年12月31日

泛素化在棉花曲叶病毒侵染中的作用

国家自然科学基金

0+阅读 · 2014年12月31日

一类Schrodinger-Maxwell 系统解的存在性与多解性研究

国家自然科学基金

0+阅读 · 2014年12月31日

具有临界指数的Schrodinger-Poisson系统的解

国家自然科学基金

0+阅读 · 2013年12月31日

面向全基因组关联研究的动态数据挖掘与深度查询方法

国家自然科学基金

0+阅读 · 2013年12月31日

Schrodinger-Poisson方程的若干问题研究

国家自然科学基金

1+阅读 · 2012年12月31日

基于多环链接发夹型探针多聚荧光技术构建NSCLC耐药与循环miRNAs表达微阵列研究

国家自然科学基金

0+阅读 · 2012年12月31日

拟南芥DIF（DRIP1-Interacting Factor）在胁迫信号应答中的功能分析

国家自然科学基金

0+阅读 · 2012年12月31日

几个非线性Schrodinger方程组模型及相关问题研究

国家自然科学基金

0+阅读 · 2012年12月31日

一种时空白噪声驱动的Navier-Stokes方程的隐格式

国家自然科学基金

0+阅读 · 2011年12月31日

Revisiting Pre-training in Audio-Visual Learning

Arxiv

0+阅读 · 2023年2月17日

Multi-Task Learning for Visual Scene Understanding

Arxiv

29+阅读 · 2022年3月28日

Scene Graph Generation: A Comprehensive Survey

Arxiv

26+阅读 · 2022年1月3日

QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering

Arxiv

20+阅读 · 2021年5月27日

A Hierarchical Reasoning Graph Neural Network for The Automatic Scoring of Answer Transcriptions in Video Job Interviews

A Hierarchical Reasoning Graph Neural Network for The Automatic Scoring of Answer Transcriptions in Video Job Interviews

Arxiv

14+阅读 · 2020年12月22日

Time-Series Event Prediction with Evolutionary State Graph

Arxiv

14+阅读 · 2020年11月25日

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Arxiv

15+阅读 · 2020年3月31日

Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text

Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text

Arxiv

10+阅读 · 2020年3月31日

Text Generation from Knowledge Graphs with Graph Transformers

Arxiv

35+阅读 · 2019年4月4日

Rethinking Knowledge Graph Propagation for Zero-Shot Learning

Rethinking Knowledge Graph Propagation for Zero-Shot Learning

Arxiv

21+阅读 · 2019年3月27日

VIP会员

文章信息

相关主题

相关VIP内容

最新《知识图谱复杂问答》综述论文，A Survey on Complex Question Answering over Knowledge Base: Recent Advances and Challenges

最新《知识图谱复杂问答》综述论文，A Survey on Complex Question Answering over Knowledge Base: Recent Advances and Challenges

专知会员服务

73+阅读 · 2020年7月28日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

【NLP| 推荐文章】基于文本和知识库的语义搜索（Semantic search on text and knowledge bases）

专知会员服务

46+阅读 · 2019年11月24日

【NLP| 推荐文章】知识图谱问答系统的神经网络方法介绍（Introduction to Neural Network based Approaches for Question Answering over Knowledge Graphs）

专知会员服务

59+阅读 · 2019年11月24日

【CIKM2019 Tutorial】Recent Developments of Deep Heterogeneous Information Network Analysis（深度异构信息网络分析的最新进展），附157页PDF免费下载

【CIKM2019 Tutorial】Recent Developments of Deep Heterogeneous Information Network Analysis（深度异构信息网络分析的最新进展），附157页PDF免费下载

专知会员服务

29+阅读 · 2019年11月3日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《美空军条令出版物：战略打击》最新条令

《高能激光武器》22页slides

军事前沿模型

《面向小型无人机或无人飞行器的创新雷达探测与人工智能分类技术》263页

相关资讯

征稿 | International Joint Conference on Knowledge Graphs (IJCKG)

征稿 | International Joint Conference on Knowledge Graphs (IJCKG)

开放知识图谱

2+阅读 · 2022年5月20日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

中国图象图形学学会CSIG

0+阅读 · 2021年11月10日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

开放知识图谱

0+阅读 · 2021年9月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

专知

66+阅读 · 2018年1月31日

相关论文

Revisiting Pre-training in Audio-Visual Learning

Arxiv

0+阅读 · 2023年2月17日

Multi-Task Learning for Visual Scene Understanding

Arxiv

29+阅读 · 2022年3月28日

Scene Graph Generation: A Comprehensive Survey

Arxiv

26+阅读 · 2022年1月3日

QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering

Arxiv

20+阅读 · 2021年5月27日

A Hierarchical Reasoning Graph Neural Network for The Automatic Scoring of Answer Transcriptions in Video Job Interviews

A Hierarchical Reasoning Graph Neural Network for The Automatic Scoring of Answer Transcriptions in Video Job Interviews

Arxiv

14+阅读 · 2020年12月22日

Time-Series Event Prediction with Evolutionary State Graph

Arxiv

14+阅读 · 2020年11月25日

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Arxiv

15+阅读 · 2020年3月31日

Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text

Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text

Arxiv

10+阅读 · 2020年3月31日

Text Generation from Knowledge Graphs with Graph Transformers

Arxiv

35+阅读 · 2019年4月4日

Rethinking Knowledge Graph Propagation for Zero-Shot Learning

Rethinking Knowledge Graph Propagation for Zero-Shot Learning

Arxiv

21+阅读 · 2019年3月27日

相关基金

ABCB6基因在眼组织缺损中的功能研究

国家自然科学基金

0+阅读 · 2014年12月31日

泛素化在棉花曲叶病毒侵染中的作用

国家自然科学基金

0+阅读 · 2014年12月31日

一类Schrodinger-Maxwell 系统解的存在性与多解性研究

国家自然科学基金

0+阅读 · 2014年12月31日

具有临界指数的Schrodinger-Poisson系统的解

国家自然科学基金

0+阅读 · 2013年12月31日

面向全基因组关联研究的动态数据挖掘与深度查询方法

国家自然科学基金

0+阅读 · 2013年12月31日

Schrodinger-Poisson方程的若干问题研究

国家自然科学基金

1+阅读 · 2012年12月31日

基于多环链接发夹型探针多聚荧光技术构建NSCLC耐药与循环miRNAs表达微阵列研究

国家自然科学基金

0+阅读 · 2012年12月31日

拟南芥DIF（DRIP1-Interacting Factor）在胁迫信号应答中的功能分析

国家自然科学基金

0+阅读 · 2012年12月31日

几个非线性Schrodinger方程组模型及相关问题研究

国家自然科学基金

0+阅读 · 2012年12月31日

一种时空白噪声驱动的Navier-Stokes方程的隐格式

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员