A Unified End-to-End Retriever-Reader Framework for Knowledge-based VQA - 专知 Papers


Knowledge · Visual Question Answering · End-to-End · Extensibility · Automatic Question Answering

June 30, 2022

A Unified End-to-End Retriever-Reader Framework for Knowledge-based VQA


Yangyang Guo,Liqiang Nie,Yongkang Wong,Yibing Liu,Zhiyong Cheng,Mohan Kankanhalli

Knowledge-based Visual Question Answering (VQA) expects models to rely on external knowledge for robust answer prediction. Despite the importance of this task, this paper identifies several key factors that impede the advancement of current state-of-the-art methods. On the one hand, methods that exploit explicit knowledge treat the knowledge as a complement to a coarsely trained VQA model. Despite their effectiveness, these approaches often suffer from noise incorporation and error propagation. On the other hand, multi-modal implicit knowledge for knowledge-based VQA remains largely unexplored. This work presents a unified end-to-end retriever-reader framework for knowledge-based VQA. In particular, we shed light on the multi-modal implicit knowledge from vision-language pre-training models to mine its potential in knowledge reasoning. To address the noise introduced by retrieval over explicit knowledge, we design a novel scheme that creates pseudo labels for effective knowledge supervision. This scheme not only provides guidance for knowledge retrieval, but also drops instances that are potentially error-prone for question answering. To validate the effectiveness of the proposed method, we conduct extensive experiments on the benchmark dataset. The experimental results show that our method outperforms existing baselines by a noticeable margin. Beyond the reported numbers, this paper also offers several insights on knowledge utilization for future research, supported by empirical findings.
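The abstract does not say how the pseudo labels are built, so the following is only a minimal sketch of one plausible reading: a retrieved knowledge entry is marked positive when it mentions a ground-truth answer, and a training instance with no positive entry is dropped as potentially error-prone. The substring-match heuristic, the `Instance` type, and the function names are all assumptions made for illustration, not the authors' method.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Instance:
    """Hypothetical training instance (names are illustrative only)."""
    question: str
    answers: List[str]    # ground-truth answers for the question
    retrieved: List[str]  # knowledge entries returned by some retriever


def pseudo_labels(instance: Instance) -> List[int]:
    """Assumed heuristic: label an entry 1 if it mentions any answer, else 0."""
    answers = [a.lower() for a in instance.answers]
    return [
        int(any(a in entry.lower() for a in answers))
        for entry in instance.retrieved
    ]


def build_supervision(
    instances: List[Instance],
) -> Tuple[List[Tuple[Instance, List[int]]], List[Instance]]:
    """Keep instances with at least one positive entry; drop the rest."""
    kept, dropped = [], []
    for inst in instances:
        labels = pseudo_labels(inst)
        if any(labels):
            kept.append((inst, labels))
        else:
            dropped.append(inst)
    return kept, dropped


# Toy data, invented for this example.
examples = [
    Instance(
        question="What sport is being played?",
        answers=["tennis"],
        retrieved=["Tennis is a racket sport.", "A court is a playing surface."],
    ),
    Instance(
        question="What is the animal eating?",
        answers=["bamboo"],
        retrieved=["Pandas live in China."],
    ),
]

kept, dropped = build_supervision(examples)
for inst, labels in kept:
    print(inst.question, labels)
print("dropped:", [inst.question for inst in dropped])
```

In this toy run the first instance is kept with one positive and one negative entry, while the second is dropped because no retrieved entry mentions its answer. In a real pipeline the labels would supervise the retriever and the filtered set would train the reader.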



Related content

Knowledge

Understanding, judgment, or skill acquired through study, practice, or exploration.

Introduction to Linux, 96-page PPT
专知 Member Service
80+ reads · July 26, 2020

Latest collection of 100+ papers on Self-Supervised Learning
专知 Member Service
165+ reads · March 18, 2020

Bag of Tricks for Image Classification, 17-page PPT
专知 Member Service
95+ reads · March 12, 2020

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation
专知 Member Service
59+ reads · October 17, 2019

Keras, François Chollet, "Deep Learning with Python", 386-page PDF
专知 Member Service
160+ reads · October 12, 2019

Latest reinforcement learning tutorial, 17-page PDF
专知 Member Service
181+ reads · October 11, 2019

Experience and advice for getting started with machine learning
专知 Member Service
94+ reads · October 10, 2019

[UC Berkeley PhD dissertation] Learning to generalize via self-supervised prediction
专知 Member Service
65+ reads · October 9, 2019

[SIGGRAPH2019] TensorFlow 2.0 deep learning applications in computer graphics
专知 Member Service
41+ reads · October 9, 2019

Latest list of BERT-related papers
专知 Member Service
53+ reads · September 29, 2019

VCIP 2022 Call for Demos
CCF Technical Committee on Multimedia
1+ reads · June 6, 2022

IEEE ICKG 2022: Call for Papers
Machine Learning and Recommendation Algorithms
3+ reads · March 30, 2022

ACM MM 2022 Call for Papers
CCF Technical Committee on Multimedia
5+ reads · March 29, 2022

IEEE TII Call For Papers
CCF Technical Committee on Multimedia
3+ reads · March 24, 2022

AIART 2022 Call for Papers
CCF Technical Committee on Multimedia
1+ reads · February 13, 2022

[ICIG2021] Latest News & Announcements of the Tutorial
China Society of Image and Graphics (CSIG)
3+ reads · December 20, 2021

[ICIG2021] Check out the hot new trailer of ICIG2021 Symposium6
China Society of Image and Graphics (CSIG)
2+ reads · November 12, 2021

Hierarchically Structured Meta-learning
CreateAMind
27+ reads · May 22, 2019

Transferring Knowledge across Learning Processes
CreateAMind
29+ reads · May 18, 2019

Unsupervised Learning via Meta-Learning
CreateAMind
43+ reads · January 3, 2019

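Molecular mechanism of the TMS1 gene's response to high-temperature stress and ER stress
National Natural Science Foundation of China
0+ reads · December 31, 2014

Solvothermal preparation and physical properties of FeSe-based superconducting materials
National Natural Science Foundation of China
0+ reads · December 31, 2014

Key scientific problems in the fabrication of superstrate-structure copper-zinc-selenium-sulfur solar cells
National Natural Science Foundation of China
0+ reads · December 31, 2014

The Lp-Minkowski problem and related Monge-Ampere type equations
National Natural Science Foundation of China
0+ reads · December 31, 2013

Atomistic simulation of the formation and evolution of characteristic structures during dealloying of Cu-Pt nanoparticles
National Natural Science Foundation of China
0+ reads · December 31, 2013

Restriction theorems, spectral multipliers, and related problems
National Natural Science Foundation of China
1+ reads · December 31, 2012

Symmetry in complex geometry and its applications in mathematical physics
National Natural Science Foundation of China
0+ reads · December 31, 2012

Preparation of Pt/Heusler alloy/MgO-based thin films with perpendicular magnetic anisotropy and study of the magnetic anisotropy mechanism
National Natural Science Foundation of China
0+ reads · December 31, 2012

Experimental study of UTMD "reversible opening and closing" of the blood-retinal barrier combined with rAAV-MERTK for treating retinitis pigmentosa
National Natural Science Foundation of China
0+ reads · December 31, 2012

Preparation and properties of cuprous-doped tin iodide-based layered perovskite-like organic-inorganic hybrid materials
National Natural Science Foundation of China
0+ reads · December 31, 2009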

Locate Then Ask: Interpretable Stepwise Reasoning for Multi-hop Question Answering
arXiv
0+ reads · August 22, 2022

Pre-training Tasks for User Intent Detection and Embedding Retrieval in E-commerce Search
arXiv
0+ reads · August 22, 2022

What Makes the Story Forward? Inferring Commonsense Explanations as Prompts for Future Event Generation
arXiv
0+ reads · August 18, 2022

Contrastive Spatio-Temporal Pretext Learning for Self-supervised Video Representation
arXiv
11+ reads · December 16, 2021

Optimizing Reusable Knowledge for Continual Learning via Metalearning
arXiv
15+ reads · June 9, 2021

Learning from Very Few Samples: A Survey
arXiv
126+ reads · September 6, 2020

Look-into-Object: Self-supervised Structure Modeling for Object Recognition
arXiv
15+ reads · March 31, 2020

Representation Learning with Ordered Relation Paths for Knowledge Graph Completion
arXiv
12+ reads · September 26, 2019

Learning over Knowledge-Base Embeddings for Recommendation
arXiv
23+ reads · March 22, 2018

VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual Questions
arXiv
17+ reads · March 20, 2018
