Most Outside-Knowledge Visual Question Answering (OK-VQA) systems employ a two-stage framework that first retrieves external knowledge given the visual question and then predicts the answer based on the retrieved content. However, the retrieved knowledge is often inadequate: retrievals are frequently too general to cover the specific knowledge needed to answer the question. Moreover, the naturally available supervision (whether the passage contains the correct answer) is weak and does not guarantee question relevance. To address these issues, we propose an Entity-Focused Retrieval (EnFoRe) model that provides stronger supervision during training and recognizes question-relevant entities to help retrieve more specific knowledge. Experiments show that our EnFoRe model achieves superior retrieval performance on OK-VQA, currently the largest outside-knowledge VQA dataset. We also combine the retrieved knowledge with state-of-the-art VQA models and achieve new state-of-the-art performance on OK-VQA.