用于外部知识视觉问题解答的 " 实体专用 " 高密度透视检索系统 (Entity-Focused Dense Passage Retrieval for Outside-Knowledge Visual Question Answering) - 专知论文

会员服务 ·

0

视觉问答 · 知识 (knowledge) · state-of-the-art · 自动问答 · Performer ·

2022 年 10 月 18 日

Entity-Focused Dense Passage Retrieval for Outside-Knowledge Visual Question Answering

翻译：用于外部知识视觉问题解答的 " 实体专用 " 高密度透视检索系统

Jialin Wu,Raymond J. Mooney

from arxiv, EMNLP 2022

Most Outside-Knowledge Visual Question Answering (OK-VQA) systems employ a two-stage framework that first retrieves external knowledge given the visual question and then predicts the answer based on the retrieved content. However, the retrieved knowledge is often inadequate. Retrievals are frequently too general and fail to cover specific knowledge needed to answer the question. Also, the naturally available supervision (whether the passage contains the correct answer) is weak and does not guarantee question relevancy. To address these issues, we propose an Entity-Focused Retrieval (EnFoRe) model that provides stronger supervision during training and recognizes question-relevant entities to help retrieve more specific knowledge. Experiments show that our EnFoRe model achieves superior retrieval performance on OK-VQA, the currently largest outside-knowledge VQA dataset. We also combine the retrieved knowledge with state-of-the-art VQA models, and achieve a new state-of-the-art performance on OK-VQA.

翻译：大多数外部知识的视觉问题解答系统(OK-VQA)采用一个两阶段框架,首先根据视觉问题检索外部知识,然后根据检索的内容预测答案。然而,检索的知识往往不充分。检索往往过于笼统,无法涵盖回答问题所需的具体知识。此外,自然可用的监督(无论段落是否包含正确的答案)很薄弱,不能保证问题的相关性。为了解决这些问题,我们提议了一个实体-视野检索(EnFoRe)模型,在培训期间提供更有力的监督,并承认与问题有关的实体,以帮助检索更具体的知识。实验显示,我们的EnFoRe模型在目前最大的外部知识VQA数据集即 OK-VQA 上取得了较好的检索性能。我们还将检索的知识与最新的VQA 模型结合起来,并在 OK-VQA 上取得新的最新技术表现。

0

相关内容

视觉问答

视觉问答（Visual Question Answering，VQA），是一种涉及计算机视觉和自然语言处理的学习任务。这一任务的定义如下： A VQA system takes as input an image and a free-form, open-ended, natural-language question about the image and produces a natural-language answer as the output[1]。翻译为中文：一个VQA系统以一张图片和一个关于这张图片形式自由、开放式的自然语言问题作为输入，以生成一条自然语言答案作为输出。简单来说，VQA就是给定的图片进行问答。

知识荟萃

精品入门和进阶教程、论文和代码整理等

更多

查看相关VIP内容、论文、资讯等

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【WWW2020】学习上下文化文档表示用于医疗答案检索，Learning Contextualized Document Representations for Healthcare Answer Retrieval

【WWW2020】学习上下文化文档表示用于医疗答案检索，Learning Contextualized Document Representations for Healthcare Answer Retrieval

专知会员服务

26+阅读 · 2020年2月10日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

开放知识图谱

0+阅读 · 2021年9月9日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

【论文推荐】最新九篇自动问答相关论文—可解释推理网络、上下文知识图谱嵌入、注意力RNN、Multi-Cast注意力网络

【论文推荐】最新九篇自动问答相关论文—可解释推理网络、上下文知识图谱嵌入、注意力RNN、Multi-Cast注意力网络

专知

15+阅读 · 2018年6月29日

HLA IgG3介导的P-选择素-ASK1-JNK内皮细胞凋亡信号通路在肝移植术后AMR反应中的研究

国家自然科学基金

0+阅读 · 2013年12月31日

甘草酸二铵调控糖皮质激素受体防治百草枯肺损伤机制：与TLRs信号通路相关？

国家自然科学基金

0+阅读 · 2012年12月31日

城镇居民亚健康状态的评价方法学及健康管理模式研究

国家自然科学基金

0+阅读 · 2011年12月31日

探寻与高功能孤独症和Asperger综合征相关的拷贝数变异

国家自然科学基金

0+阅读 · 2009年12月31日

同步辐射小角X射线散射先进技术研究及若干应用

国家自然科学基金

0+阅读 · 2008年12月31日

Medical Visual Question Answering: A Survey

Arxiv

15+阅读 · 2021年11月19日

A Survey on Complex Knowledge Base Question Answering: Methods, Challenges and Solutions

Arxiv

21+阅读 · 2021年5月25日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

AliMe KBQA: Question Answering over Structured Knowledge for E-commerce Customer Service

AliMe KBQA: Question Answering over Structured Knowledge for E-commerce Customer Service

Arxiv

23+阅读 · 2019年12月12日

Question Answering over Freebase via Attentive RNN with Similarity Matrix based CNN

Arxiv

11+阅读 · 2018年5月27日

VIP会员

文章信息

相关主题

知识 (knowledge)

state-of-the-art

相关VIP内容

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【WWW2020】学习上下文化文档表示用于医疗答案检索，Learning Contextualized Document Representations for Healthcare Answer Retrieval

【WWW2020】学习上下文化文档表示用于医疗答案检索，Learning Contextualized Document Representations for Healthcare Answer Retrieval

专知会员服务

26+阅读 · 2020年2月10日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【伯克利博士论文】从推理服务到模型训练：面向大规模 LLM 智能体的高效系统构建

面向作战人员负责任地寻求生成式人工智能

《Hello-Agents》项目正式发布，一起从零学习智能体！

智能体 AI (Agentic AI) 的新进展：回归初心，预见未来

相关资讯

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

会议交流 | IJCKG: International Joint Conference on Knowledge Graphs

开放知识图谱

0+阅读 · 2021年9月9日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

【论文推荐】最新九篇自动问答相关论文—可解释推理网络、上下文知识图谱嵌入、注意力RNN、Multi-Cast注意力网络

【论文推荐】最新九篇自动问答相关论文—可解释推理网络、上下文知识图谱嵌入、注意力RNN、Multi-Cast注意力网络

专知

15+阅读 · 2018年6月29日

相关论文

Medical Visual Question Answering: A Survey

Arxiv

15+阅读 · 2021年11月19日

A Survey on Complex Knowledge Base Question Answering: Methods, Challenges and Solutions

Arxiv

21+阅读 · 2021年5月25日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

AliMe KBQA: Question Answering over Structured Knowledge for E-commerce Customer Service

AliMe KBQA: Question Answering over Structured Knowledge for E-commerce Customer Service

Arxiv

23+阅读 · 2019年12月12日

Question Answering over Freebase via Attentive RNN with Similarity Matrix based CNN

Arxiv

11+阅读 · 2018年5月27日

相关基金

HLA IgG3介导的P-选择素-ASK1-JNK内皮细胞凋亡信号通路在肝移植术后AMR反应中的研究

国家自然科学基金

0+阅读 · 2013年12月31日

甘草酸二铵调控糖皮质激素受体防治百草枯肺损伤机制：与TLRs信号通路相关？

国家自然科学基金

0+阅读 · 2012年12月31日

城镇居民亚健康状态的评价方法学及健康管理模式研究

国家自然科学基金

0+阅读 · 2011年12月31日

探寻与高功能孤独症和Asperger综合征相关的拷贝数变异

国家自然科学基金

0+阅读 · 2009年12月31日

同步辐射小角X射线散射先进技术研究及若干应用

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员