Overcoming Language Priors in Visual Question Answering via Distinguishing Superficially Similar Instances - Zhuanzhi Papers


Visual question answering · Similarity · Examples · Automatic question answering · ForCES

September 18, 2022

Overcoming Language Priors in Visual Question Answering via Distinguishing Superficially Similar Instances


Yike Wu, Yu Zhao, Shiwan Zhao, Ying Zhang, Xiaojie Yuan, Guoqing Zhao, Ning Jiang

From arXiv; published in COLING 2022

Despite the great progress of Visual Question Answering (VQA), current VQA models rely heavily on the superficial correlation between the question type and its corresponding frequent answers (i.e., language priors) to make predictions, without really understanding the input. In this work, we define training instances with the same question type but different answers as superficially similar instances, and attribute the language priors to the VQA model's confusion on such instances. To solve this problem, we propose a novel training framework that explicitly encourages the VQA model to distinguish between superficially similar instances. Specifically, for each training instance, we first construct a set that contains its superficially similar counterparts. We then use the proposed distinguishing module to increase the distance between the instance and its counterparts in the answer space. In this way, the VQA model is forced to focus on the parts of the input beyond the question type, which helps to overcome the language priors. Experimental results show that our method achieves state-of-the-art performance on VQA-CP v2. Code is available at https://github.com/wyk-nku/Distinguishing-VQA.git (Distinguishing-VQA).
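The two steps named in the abstract (building a set of superficially similar counterparts, then pushing an instance away from them in the answer space) can be illustrated with a minimal sketch. This is not the authors' implementation: the dictionary fields `question_type` and `answer`, the function names, the Euclidean distance, and the hinge margin are all assumptions made for illustration, and the actual distinguishing module in the paper may differ.

```python
from collections import defaultdict
import math


def build_counterpart_sets(instances):
    """Map each instance index to the indices of instances that share its
    question type but have a different answer (hypothetical field names)."""
    by_type = defaultdict(list)
    for idx, inst in enumerate(instances):
        by_type[inst["question_type"]].append(idx)
    return {
        idx: [j for j in by_type[inst["question_type"]]
              if instances[j]["answer"] != inst["answer"]]
        for idx, inst in enumerate(instances)
    }


def euclidean(p, q):
    """Euclidean distance between two answer-score vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def distinguishing_penalty(pred, counterpart_preds, margin=1.0):
    """Assumed hinge-style penalty: it is zero once every counterpart
    prediction lies at least `margin` away from `pred`."""
    if not counterpart_preds:
        return 0.0
    total = sum(max(0.0, margin - euclidean(pred, c)) for c in counterpart_preds)
    return total / len(counterpart_preds)


# Toy data: three "what color" questions and one "how many" question.
instances = [
    {"question_type": "what color", "answer": "white"},
    {"question_type": "what color", "answer": "red"},
    {"question_type": "how many", "answer": "2"},
    {"question_type": "what color", "answer": "white"},
]
print(build_counterpart_sets(instances))  # {0: [1], 1: [0, 3], 2: [], 3: [1]}
```

In a training loop, a penalty of this kind would be added to the usual VQA loss, so that minimizing it moves predictions for instances with the same question type but different answers further apart.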



Related content

Visual question answering

Visual Question Answering (VQA) is a learning task that involves both computer vision and natural language processing. The task is defined as follows: "A VQA system takes as input an image and a free-form, open-ended, natural-language question about the image and produces a natural-language answer as the output" [1]. Put simply, VQA means answering questions about a given image.



