通向说明理由的可解释性VQA (Towards Reasoning-Aware Explainable VQA) - 专知论文

会员服务 ·

0

视觉问答 · 模型评估 · MoDELS · SOTA · 黑盒子 ·

2022 年 11 月 9 日

Towards Reasoning-Aware Explainable VQA

翻译：通向说明理由的可解释性VQA

Rakesh Vaideeswaran,Feng Gao,Abhinav Mathur,Govind Thattai

The domain of joint vision-language understanding, especially in the context of reasoning in Visual Question Answering (VQA) models, has garnered significant attention in the recent past. While most of the existing VQA models focus on improving the accuracy of VQA, the way models arrive at an answer is oftentimes a black box. As a step towards making the VQA task more explainable and interpretable, our method is built upon the SOTA VQA framework by augmenting it with an end-to-end explanation generation module. In this paper, we investigate two network architectures, including Long Short-Term Memory (LSTM) and Transformer decoder, as the explanation generator. Our method generates human-readable textual explanations while maintaining SOTA VQA accuracy on the GQA-REX (77.49%) and VQA-E (71.48%) datasets. Approximately 65.16% of the generated explanations are approved by humans as valid. Roughly 60.5% of the generated explanations are valid and lead to the correct answers.

翻译：共同愿景语言理解的领域,特别是在视觉问答模型的推理方面,最近引起了人们的极大关注。虽然大多数现有的 VQA 模型侧重于提高 VQA 的准确性,但模型到达答案的方式往往是一个黑盒。作为使VQA任务更能解释和解释的一步,我们的方法建立在SOTA VQA框架上,方法是通过一个端到端解释生成模块来增加它。在本文中,我们调查了两个网络结构,包括长期短期内存(LSTM)和变换器解码器,作为解释生成者。我们的方法产生人类可读的文本解释,同时在GQA-REX(77.49%)和VQA-E(71.48%)数据集上保持SOTA VQA-E(71.48%)的准确性。大约65.16%的解释得到人类的认可。大约60.5%的解释是有效的,并导致正确的答案。

1

相关内容

视觉问答

视觉问答（Visual Question Answering，VQA），是一种涉及计算机视觉和自然语言处理的学习任务。这一任务的定义如下： A VQA system takes as input an image and a free-form, open-ended, natural-language question about the image and produces a natural-language answer as the output[1]。翻译为中文：一个VQA系统以一张图片和一个关于这张图片形式自由、开放式的自然语言问题作为输入，以生成一条自然语言答案作为输出。简单来说，VQA就是给定的图片进行问答。

知识荟萃

精品入门和进阶教程、论文和代码整理等

更多

查看相关VIP内容、论文、资讯等

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

知识图嵌入和可解释人工智能 Knowledge Graph Embeddings and Explainable AI

知识图嵌入和可解释人工智能 Knowledge Graph Embeddings and Explainable AI

专知会员服务

135+阅读 · 2020年5月1日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

【论文推荐】可解释神经网络，Towards Explainable Deep Neural Networks (xDNN)

【论文推荐】可解释神经网络，Towards Explainable Deep Neural Networks (xDNN)

专知会员服务

40+阅读 · 2019年12月5日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

AI可解释性文献列表

AI可解释性文献列表

专知

42+阅读 · 2019年10月7日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新7篇视觉问答（VQA）相关论文—解释、读写记忆网络、逆视觉问答、视觉推理、可解释性、注意力机制、计数

【论文推荐】最新7篇视觉问答（VQA）相关论文—解释、读写记忆网络、逆视觉问答、视觉推理、可解释性、注意力机制、计数

专知

30+阅读 · 2018年3月22日

【论文推荐】最新6篇视觉问答（VQA）相关论文—目标推理、深度循环模型、可解释性、数据可视化、Triplet学习、基准

【论文推荐】最新6篇视觉问答（VQA）相关论文—目标推理、深度循环模型、可解释性、数据可视化、Triplet学习、基准

专知

15+阅读 · 2018年2月3日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

新型光合藻微生物燃料电池高效转化二氧化碳研究

国家自然科学基金

0+阅读 · 2015年12月31日

介孔复合微纳结构CaTi2O5的可控制备及光催化性能研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于TGF-β/Smad 信号通路的莪术醋制抗肝纤维化代谢组学研究

国家自然科学基金

0+阅读 · 2014年12月31日

Irisin调控肌肉与脂肪对胰岛素敏感性的研究

国家自然科学基金

0+阅读 · 2014年12月31日

甲醇制丙烯反应与扩散的介尺度耦合机制及其催化剂积碳的调控研究

国家自然科学基金

0+阅读 · 2013年12月31日

靶向微管蛋白秋水仙碱位点的白藜芦醇-Combrestatin A-4类抑制剂的设计、合成及活性研究

国家自然科学基金

0+阅读 · 2013年12月31日

二维金属基纳米材料的可控制备和催化性能研究

国家自然科学基金

0+阅读 · 2013年12月31日

表面态对氧化物半导体低维纳米结构光电性能的影响

国家自然科学基金

0+阅读 · 2012年12月31日

HC-SCR反应中乙醇催化制氢与还原剂活化耦合研究

国家自然科学基金

0+阅读 · 2011年12月31日

GeS气溶胶/MOFs复合多孔纳米材料可见光催化CO2和H2O合成甲醇的研究

国家自然科学基金

0+阅读 · 2009年12月31日

Towards Autoformalization of Mathematics and Code Correctness: Experiments with Elementary Proofs

Arxiv

0+阅读 · 2023年1月5日

Towards Reasoning in Large Language Models: A Survey

Arxiv

34+阅读 · 2022年12月20日

A Survey on Explainable Reinforcement Learning: Concepts, Algorithms, Challenges

Arxiv

28+阅读 · 2022年11月15日

ProtGNN: Towards Self-Explaining Graph Neural Networks

Arxiv

22+阅读 · 2021年12月2日

QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering

Arxiv

20+阅读 · 2021年5月27日

A Survey of the State of Explainable AI for Natural Language Processing

Arxiv

26+阅读 · 2020年10月1日

Machine Reasoning Explainability

Arxiv

14+阅读 · 2020年9月1日

Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI

Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI

Arxiv

77+阅读 · 2019年10月22日

Explainable Reasoning over Knowledge Graphs for Recommendation

Arxiv

11+阅读 · 2018年11月12日

VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual Questions

Arxiv

17+阅读 · 2018年3月20日

VIP会员

文章信息

相关主题

相关VIP内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

知识图嵌入和可解释人工智能 Knowledge Graph Embeddings and Explainable AI

知识图嵌入和可解释人工智能 Knowledge Graph Embeddings and Explainable AI

专知会员服务

135+阅读 · 2020年5月1日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

【论文推荐】可解释神经网络，Towards Explainable Deep Neural Networks (xDNN)

【论文推荐】可解释神经网络，Towards Explainable Deep Neural Networks (xDNN)

专知会员服务

40+阅读 · 2019年12月5日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《多智能体不确定环境追逃博弈研究》216页

美智库最新发布《解放军"人机编组协同作战"发展路径：理论与实践》53页

现代战争"杀伤区"理论：空间尺度与结构特征、控制手段与毁伤机制、生存策略与战线转移

《俄军无人机创新技术或已在乌克兰达成"战场空中封锁"作战效果》最新18页报告

相关资讯

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

AI可解释性文献列表

AI可解释性文献列表

专知

42+阅读 · 2019年10月7日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新7篇视觉问答（VQA）相关论文—解释、读写记忆网络、逆视觉问答、视觉推理、可解释性、注意力机制、计数

【论文推荐】最新7篇视觉问答（VQA）相关论文—解释、读写记忆网络、逆视觉问答、视觉推理、可解释性、注意力机制、计数

专知

30+阅读 · 2018年3月22日

【论文推荐】最新6篇视觉问答（VQA）相关论文—目标推理、深度循环模型、可解释性、数据可视化、Triplet学习、基准

【论文推荐】最新6篇视觉问答（VQA）相关论文—目标推理、深度循环模型、可解释性、数据可视化、Triplet学习、基准

专知

15+阅读 · 2018年2月3日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

相关论文

Towards Autoformalization of Mathematics and Code Correctness: Experiments with Elementary Proofs

Arxiv

0+阅读 · 2023年1月5日

Towards Reasoning in Large Language Models: A Survey

Arxiv

34+阅读 · 2022年12月20日

A Survey on Explainable Reinforcement Learning: Concepts, Algorithms, Challenges

Arxiv

28+阅读 · 2022年11月15日

ProtGNN: Towards Self-Explaining Graph Neural Networks

Arxiv

22+阅读 · 2021年12月2日

QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering

Arxiv

20+阅读 · 2021年5月27日

A Survey of the State of Explainable AI for Natural Language Processing

Arxiv

26+阅读 · 2020年10月1日

Machine Reasoning Explainability

Arxiv

14+阅读 · 2020年9月1日

Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI

Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI

Arxiv

77+阅读 · 2019年10月22日

Explainable Reasoning over Knowledge Graphs for Recommendation

Arxiv

11+阅读 · 2018年11月12日

VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual Questions

Arxiv

17+阅读 · 2018年3月20日

相关基金

新型光合藻微生物燃料电池高效转化二氧化碳研究

国家自然科学基金

0+阅读 · 2015年12月31日

介孔复合微纳结构CaTi2O5的可控制备及光催化性能研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于TGF-β/Smad 信号通路的莪术醋制抗肝纤维化代谢组学研究

国家自然科学基金

0+阅读 · 2014年12月31日

Irisin调控肌肉与脂肪对胰岛素敏感性的研究

国家自然科学基金

0+阅读 · 2014年12月31日

甲醇制丙烯反应与扩散的介尺度耦合机制及其催化剂积碳的调控研究

国家自然科学基金

0+阅读 · 2013年12月31日

靶向微管蛋白秋水仙碱位点的白藜芦醇-Combrestatin A-4类抑制剂的设计、合成及活性研究

国家自然科学基金

0+阅读 · 2013年12月31日

二维金属基纳米材料的可控制备和催化性能研究

国家自然科学基金

0+阅读 · 2013年12月31日

表面态对氧化物半导体低维纳米结构光电性能的影响

国家自然科学基金

0+阅读 · 2012年12月31日

HC-SCR反应中乙醇催化制氢与还原剂活化耦合研究

国家自然科学基金

0+阅读 · 2011年12月31日

GeS气溶胶/MOFs复合多孔纳米材料可见光催化CO2和H2O合成甲醇的研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员