人-人-人-人-人-人视觉问题解答 (Human-Adversarial Visual Question Answering) - 专知论文

会员服务 ·

0

视觉问答 · state-of-the-art · INTERACT · Performer · MoDELS ·

2021 年 6 月 4 日

Human-Adversarial Visual Question Answering

翻译：人-人-人-人-人-人视觉问题解答

Sasha Sheng,Amanpreet Singh,Vedanuj Goswami,Jose Alberto Lopez Magana,Wojciech Galuba,Devi Parikh,Douwe Kiela

from arxiv, 22 pages, 13 figures. First two authors contributed equally

Performance on the most commonly used Visual Question Answering dataset (VQA v2) is starting to approach human accuracy. However, in interacting with state-of-the-art VQA models, it is clear that the problem is far from being solved. In order to stress test VQA models, we benchmark them against human-adversarial examples. Human subjects interact with a state-of-the-art VQA model, and for each image in the dataset, attempt to find a question where the model's predicted answer is incorrect. We find that a wide range of state-of-the-art models perform poorly when evaluated on these examples. We conduct an extensive analysis of the collected adversarial examples and provide guidance on future research directions. We hope that this Adversarial VQA (AdVQA) benchmark can help drive progress in the field and advance the state of the art.

翻译：在最常用的视觉问题回答数据集(VQA v2)上的性能开始接近人的准确性。然而,在与最新的VQA模型进行互动时,问题显然远未解决。为了对VQA模型进行压力测试,我们根据人与人之间的对立实例进行基准测试。人类主体与最先进的VQA模型互动,并针对数据集中的每张图像,试图找到一个模型预测的答案不正确的问题。我们发现,在评估这些实例时,大量最先进的模型表现不佳。我们对收集到的对抗性实例进行广泛分析,并就未来的研究方向提供指导。我们希望,Adversarial VQA(AdVQA)基准能够帮助推动实地的进步,推进艺术的状态。

1

相关内容

视觉问答

视觉问答（Visual Question Answering，VQA），是一种涉及计算机视觉和自然语言处理的学习任务。这一任务的定义如下： A VQA system takes as input an image and a free-form, open-ended, natural-language question about the image and produces a natural-language answer as the output[1]。翻译为中文：一个VQA系统以一张图片和一个关于这张图片形式自由、开放式的自然语言问题作为输入，以生成一条自然语言答案作为输出。简单来说，VQA就是给定的图片进行问答。

知识荟萃

精品入门和进阶教程、论文和代码整理等

更多

查看相关VIP内容、论文、资讯等

复杂的序列数据分析：现有算法的系统文献综述，Complex Sequential Data Analysis: A Systematic Literature Review of Existing Algorithms

复杂的序列数据分析：现有算法的系统文献综述，Complex Sequential Data Analysis: A Systematic Literature Review of Existing Algorithms

专知会员服务

27+阅读 · 2020年7月24日

【CVPR2020】通过自适应GANs生成不同的图像，Diverse Image Generation via Self-Conditioned GANs

【CVPR2020】通过自适应GANs生成不同的图像，Diverse Image Generation via Self-Conditioned GANs

专知会员服务

34+阅读 · 2020年6月19日

【CVPR2020】自监督的深度视觉测程与在线适应，Self-Supervised Deep Visual Odometry

【CVPR2020】自监督的深度视觉测程与在线适应，Self-Supervised Deep Visual Odometry

专知会员服务

32+阅读 · 2020年5月14日

【DeepMind】基于变换的大规模数据对抗视频预测，Transformation-based Adversarial Video Prediction on Large-Scale Data

【DeepMind】基于变换的大规模数据对抗视频预测，Transformation-based Adversarial Video Prediction on Large-Scale Data

专知会员服务

17+阅读 · 2020年3月9日

【自监督学习深度神经网络视觉特征学习综述论文】Self-supervised Visual Feature Learning with Deep Neural Networks: A Survey

【自监督学习深度神经网络视觉特征学习综述论文】Self-supervised Visual Feature Learning with Deep Neural Networks: A Survey

专知会员服务

87+阅读 · 2020年3月1日

【Facebook AI】对抗性NLI:自然语言理解的新基准，Adversarial NLI: A New Benchmark for Natural Language Understanding

【Facebook AI】对抗性NLI:自然语言理解的新基准，Adversarial NLI: A New Benchmark for Natural Language Understanding

专知会员服务

11+阅读 · 2019年11月2日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新六篇视觉问答相关论文—鲁棒性分析、虚拟意象、双曲注意力网络、R-VQA、关系推理、双线性注意力网络

【论文推荐】最新六篇视觉问答相关论文—鲁棒性分析、虚拟意象、双曲注意力网络、R-VQA、关系推理、双线性注意力网络

专知

17+阅读 · 2018年6月7日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

ICCV17 :12为顶级大牛教你学生成对抗网络（GAN)！

ICCV17 :12为顶级大牛教你学生成对抗网络（GAN)！

全球人工智能

8+阅读 · 2017年11月26日

【推荐】自然语言处理（NLP）指南

【推荐】自然语言处理（NLP）指南

机器学习研究会

35+阅读 · 2017年11月17日

【论文】图上的表示学习综述

【论文】图上的表示学习综述

机器学习研究会

15+阅读 · 2017年9月24日

【推荐】视频目标分割基础

【推荐】视频目标分割基础

机器学习研究会

9+阅读 · 2017年9月19日

【推荐】GAN架构入门综述(资源汇总)

【推荐】GAN架构入门综述(资源汇总)

机器学习研究会

10+阅读 · 2017年9月3日

Generating Question Relevant Captions to Aid Visual Question Answering

Generating Question Relevant Captions to Aid Visual Question Answering

Arxiv

5+阅读 · 2019年9月9日

OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge

OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge

Arxiv

10+阅读 · 2019年9月4日

Visual Question Answering using Deep Learning: A Survey and Performance Analysis

Arxiv

4+阅读 · 2019年8月27日

Visual Question Answering as Reading Comprehension

Arxiv

3+阅读 · 2018年11月29日

Adversarial TableQA: Attention Supervision for Question Answering on Tables

Arxiv

4+阅读 · 2018年10月18日

Robustness Analysis of Visual QA Models by Basic Questions

Arxiv

4+阅读 · 2018年5月26日

Differential Attention for Visual Question Answering

Arxiv

5+阅读 · 2018年4月3日

iVQA: Inverse Visual Question Answering

Arxiv

5+阅读 · 2018年3月16日

Interpretable Counting for Visual Question Answering

Arxiv

3+阅读 · 2017年12月23日

VQA: Visual Question Answering

Arxiv

9+阅读 · 2016年10月27日

VIP会员

文章信息

相关主题

state-of-the-art

相关VIP内容

复杂的序列数据分析：现有算法的系统文献综述，Complex Sequential Data Analysis: A Systematic Literature Review of Existing Algorithms

复杂的序列数据分析：现有算法的系统文献综述，Complex Sequential Data Analysis: A Systematic Literature Review of Existing Algorithms

专知会员服务

27+阅读 · 2020年7月24日

【CVPR2020】通过自适应GANs生成不同的图像，Diverse Image Generation via Self-Conditioned GANs

【CVPR2020】通过自适应GANs生成不同的图像，Diverse Image Generation via Self-Conditioned GANs

专知会员服务

34+阅读 · 2020年6月19日

【CVPR2020】自监督的深度视觉测程与在线适应，Self-Supervised Deep Visual Odometry

【CVPR2020】自监督的深度视觉测程与在线适应，Self-Supervised Deep Visual Odometry

专知会员服务

32+阅读 · 2020年5月14日

【DeepMind】基于变换的大规模数据对抗视频预测，Transformation-based Adversarial Video Prediction on Large-Scale Data

【DeepMind】基于变换的大规模数据对抗视频预测，Transformation-based Adversarial Video Prediction on Large-Scale Data

专知会员服务

17+阅读 · 2020年3月9日

【自监督学习深度神经网络视觉特征学习综述论文】Self-supervised Visual Feature Learning with Deep Neural Networks: A Survey

【自监督学习深度神经网络视觉特征学习综述论文】Self-supervised Visual Feature Learning with Deep Neural Networks: A Survey

专知会员服务

87+阅读 · 2020年3月1日

【Facebook AI】对抗性NLI:自然语言理解的新基准，Adversarial NLI: A New Benchmark for Natural Language Understanding

【Facebook AI】对抗性NLI:自然语言理解的新基准，Adversarial NLI: A New Benchmark for Natural Language Understanding

专知会员服务

11+阅读 · 2019年11月2日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

操作系统智能体：基于多模态大模型（MLLM）的通用计算设备智能体综述

《美国太空军系统全生命周期建模、仿真与分析效能提升方案》最新84页报告

【博士论文】推进数据高效的深度学习：非参数 Transformer、主动测试与上下文学习

自主人工智能：未来战争是否将是自主化的？

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新六篇视觉问答相关论文—鲁棒性分析、虚拟意象、双曲注意力网络、R-VQA、关系推理、双线性注意力网络

【论文推荐】最新六篇视觉问答相关论文—鲁棒性分析、虚拟意象、双曲注意力网络、R-VQA、关系推理、双线性注意力网络

专知

17+阅读 · 2018年6月7日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

ICCV17 :12为顶级大牛教你学生成对抗网络（GAN)！

ICCV17 :12为顶级大牛教你学生成对抗网络（GAN)！

全球人工智能

8+阅读 · 2017年11月26日

【推荐】自然语言处理（NLP）指南

【推荐】自然语言处理（NLP）指南

机器学习研究会

35+阅读 · 2017年11月17日

【论文】图上的表示学习综述

【论文】图上的表示学习综述

机器学习研究会

15+阅读 · 2017年9月24日

【推荐】视频目标分割基础

【推荐】视频目标分割基础

机器学习研究会

9+阅读 · 2017年9月19日

【推荐】GAN架构入门综述(资源汇总)

【推荐】GAN架构入门综述(资源汇总)

机器学习研究会

10+阅读 · 2017年9月3日

相关论文

Generating Question Relevant Captions to Aid Visual Question Answering

Generating Question Relevant Captions to Aid Visual Question Answering

Arxiv

5+阅读 · 2019年9月9日

OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge

OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge

Arxiv

10+阅读 · 2019年9月4日

Visual Question Answering using Deep Learning: A Survey and Performance Analysis

Arxiv

4+阅读 · 2019年8月27日

Visual Question Answering as Reading Comprehension

Arxiv

3+阅读 · 2018年11月29日

Adversarial TableQA: Attention Supervision for Question Answering on Tables

Arxiv

4+阅读 · 2018年10月18日

Robustness Analysis of Visual QA Models by Basic Questions

Arxiv

4+阅读 · 2018年5月26日

Differential Attention for Visual Question Answering

Arxiv

5+阅读 · 2018年4月3日

iVQA: Inverse Visual Question Answering

Arxiv

5+阅读 · 2018年3月16日

Interpretable Counting for Visual Question Answering

Arxiv

3+阅读 · 2017年12月23日

VQA: Visual Question Answering

Arxiv

9+阅读 · 2016年10月27日

微信扫码咨询专知VIP会员