Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning - Zhuanzhi Papers

Visual question answering · disentangled · domain shift · robustness · generalization theory

December 1, 2022

Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning

Zhuowan Li,Xingrui Wang,Elias Stengel-Eskin,Adam Kortylewski,Wufei Ma,Benjamin Van Durme,Alan Yuille

Source: arXiv. The dataset and code are released at https://github.com/Lizw14/Super-CLEVR

Visual Question Answering (VQA) models often perform poorly on out-of-distribution data and struggle on domain generalization. Due to the multi-modal nature of this task, multiple factors of variation are intertwined, making generalization difficult to analyze. This motivates us to introduce a virtual benchmark, Super-CLEVR, where different factors in VQA domain shifts can be isolated in order that their effects can be studied independently. Four factors are considered: visual complexity, question redundancy, concept distribution and concept compositionality. With controllably generated data, Super-CLEVR enables us to test VQA methods in situations where the test data differs from the training data along each of these axes. We study four existing methods, including two neural symbolic methods NSCL and NSVQA, and two non-symbolic methods FiLM and mDETR; and our proposed method, probabilistic NSVQA (P-NSVQA), which extends NSVQA with uncertainty reasoning. P-NSVQA outperforms other methods on three of the four domain shift factors. Our results suggest that disentangling reasoning and perception, combined with probabilistic uncertainty, form a strong VQA model that is more robust to domain shifts. The dataset and code are released at https://github.com/Lizw14/Super-CLEVR.
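The abstract says P-NSVQA extends NSVQA with uncertainty reasoning. As a rough illustration of that general idea only, the sketch below compares a symbolic "filter by colour, then count" step run on hard (argmax) predictions with the same step run on soft probabilities. It is not the authors' implementation: the scene, the probabilities, and the function names `count_hard` and `count_soft` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical perception output: one colour distribution per detected object.
# The numbers are made up for illustration and do not come from the paper.
Scene = List[Dict[str, float]]

scene: Scene = [
    {"red": 0.90, "blue": 0.10},
    {"red": 0.40, "blue": 0.60},
    {"red": 0.45, "blue": 0.55},
]


def count_hard(scene: Scene, color: str) -> int:
    """Deterministic execution: count objects whose most likely colour is `color`."""
    return sum(1 for obj in scene if max(obj, key=obj.get) == color)


def count_soft(scene: Scene, color: str) -> int:
    """Uncertainty-aware execution: sum per-object probabilities (expected count), then round."""
    expected = sum(obj.get(color, 0.0) for obj in scene)
    return round(expected)


print(count_hard(scene, "red"))
print(count_soft(scene, "red"))
```

With these made-up probabilities the hard count is 1 while the soft count is 2, because two objects that are each slightly more likely to be blue still contribute substantial probability mass to "red". Whether the soft answer is better depends on how well calibrated the perception model is.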

Related content

Visual Question Answering

Visual Question Answering (VQA) is a learning task that involves both computer vision and natural language processing. The task is defined as follows: A VQA system takes as input an image and a free-form, open-ended, natural-language question about the image and produces a natural-language answer as the output [1]. Put simply, VQA means answering questions about a given image.

IJCAI 2022 is underway! Tutorial on Domain Generalization from Microsoft and others, covering the latest progress in DG, with slides and video

Zhuanzhi Member Services

60+ reads · July 24, 2022

Not to be missed! "100 Lectures on Machine Learning" course, taught by Mark Schmidt of UBC

Zhuanzhi Member Services

75+ reads · June 28, 2022

Computer Science courses with video lectures

Zhuanzhi Member Services

37+ reads · January 24, 2022

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Zhuanzhi Member Services

59+ reads · October 17, 2019

Latest reinforcement learning tutorial, 17-page PDF

Zhuanzhi Member Services

182+ reads · October 11, 2019

[Survey] Scene text detection and recognition with deep learning

Zhuanzhi Member Services

78+ reads · October 10, 2019

[AI in 2019: A Year in Review] Anti-AI

Zhuanzhi Member Services

79+ reads · October 10, 2019

[Carnegie Mellon University] Deep learning in computer vision: methods, interpretation, causality, and fairness

Zhuanzhi Member Services

83+ reads · October 9, 2019

[UC Berkeley PhD thesis] Learning to generalize via self-supervised prediction

Zhuanzhi Member Services

65+ reads · October 9, 2019

[SIGGRAPH 2019] Deep learning for computer graphics with TensorFlow 2.0

Zhuanzhi Member Services

41+ reads · October 9, 2019

AIART 2022 Call for Papers

CCF Technical Committee on Multimedia

1+ reads · February 13, 2022

Hierarchically Structured Meta-learning

CreateAMind

27+ reads · May 22, 2019

Transferring Knowledge across Learning Processes

CreateAMind

29+ reads · May 18, 2019

Unsupervised Learning via Meta-Learning

CreateAMind

43+ reads · January 3, 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+ reads · December 24, 2018

disentangled-representation-papers

CreateAMind

26+ reads · September 12, 2018

[Paper recommendations] Seven recent image segmentation papers: domain-adaptive deep representation learning, recurrent residual convolution, binary segmentation, image synthesis, unsupervised cross-modality

Zhuanzhi

19+ reads · June 1, 2018

[Paper] A summary of variational inference

Machine Learning Research Society

39+ reads · November 16, 2017

[Recommended] An introductory survey of GAN architectures (resource roundup)

Machine Learning Research Society

10+ reads · September 3, 2017

[Recommended] SVM tutorial with examples

Machine Learning Research Society

17+ reads · August 26, 2017

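The Calderón problem and the boundary rigidity problem

National Natural Science Foundation of China

0+ reads · December 31, 2013

Face simulation coupling a dynamic model of facial anatomy with multimodal spatiotemporal data

National Natural Science Foundation of China

0+ reads · December 31, 2013

Research on blind image deconvolution algorithms based on the SURE/PURE criteria

National Natural Science Foundation of China

3+ reads · December 31, 2013

Kronheimer-Nakajima quiver moduli spaces and rational surfaces

National Natural Science Foundation of China

1+ reads · December 31, 2013

Harmonic analysis and its applications

National Natural Science Foundation of China

0+ reads · December 31, 2013

Molecular mechanism of ciliary protein transport by intraflagellar transport

National Natural Science Foundation of China

0+ reads · December 31, 2012

Functional study of the 15-kDa selenoprotein in endoplasmic reticulum stress (ERS) and Alzheimer's disease (AD)

National Natural Science Foundation of China

0+ reads · December 31, 2012

Control of mesoscopic supramolecular structure in perylene diimide semiconductor materials and its application in photovoltaic devices

National Natural Science Foundation of China

0+ reads · December 31, 2012

The role of endoplasmic reticulum stress in retinitis pigmentosa

National Natural Science Foundation of China

0+ reads · December 31, 2011

Study of the relationship between oxidative stress mediated by plasma-membrane and mitochondrial Cx43 and cardiac ischemic postconditioning

National Natural Science Foundation of China

0+ reads · December 31, 2009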

CFFT-GAN: Cross-domain Feature Fusion Transformer for Exemplar-based Image Translation

arXiv

0+ reads · February 3, 2023

Class Overwhelms: Mutual Conditional Blended-Target Domain Adaptation

arXiv

0+ reads · February 3, 2023

Dreamix: Video Diffusion Models are General Video Editors

arXiv

0+ reads · February 2, 2023

Encouraging Intra-Class Diversity Through a Reverse Contrastive Loss for Better Single-Source Domain Generalization

arXiv

0+ reads · February 2, 2023

VQA and Visual Reasoning: An Overview of Recent Datasets, Methods and Challenges

arXiv

11+ reads · December 26, 2022

VideoDG: Generalizing Temporal Relations in Videos to Novel Domains

arXiv

14+ reads · September 17, 2021

Domain Generalization in Vision: A Survey

arXiv

16+ reads · July 18, 2021

Unsupervised Multi-Source Domain Adaptation for Person Re-Identification

arXiv

14+ reads · April 27, 2021

Adaptive Methods for Real-World Domain Generalization

arXiv

13+ reads · March 29, 2021

Diverse Image-to-Image Translation via Disentangled Representations

arXiv

13+ reads · August 2, 2018
