VIPHHY: 检验“可视”物理常识知识 (VIPHY: Probing "Visible" Physical Commonsense Knowledge) - 专知论文

会员服务 ·

0

知识 (knowledge) · Performer · MoDELS · INFORMS · HTTPS ·

2022 年 9 月 15 日

VIPHY: Probing "Visible" Physical Commonsense Knowledge

翻译：VIPHHY: 检验“可视”物理常识知识

Shikhar Singh,Ehsan Qasemi,Muhao Chen

from arxiv, In Progress (under review)

In recent years, vision-language models (VLMs) have shown remarkable performance on visual reasoning tasks (e.g. attributes, location). While such tasks measure the requisite knowledge to ground and reason over a given visual instance, they do not, however, measure the ability of VLMs to retain and generalize such knowledge. In this work, we evaluate their ability to acquire "visible" physical knowledge -- the information that is easily accessible from images of static scenes, particularly across the dimensions of object color, size and space. We build an automatic pipeline to derive a comprehensive knowledge resource for calibrating and probing these models. Our results indicate a severe gap between model and human performance across all three tasks. Furthermore, our caption pretrained baseline (CapBERT) significantly outperforms VLMs on both size and spatial tasks -- highlighting that despite sufficient access to ground language with visual modality, they struggle to retain such knowledge. The dataset and code are available at https://github.com/Axe--/ViPhy .

翻译：近年来,视觉语言模型(VLMS)在视觉推理任务(例如属性、位置)上表现显著,虽然这些任务测量了在特定视觉实例中地面和理性所需的知识,但是它们并没有测量VLMs保留和普及这种知识的能力。在这项工作中,我们评估了它们获得“可见”物理知识的能力,这种知识很容易从静态场景图像中获取,特别是从物体的颜色、大小和空间等方方面面获得。我们建立了一条自动管道,以获得用于校准和检验这些模型的全面知识资源。我们的结果显示,模型和人类业绩在所有三项任务中都存在严重差距。此外,我们的字幕预设基线(CAPBERT)大大超越了VLMs在大小和空间两方面的任务。我们强调,尽管能够以视觉方式充分接触地面语言,但它们很难保留这种知识。数据集和代码可在https://github.com/Axe-/ViPhy查阅。

0

相关内容

知识 (knowledge)

知识 (knowledge)

通过学习、实践或探索所获得的认识、判断或技能。

多模态认知计算

多模态认知计算

专知会员服务

180+阅读 · 2022年9月16日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

【因果基础】Causality Basics，36页ppt

专知会员服务

52+阅读 · 2021年8月8日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

let-7g介导的ceRNA调控网络对奥氮平/氯氮平相关代谢综合征的作用机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

分子伴侣Calnexin/Calreticulin和Erp57在流感病毒HA蛋白成熟过程中的作用研究

国家自然科学基金

0+阅读 · 2015年12月31日

miR-124靶向TRAF6在骨肉瘤中的作用

国家自然科学基金

0+阅读 · 2013年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

结合多模态影像技术观察VDAC1介导氢气对CA-ROSC兔脑能量代谢的作用及机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

BRCA1蛋白出核的分子机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

c-Abl基因缺失与PrPSc诱导神经元细胞氧化应激机制的研究

国家自然科学基金

0+阅读 · 2012年12月31日

PSCA对前列腺癌细胞自分泌IL-6的调控作用及其机制

国家自然科学基金

0+阅读 · 2012年12月31日

能量临界情形的非线性Schrodinger方程

国家自然科学基金

0+阅读 · 2011年12月31日

SARI基因在肺癌侵袭转移中的作用及分子机制

国家自然科学基金

0+阅读 · 2009年12月31日

Advanced Semantics for Commonsense Knowledge Extraction

Arxiv

0+阅读 · 2022年10月25日

VLC-BERT: Visual Question Answering with Contextualized Commonsense Knowledge

Arxiv

0+阅读 · 2022年10月24日

FCM: Forgetful Causal Masking Makes Causal Language Models Better Zero-Shot Learners

Arxiv

0+阅读 · 2022年10月24日

Causal Inference Principles for Reasoning about Commonsense Causality

Arxiv

13+阅读 · 2022年1月31日

Improving Event Causality Identification via Self-Supervised Representation Learning on External Causal Statement

Arxiv

15+阅读 · 2021年6月3日

CSKG: The CommonSense Knowledge Graph

CSKG: The CommonSense Knowledge Graph

Arxiv

18+阅读 · 2020年12月21日

The Causal Learning of Retail Delinquency

Arxiv

14+阅读 · 2020年12月17日

Spatio-Temporal Graph for Video Captioning with Knowledge Distillation

Spatio-Temporal Graph for Video Captioning with Knowledge Distillation

Arxiv

19+阅读 · 2020年3月31日

Commonsense Reasoning for Natural Language Understanding: A Survey of Benchmarks, Resources, and Approaches

Arxiv

16+阅读 · 2019年4月2日

Transferring Common-Sense Knowledge for Object Detection

Arxiv

12+阅读 · 2018年4月3日

VIP会员

文章信息

相关主题

知识 (knowledge)

相关VIP内容

多模态认知计算

多模态认知计算

专知会员服务

180+阅读 · 2022年9月16日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

【因果基础】Causality Basics，36页ppt

专知会员服务

52+阅读 · 2021年8月8日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《美国海军陆战队软件定义网络应用案例：分布式防火墙自动化系统》148页

《多体环境下定位导航授时（PNT）系统研究》228页

软件定义无线电（SDR）：商业与军事领域的技术、应用及未来趋势

《攻势防空作战中无人追击者/规避者最优轨迹研究（含动态交战区建模）》95页

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

Advanced Semantics for Commonsense Knowledge Extraction

Arxiv

0+阅读 · 2022年10月25日

VLC-BERT: Visual Question Answering with Contextualized Commonsense Knowledge

Arxiv

0+阅读 · 2022年10月24日

FCM: Forgetful Causal Masking Makes Causal Language Models Better Zero-Shot Learners

Arxiv

0+阅读 · 2022年10月24日

Causal Inference Principles for Reasoning about Commonsense Causality

Arxiv

13+阅读 · 2022年1月31日

Improving Event Causality Identification via Self-Supervised Representation Learning on External Causal Statement

Arxiv

15+阅读 · 2021年6月3日

CSKG: The CommonSense Knowledge Graph

CSKG: The CommonSense Knowledge Graph

Arxiv

18+阅读 · 2020年12月21日

The Causal Learning of Retail Delinquency

Arxiv

14+阅读 · 2020年12月17日

Spatio-Temporal Graph for Video Captioning with Knowledge Distillation

Spatio-Temporal Graph for Video Captioning with Knowledge Distillation

Arxiv

19+阅读 · 2020年3月31日

Commonsense Reasoning for Natural Language Understanding: A Survey of Benchmarks, Resources, and Approaches

Arxiv

16+阅读 · 2019年4月2日

Transferring Common-Sense Knowledge for Object Detection

Arxiv

12+阅读 · 2018年4月3日

相关基金

let-7g介导的ceRNA调控网络对奥氮平/氯氮平相关代谢综合征的作用机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

分子伴侣Calnexin/Calreticulin和Erp57在流感病毒HA蛋白成熟过程中的作用研究

国家自然科学基金

0+阅读 · 2015年12月31日

miR-124靶向TRAF6在骨肉瘤中的作用

国家自然科学基金

0+阅读 · 2013年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

结合多模态影像技术观察VDAC1介导氢气对CA-ROSC兔脑能量代谢的作用及机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

BRCA1蛋白出核的分子机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

c-Abl基因缺失与PrPSc诱导神经元细胞氧化应激机制的研究

国家自然科学基金

0+阅读 · 2012年12月31日

PSCA对前列腺癌细胞自分泌IL-6的调控作用及其机制

国家自然科学基金

0+阅读 · 2012年12月31日

能量临界情形的非线性Schrodinger方程

国家自然科学基金

0+阅读 · 2011年12月31日

SARI基因在肺癌侵袭转移中的作用及分子机制

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员