Humans are well versed in reasoning about the behaviors of physical objects and in choosing actions accordingly to accomplish tasks, while this remains a major challenge for AI. To facilitate research addressing this problem, we propose a new testbed that requires an agent to reason about physical scenarios and take appropriate actions. Inspired by the physical knowledge acquired in infancy and the capabilities required for robots to operate in real-world environments, we identify 15 essential physical scenarios. For each scenario, we create a wide variety of distinct task templates, and we ensure that all task templates within the same scenario can be solved using one specific strategic physical rule. This design allows us to evaluate two distinct levels of generalization: local generalization and broad generalization. We conduct an extensive evaluation with human players, learning agents with varying input types and architectures, and heuristic agents with different strategies. Inspired by how human IQ is calculated, we define the physical reasoning quotient (Phy-Q score), which reflects the physical reasoning intelligence of an agent. Our evaluation shows that 1) all agents are far below human performance, and 2) learning agents, even with good local generalization ability, struggle to learn the underlying physical reasoning rules and fail to generalize broadly. We encourage the development of intelligent agents that can reach a human-level Phy-Q score. Website: https://github.com/phy-q/benchmark