学习在多式联运环境中的物理行动的影响 (Learning the Effects of Physical Actions in a Multi-modal Environment) - 专知论文

会员服务 ·

0

回合 · MoDELS · Better · Learning · INFORMS ·

2023 年 1 月 27 日

Learning the Effects of Physical Actions in a Multi-modal Environment

翻译：学习在多式联运环境中的物理行动的影响

Gautier Dagan,Frank Keller,Alex Lascarides

Large Language Models (LLMs) handle physical commonsense information inadequately. As a result of being trained in a disembodied setting, LLMs often fail to predict an action's outcome in a given environment. However, predicting the effects of an action before it is executed is crucial in planning, where coherent sequences of actions are often needed to achieve a goal. Therefore, we introduce the multi-modal task of predicting the outcomes of actions solely from realistic sensory inputs (images and text). Next, we extend an LLM to model latent representations of objects to better predict action outcomes in an environment. We show that multi-modal models can capture physical commonsense when augmented with visual information. Finally, we evaluate our model's performance on novel actions and objects and find that combining modalities help models to generalize and learn physical commonsense reasoning better.

翻译：大型语言模型(LLMS)处理的有形常识信息不够充分。由于在空洞的环境中接受培训,LLMS往往无法预测特定环境中的行动结果。然而,预测行动执行前的行动效果对于规划来说至关重要,在规划中,往往需要一致的行动顺序来实现目标。因此,我们引入了仅从现实的感官投入(图像和文字)预测行动结果的多模式任务。接下来,我们将LMM扩大到对物体的模拟潜在表现,以便更好地预测环境中的行动结果。我们显示,多模式模型在增加视觉信息时,可以捕捉到物理常识。最后,我们评估了我们模型在新行动和物体上的绩效,并发现将模式结合起来有助于将物理常识推理归纳和学习得更好。

0

相关内容

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

多孔配位聚合物的吸附机理研究

国家自然科学基金

0+阅读 · 2014年12月31日

面向大数据的信息可视化设计方法研究

国家自然科学基金

6+阅读 · 2014年12月31日

PTEN、SHIP和CTMP对糖尿病肾病的作用及机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

运动和高温调控胰岛素抵抗大鼠Irisin代谢通路的研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于混合DNA策略的肺癌相关拷贝数变异的全基因组关联研究

国家自然科学基金

0+阅读 · 2011年12月31日

Interpretability from a new lens: Integrating Stratification and Domain knowledge for Biomedical Applications

Arxiv

0+阅读 · 2023年3月15日

Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning

Arxiv

11+阅读 · 2023年3月10日

Controllable Data Generation by Deep Learning: A Review

Arxiv

15+阅读 · 2022年7月19日

A Survey of Deep Reinforcement Learning in Recommender Systems: A Systematic Review and Future Directions

Arxiv

15+阅读 · 2021年9月8日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

VIP会员

文章信息

相关主题

相关VIP内容

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

智能体工程（Agent Engineering）

《全球地缘政治环境中的反无人机系统互操作性》252页

专业软件开发者不靠“氛围编程”（Vibe Coding），而靠“控制”：2025 年 AI Agent 在编程中的应用研究

基于大语言模型的智能体化软件问题解决：综述

相关资讯

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

Interpretability from a new lens: Integrating Stratification and Domain knowledge for Biomedical Applications

Arxiv

0+阅读 · 2023年3月15日

Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning

Arxiv

11+阅读 · 2023年3月10日

Controllable Data Generation by Deep Learning: A Review

Arxiv

15+阅读 · 2022年7月19日

A Survey of Deep Reinforcement Learning in Recommender Systems: A Systematic Review and Future Directions

Arxiv

15+阅读 · 2021年9月8日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

相关基金

多孔配位聚合物的吸附机理研究

国家自然科学基金

0+阅读 · 2014年12月31日

面向大数据的信息可视化设计方法研究

国家自然科学基金

6+阅读 · 2014年12月31日

PTEN、SHIP和CTMP对糖尿病肾病的作用及机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

运动和高温调控胰岛素抵抗大鼠Irisin代谢通路的研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于混合DNA策略的肺癌相关拷贝数变异的全基因组关联研究

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员