Co-Imitation: Learning Design and Behaviour by Imitation - 专知 (Zhuanzhi) Papers

7 February 2023

Co-Imitation: Learning Design and Behaviour by Imitation

Chang Rajani, Karol Arndt, David Blanco-Mulero, Kevin Sebastian Luck, Ville Kyrki

From arXiv; 14 pages, 11 figures; accepted for AAAI-23

The co-adaptation of robots has been a long-standing research endeavour with the goal of adapting both body and behaviour of a system for a given task, inspired by the natural evolution of animals. Co-adaptation has the potential to eliminate costly manual hardware engineering as well as improve the performance of systems. The standard approach to co-adaptation is to use a reward function for optimizing behaviour and morphology. However, defining and constructing such reward functions is notoriously difficult and often a significant engineering effort. This paper introduces a new viewpoint on the co-adaptation problem, which we call co-imitation: finding a morphology and a policy that allow an imitator to closely match the behaviour of a demonstrator. To this end we propose a co-imitation methodology for adapting behaviour and morphology by matching state distributions of the demonstrator. Specifically, we focus on the challenging scenario with mismatched state- and action-spaces between both agents. We find that co-imitation increases behaviour similarity across a variety of tasks and settings, and demonstrate co-imitation by transferring human walking, jogging and kicking skills onto a simulated humanoid.
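To make the idea of matching state distributions concrete, the following toy sketch jointly searches over one morphology parameter and one behaviour parameter so that the states visited by an imitator resemble those of a demonstrator. It is not the authors' method, which the abstract does not specify: the `rollout` model, the parameter names `leg_length` and `swing`, the grid search, and the 1-D Wasserstein distance are all assumptions chosen for illustration, and the example sidesteps mismatched state spaces by comparing a single shared feature (foot height).

```python
import numpy as np


def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two equal-sized 1-D empirical samples."""
    a, b = np.sort(a), np.sort(b)
    return float(np.mean(np.abs(a - b)))


def rollout(leg_length, swing, n=200):
    """Hypothetical agent: foot heights visited over one gait cycle.

    leg_length is the morphology parameter, swing (max hip angle, rad)
    stands in for the behaviour/policy parameter.
    """
    t = np.linspace(0.0, 2.0 * np.pi, n)
    return leg_length * (1.0 - np.cos(swing * np.sin(t)))


def co_imitate(demo, leg_grid, swing_grid):
    """Grid-search morphology and behaviour jointly; return (distance, leg, swing)."""
    best = None
    for leg in leg_grid:
        for swing in swing_grid:
            d = wasserstein_1d(rollout(leg, swing, n=len(demo)), demo)
            if best is None or d < best[0]:
                best = (d, float(leg), float(swing))
    return best


# Demonstrator states, generated here from the same toy model (an assumption).
demo = rollout(leg_length=0.9, swing=0.6)
best = co_imitate(demo, np.linspace(0.5, 1.2, 8), np.linspace(0.2, 1.0, 9))
print(best)
```

Because the demonstrator here is drawn from the same toy model, the search can recover its parameters exactly; with real demonstrations the minimum distance would generally be non-zero, and a gradient-free or learned optimiser would replace the grid.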
