C^2: Co-design of Robots via Concurrent Networks Coupling Online and Offline Reinforcement Learning

With the rise of computing power, data-driven approaches have become a feasible way to co-design a robot's morphology and controller. Nevertheless, evaluating the fitness of the controller under each candidate morphology is time-consuming. As a pioneering data-driven method, Co-adaptation uses a double-network mechanism to learn a Q function conditioned on morphology parameters; this learned Q function replaces the traditional evaluation of a diverse set of candidates and thereby speeds up optimization. In this paper, we find that Co-adaptation ignores two issues that hurt performance: exploration error during training, and state-action distribution shift when parameters are transferred between networks. We propose a concurrent network framework that couples online and offline RL methods. By applying a behavior cloning term flexibly, we mitigate the impact of both issues on the results. Simulation and physical experiments demonstrate that the proposed method outperforms baseline algorithms, which indicates that it is an effective way of discovering the optimal combination of morphology and controller.
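To make the role of a behavior cloning term concrete, the following is a minimal sketch of one common way such a term is attached to an actor objective in offline RL: the actor maximizes a morphology-conditioned Q value while being penalized for deviating from actions stored in the data. This is an illustration under stated assumptions, not the paper's actual loss; the names `bc_regularized_actor_loss`, `bc_weight`, `toy_policy`, and `toy_q` are all hypothetical, and the abstract does not specify how the term is weighted or scheduled.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def bc_regularized_actor_loss(
    q_fn: Callable[[Vector, Vector, Vector], float],
    policy_fn: Callable[[Vector, Vector], Vector],
    states: Sequence[Vector],
    dataset_actions: Sequence[Vector],
    morphology: Vector,
    bc_weight: float,
) -> float:
    """Batch mean of -Q(s, pi(s, xi), xi) + bc_weight * ||pi(s, xi) - a||^2.

    Assumed form, for illustration only: `xi` is the morphology parameter
    vector and `a` is the action recorded in the dataset for state `s`.
    """
    if len(states) != len(dataset_actions):
        raise ValueError("states and dataset_actions must have equal length")
    if not states:
        raise ValueError("batch must not be empty")

    total = 0.0
    for state, data_action in zip(states, dataset_actions):
        policy_action = policy_fn(state, morphology)
        # Value term: the actor prefers actions the critic rates highly.
        q_value = q_fn(state, policy_action, morphology)
        # Behavior cloning term: squared distance to the dataset action.
        bc_penalty = sum(
            (p - d) ** 2 for p, d in zip(policy_action, data_action)
        )
        total += -q_value + bc_weight * bc_penalty
    return total / len(states)


# Hypothetical toy stand-ins for the learned networks.
def toy_policy(state: Vector, morphology: Vector) -> Vector:
    # Scales the first state feature by the first morphology parameter.
    return [state[0] * morphology[0]]


def toy_q(state: Vector, action: Vector, morphology: Vector) -> float:
    # Rates an action by the sum of its components.
    return sum(action)


states = [[1.0], [2.0]]
dataset_actions = [[0.5], [0.0]]
morphology = [0.5]

online_style = bc_regularized_actor_loss(
    toy_q, toy_policy, states, dataset_actions, morphology, bc_weight=0.0
)
offline_style = bc_regularized_actor_loss(
    toy_q, toy_policy, states, dataset_actions, morphology, bc_weight=2.0
)
```

With `bc_weight` set to zero the objective reduces to a purely value-driven update, and a larger weight keeps the policy closer to the data. Varying this weight is one plausible reading of using the term "flexibly", though that interpretation is an assumption here.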

