The design and the control of a robot play equally important roles in its task performance. However, while optimal control is well studied in the machine learning and robotics communities, less attention has been paid to finding the optimal robot design. This is mainly because co-optimizing design and control in robotics is a challenging problem and, more importantly, because a comprehensive evaluation benchmark for co-optimization does not exist. In this paper, we propose Evolution Gym, the first large-scale benchmark for co-optimizing the design and control of soft robots. In our benchmark, each robot is composed of different types of voxels (e.g., soft, rigid, actuators), resulting in a modular and expressive robot design space. Our benchmark environments span a wide range of tasks, including locomotion on various types of terrain and manipulation. Furthermore, we develop several robot co-evolution algorithms by combining state-of-the-art design optimization methods and deep reinforcement learning techniques. Evaluating the algorithms on our benchmark platform, we observe robots exhibiting increasingly complex behaviors as evolution progresses, with the best evolved designs solving many of our proposed tasks. Additionally, even though robot designs are evolved autonomously from scratch without prior knowledge, they often grow to resemble existing natural creatures while outperforming hand-designed robots. Nevertheless, all tested algorithms fail to find robots that succeed in our hardest environments. This suggests that more advanced algorithms are required to explore the high-dimensional design space and evolve increasingly intelligent robots, an area of research in which we hope Evolution Gym will accelerate progress. Our website with code, environments, documentation, and tutorials is available at http://evogym.csail.mit.edu.
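To make the co-optimization setting concrete, the following is a minimal sketch of a voxel-grid design space with an outer evolutionary loop over designs. It is an illustration under stated assumptions, not the Evolution Gym API or any of the algorithms evaluated in the paper: the voxel type codes, the function names (`random_design`, `mutate`, `is_valid`, `placeholder_reward`, `co_optimize`), and the mutation rate are all hypothetical. In particular, `placeholder_reward` merely counts actuator voxels and stands in for the inner loop, where a controller would be trained with deep reinforcement learning on each candidate design.

```python
import random

# Hypothetical voxel type codes, chosen for illustration only.
EMPTY, RIGID, SOFT, ACTUATOR = 0, 1, 2, 3
VOXEL_TYPES = (EMPTY, RIGID, SOFT, ACTUATOR)


def random_design(rows, cols, rng):
    """Sample a grid of voxel types; the grid is the robot's design."""
    return [[rng.choice(VOXEL_TYPES) for _ in range(cols)] for _ in range(rows)]


def mutate(design, rng, rate=0.1):
    """Copy the design, resampling each voxel with probability `rate`."""
    return [[rng.choice(VOXEL_TYPES) if rng.random() < rate else v for v in row]
            for row in design]


def is_valid(design):
    """Simplified check: at least one actuator voxel must be present.
    (A realistic check would also require the body to be connected.)"""
    return any(v == ACTUATOR for row in design for v in row)


def placeholder_reward(design):
    """Stand-in for the inner loop. In a real setup, a controller would be
    trained on this design and its return used as the design's fitness.
    Here we only count actuator voxels so the sketch runs on its own."""
    return sum(v == ACTUATOR for row in design for v in row)


def co_optimize(generations=20, population_size=8, rows=5, cols=5, seed=0):
    """Outer loop: keep the better half of the designs, refill by mutation."""
    rng = random.Random(seed)
    population = []
    while len(population) < population_size:
        candidate = random_design(rows, cols, rng)
        if is_valid(candidate):
            population.append(candidate)

    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        ranked = sorted(population, key=placeholder_reward, reverse=True)
        history.append(placeholder_reward(ranked[0]))
        survivors = ranked[: population_size // 2]
        children = []
        while len(survivors) + len(children) < population_size:
            child = mutate(rng.choice(survivors), rng)
            if is_valid(child):
                children.append(child)
        population = survivors + children

    return max(population, key=placeholder_reward), history


if __name__ == "__main__":
    best, history = co_optimize()
    for row in best:
        print(row)
    print("best fitness per generation:", history)
```

Because the surviving half is carried over unchanged, the best fitness never decreases from one generation to the next. Replacing `placeholder_reward` with an actual train-and-evaluate step is what makes the problem expensive: every candidate design then requires its own controller training run.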