Improving open-ended learning capabilities is a promising approach to enable robots to face the unbounded complexity of the real world. Among existing methods, the ability of Quality-Diversity algorithms to generate large collections of diverse and high-performing skills is instrumental in this context. However, most of these algorithms rely on a hand-coded behavioural descriptor to characterise diversity, and hence require prior knowledge about the considered tasks. In this work, we propose an additional analysis of Autonomous Robots Realising their Abilities, a Quality-Diversity algorithm that autonomously finds behavioural characterisations. We evaluate this approach in a simulated robotic environment, where the robot has to autonomously discover its abilities from its full-state trajectories. All algorithms were applied to three tasks: navigation, moving forward at high velocity, and performing half-rolls. The experimental results show that the algorithm under study autonomously discovers collections of solutions that are diverse with respect to all three tasks. More specifically, the analysed approach autonomously finds policies that make the robot not only move to diverse positions, but also use its legs in diverse ways and even perform half-rolls.