We introduce a new benchmarking suite for high-dimensional control that tests high spatial and temporal precision, coordination, and planning, all with an underactuated system that frequently makes and breaks contact. The proposed challenge is mastering the piano through bimanual dexterity, using a pair of simulated anthropomorphic robot hands. We call it RoboPianist, and the initial version covers a broad set of 150 songs of varying difficulty. We investigate both model-free and model-based methods on the benchmark, characterizing their performance envelopes. We observe that while some existing methods, when well tuned, achieve impressive performance in certain aspects, there is significant room for improvement. RoboPianist provides a rich quantitative benchmarking environment with human-interpretable results, easy expansion by simply augmenting the repertoire with new songs, and opportunities for further research, including multi-task learning, zero-shot generalization, multimodal (sound, vision, touch) learning, and imitation. Supplementary information, including videos of our control policies, can be found at https://kzakka.com/robopianist/