The virtuoso plays the piano with passion, poetry, and extraordinary technical ability. As Liszt said, a virtuoso "must call up scent and blossom, and breathe the breath of life." The strongest piano-playing robots to date rely on a combination of specialized robot hands and pianos and hardcoded planning algorithms. In contrast, in this paper we demonstrate how an agent can learn, directly from a machine-readable music score, to play a simulated piano with dexterous hands using reinforcement learning (RL) from scratch. We demonstrate that RL agents can not only find the correct key positions but also handle various rhythmic, volume, and fingering requirements. We achieve this by using a touch-augmented reward and a novel curriculum of tasks. We conclude by carefully studying the aspects that are important for enabling such learning algorithms, which can potentially shed light on future research in this direction.