Assistive robotics is a branch of robotics concerned with aiding humans in daily care tasks that they may be unable to perform because of disability or age. While research has demonstrated that classical control methods can be used to design policies that complete these tasks, such methods can be difficult to generalize across the many variations of a task. Reinforcement learning can provide a solution to this issue: robots are trained in simulation, and their policies are then transferred to real-world machines. In this work, we replicate a published baseline for training robots on three tasks in the Assistive Gym environment, and we explore the use of a Recurrent Neural Network and Phasic Policy Gradient learning to augment the original work. Our baseline implementation meets or exceeds the baseline of the original work; however, we found that our explorations of the new methods were not as effective as we anticipated. We discuss the results of our baseline and offer some thoughts on why our new methods were not successful.