Reward learning is a fundamental problem in robotics: it is what allows robots to operate in alignment with what their human users want. Many preference-based learning algorithms and active querying techniques have been proposed as solutions to this problem. In this paper, we present APReL, a library for active preference-based reward learning algorithms, which enables researchers and practitioners to experiment with existing techniques and easily develop their own algorithms for the various modules of the problem.
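To make the problem setting concrete, the following is a minimal, self-contained sketch of one active preference-based reward learning loop. It is not APReL's API: every function name here is hypothetical, and the modeling choices are assumptions made for illustration. It assumes a reward that is linear in trajectory features, a logistic (Bradley-Terry style) model of the user's answers, a belief over reward weights represented by weighted samples, and a simple uncertainty-based rule for choosing the next query (used here as a stand-in for more refined acquisition functions).

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def pref_prob(w, phi_a, phi_b):
    """P(user prefers A over B | w) under an assumed logistic preference model."""
    diff = dot(w, phi_a) - dot(w, phi_b)
    return 1.0 / (1.0 + math.exp(-diff))


def sample_unit_vectors(n, dim, rng):
    """Draw n candidate reward weight vectors from the unit sphere."""
    samples = []
    for _ in range(n):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(dot(v, v))
        samples.append([x / norm for x in v])
    return samples


def update_belief(particles, weights, phi_a, phi_b, a_preferred):
    """Bayesian update of the sample weights after one pairwise answer."""
    new_weights = []
    for w, p in zip(particles, weights):
        likelihood = pref_prob(w, phi_a, phi_b)
        if not a_preferred:
            likelihood = 1.0 - likelihood
        new_weights.append(p * likelihood)
    total = sum(new_weights)
    return [x / total for x in new_weights]


def pick_query(particles, weights, candidates):
    """Pick the candidate pair whose predicted answer is closest to 50/50.

    This is plain uncertainty sampling, chosen for brevity.
    """
    def distance_from_even(pair):
        p = sum(wt * pref_prob(w, pair[0], pair[1])
                for w, wt in zip(particles, weights))
        return abs(p - 0.5)
    return min(candidates, key=distance_from_even)


def posterior_mean(particles, weights):
    """Weighted mean of the sampled reward weights."""
    dim = len(particles[0])
    return [sum(wt * w[i] for w, wt in zip(particles, weights))
            for i in range(dim)]


if __name__ == "__main__":
    rng = random.Random(0)
    true_w = [0.6, -0.8]  # hypothetical hidden user reward weights
    particles = sample_unit_vectors(200, 2, rng)
    weights = [1.0 / len(particles)] * len(particles)
    for _ in range(15):
        # Random feature pairs stand in for pairs of robot trajectories.
        candidates = [
            ([rng.uniform(-1, 1), rng.uniform(-1, 1)],
             [rng.uniform(-1, 1), rng.uniform(-1, 1)])
            for _ in range(30)
        ]
        phi_a, phi_b = pick_query(particles, weights, candidates)
        # Simulated noiseless user answering according to true_w.
        answer = dot(true_w, phi_a) >= dot(true_w, phi_b)
        weights = update_belief(particles, weights, phi_a, phi_b, answer)
    print("estimated weights:", posterior_mean(particles, weights))
```

The sketch separates the user model (`pref_prob`), the belief update (`update_belief`), and the query selection (`pick_query`) so that any one of them could be swapped out independently, which mirrors the modular view of the problem described in the abstract.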