Reward learning is a fundamental problem in human-robot interaction: it is what allows robots to operate in alignment with what their human users want. Many preference-based learning algorithms and active querying techniques have been proposed to address this problem. In this paper, we present APReL, a library for active preference-based reward learning algorithms, which enables researchers and practitioners to experiment with existing techniques and easily develop their own algorithms for the various modules of the problem. APReL is available at https://github.com/Stanford-ILIAD/APReL.