There is growing interest in developing automated agents that can work alongside humans. In addition to completing the assigned task, such an agent will be expected to behave in a manner that the human prefers, which requires the human to communicate their preferences to the agent. Current approaches either require the user to specify the reward function directly or learn the preference interactively from queries that ask the user to compare behaviors. The former can be challenging if the internal representation used by the agent is inscrutable to the human, while the latter is unnecessarily cumbersome if the user's preference can be specified more easily in symbolic terms. In this work, we propose PRESCA (PREference Specification through Concept Acquisition), a system that allows users to specify their preferences in terms of concepts that they understand. PRESCA maintains a set of such concepts in a shared vocabulary. If the relevant concept is not in the shared vocabulary, PRESCA learns it. To make learning a new concept more feedback-efficient, PRESCA leverages causal associations between the target concept and concepts that are already known. In addition, we use a novel data augmentation approach to further reduce the required feedback. We evaluate PRESCA in a Minecraft environment and show that it can effectively align the agent with the user's preference.