There is growing interest in developing automated agents that can work alongside humans. In addition to completing the assigned task, such an agent will be expected to behave in a manner that the human prefers, which requires the human to communicate their preferences to the agent. Current approaches either require users to specify the reward function or learn the preference interactively from queries that ask the user to compare trajectories. The former can be challenging if the internal representation used by the agent is inscrutable to the human, while the latter is unnecessarily cumbersome if the user's preference can be specified more easily in symbolic terms. In this work, we propose PRESCA (PREference Specification through Concept Acquisition), a system that allows users to specify their preferences in terms of concepts that they understand. PRESCA maintains a set of such concepts in a shared vocabulary. If the relevant concept is not in the shared vocabulary, it is learned. To make learning a new concept more efficient, PRESCA leverages causal associations between the target concept and concepts that are already known. The effort of learning a new concept is also amortized: once learned, the concept is added to the shared vocabulary, where it supports preference specification in future interactions. We evaluate PRESCA in a Minecraft environment and show that it can be used effectively to align the agent with the user's preference.
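The shared-vocabulary idea described above can be illustrated with a small sketch. This is not PRESCA's implementation: the class `SharedVocabulary`, the helper `shaped_reward`, the dictionary-based state, and the penalty-based way of turning a concept into a preference are all hypothetical names and choices made for illustration. The concept learner is a stand-in callback, and the use of causal associations to speed up learning is not modeled here.

```python
from typing import Callable, Dict

# Assumption: a state is a plain dict of features, and a concept is a
# boolean predicate over states. Both are illustrative choices only.
State = dict
Concept = Callable[[State], bool]


class SharedVocabulary:
    """Hypothetical store of named concepts shared by user and agent."""

    def __init__(self) -> None:
        self._concepts: Dict[str, Concept] = {}

    def __contains__(self, name: str) -> bool:
        return name in self._concepts

    def get_or_learn(self, name: str, learn: Callable[[], Concept]) -> Concept:
        # Learn the concept only if it is missing, then cache it so that
        # later preference specifications can reuse it at no extra cost.
        if name not in self._concepts:
            self._concepts[name] = learn()
        return self._concepts[name]


def shaped_reward(
    task_reward: Callable[[State], float],
    avoid: Concept,
    penalty: float = 1.0,
) -> Callable[[State], float]:
    """Return a reward that penalizes states where the concept holds."""

    def reward(state: State) -> float:
        return task_reward(state) - (penalty if avoid(state) else 0.0)

    return reward


if __name__ == "__main__":
    vocab = SharedVocabulary()

    # Stand-in for a real concept learner (hypothetical): here it simply
    # returns a hand-written predicate over an assumed state feature.
    def learn_near_water() -> Concept:
        return lambda s: bool(s.get("near_water", False))

    near_water = vocab.get_or_learn("near_water", learn_near_water)
    reward = shaped_reward(lambda s: 1.0, near_water, penalty=0.5)
    print(reward({"near_water": True}), reward({"near_water": False}))
```

In this sketch, a second call to `get_or_learn` with the same name returns the cached predicate without invoking the learner again, which mirrors the amortization argument in the abstract.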