We study a collaborative scenario where a user not only instructs a system to complete tasks, but also acts alongside it. This allows the user to adapt to the system's abilities by changing their language or by simply accomplishing some tasks themselves, and requires the system to recover effectively from errors as the user strategically assigns it new goals. We build a game environment to study this scenario, and learn to map user instructions to system actions. We introduce a learning approach focused on recovery from cascading errors between instructions, and modeling methods to explicitly reason about instructions with multiple goals. We evaluate with a new protocol that uses recorded interactions and online games with human users, and observe how users adapt to the system's abilities.