It would be useful for machines to use computers as humans do so that they can aid us in everyday tasks. This is a setting in which there is also the potential to leverage large-scale expert demonstrations and human judgements of interactive behaviour, which are two ingredients that have driven much recent success in AI. Here we investigate the setting of computer control using keyboard and mouse, with goals specified via natural language. Instead of focusing on hand-designed curricula and specialized action spaces, we focus on developing a scalable method centered on reinforcement learning combined with behavioural priors informed by actual human-computer interactions. We achieve state-of-the-art and human-level mean performance across all tasks within the MiniWob++ benchmark, a challenging suite of computer control problems, and find strong evidence of cross-task transfer. These results demonstrate the usefulness of a unified human-agent interface when training machines to use computers. Altogether our results suggest a formula for achieving competency beyond MiniWob++ and towards controlling computers, in general, as a human would.