From smoothly pursuing moving objects to rapidly shifting gazes during visual search, humans employ a wide variety of eye movement strategies in different contexts. While eye movements provide a rich window into mental processes, building generative models of eye movements is notoriously difficult, and to date the computational objectives guiding eye movements remain largely a mystery. In this work, we tackled these problems in the context of a canonical spatial planning task, maze-solving. We collected eye movement data from human subjects and built deep generative models of eye movements using a novel differentiable architecture for gaze fixations and gaze shifts. We found that human eye movements are best predicted by a model that is optimized not to perform the task as efficiently as possible but instead to run an internal simulation of an object traversing the maze. This not only provides a generative model of eye movements in this task but also suggests a computational theory for how humans solve the task, namely that humans use mental simulation.