Virtual-reality (VR) and augmented-reality (AR) technology is increasingly combined with eye tracking. This combination broadens both fields and opens up new areas of application, in which visual perception and related cognitive processes can be studied in interactive but still well-controlled settings. However, performing a semantic gaze analysis of eye-tracking data from interactive three-dimensional scenes is a resource-intensive task, which so far has been an obstacle to economical use. In this paper we present a novel approach that minimizes the time and information necessary to annotate volumes of interest (VOIs) by using techniques from object recognition. To do so, we train convolutional neural networks (CNNs) on synthetic data sets derived from virtual models using image-augmentation techniques. We evaluate our method in real and virtual environments, showing that it can compete with state-of-the-art approaches while not relying on additional markers or preexisting databases, instead offering cross-platform use.
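The core idea of deriving many training samples from few synthetic renderings can be sketched as follows. This is a minimal illustration of image augmentation in plain Python (the `augment` helper and its specific transforms are hypothetical, not taken from the paper, which does not specify its augmentation pipeline):

```python
import random

def augment(image, rng):
    """Produce one augmented variant of a synthetic rendering.

    `image` is a grayscale image as a list of rows of pixel values (0-255).
    Applies a random horizontal flip and brightness jitter -- two common
    augmentation operations, used here only as stand-ins.
    """
    out = [row[:] for row in image]          # copy so the source stays intact
    if rng.random() < 0.5:                   # random horizontal flip
        out = [row[::-1] for row in out]
    gain = rng.uniform(0.8, 1.2)             # brightness jitter
    out = [[min(255, int(p * gain)) for p in row] for row in out]
    return out

# One rendering of a virtual model expands into many training samples:
base = [[10, 20, 30],
        [40, 50, 60]]
dataset = [augment(base, random.Random(seed)) for seed in range(8)]
```

In practice a framework-level pipeline (e.g. standard augmentation utilities of a deep-learning library) would perform these transforms on full-resolution renderings before CNN training; the sketch only shows how a small set of synthetic images is multiplied into a larger, varied data set.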