With the rapid development of multimedia technology, Augmented Reality (AR) has become a promising next-generation mobile platform. The primary theory underlying AR is human visual confusion, which allows users to perceive real-world scenes and augmented contents (virtual-world scenes) simultaneously by superimposing one on the other. To achieve good Quality of Experience (QoE), it is important to understand the interaction between the two scenes and to display AR contents harmoniously. However, studies on how this superimposition influences human visual attention are lacking. Therefore, in this paper, we analyze the interaction effect between background (BG) scenes and AR contents, and study the saliency prediction problem in AR. Specifically, we first construct a Saliency in AR Dataset (SARD), which contains 450 BG images, 450 AR images, and 1350 superimposed images generated by superimposing BG and AR images in pairs at three mixing levels. A large-scale eye-tracking experiment with 60 subjects is conducted to collect eye movement data. To better predict saliency in AR, we propose a vector quantized saliency prediction method and generalize it for AR saliency prediction. For comparison, three benchmark methods are proposed and evaluated together with our proposed method on SARD. Experimental results demonstrate the superiority of our proposed method over the benchmark methods on both the common saliency prediction problem and the AR saliency prediction problem. Our dataset and code are available at: https://github.com/DuanHuiyu/ARSaliency.
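As a rough illustration of how 450 BG/AR pairs at three mixing levels yield 1350 superimposed images, the following is a minimal sketch that assumes the superimposition is a simple per-pixel alpha blend. The abstract states neither the blending formula nor the mixing values, so the function names `superimpose` and `build_superimposed_set` and the levels in `MIXING_LEVELS` are hypothetical and are not taken from the released code.

```python
import numpy as np

# Hypothetical mixing levels: the abstract says only that three levels are used.
MIXING_LEVELS = (0.25, 0.5, 0.75)


def superimpose(bg: np.ndarray, ar: np.ndarray, alpha: float) -> np.ndarray:
    """Blend an AR image over a BG image as alpha * AR + (1 - alpha) * BG.

    Both inputs are assumed to be uint8 arrays of identical shape.
    """
    if bg.shape != ar.shape:
        raise ValueError("BG and AR images must have the same shape")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Blend in float to avoid uint8 overflow, then round back to uint8.
    blended = alpha * ar.astype(np.float32) + (1.0 - alpha) * bg.astype(np.float32)
    return np.clip(np.rint(blended), 0, 255).astype(np.uint8)


def build_superimposed_set(bg_images, ar_images, levels=MIXING_LEVELS):
    """Pair BG and AR images one-to-one and blend each pair at every level."""
    if len(bg_images) != len(ar_images):
        raise ValueError("BG and AR image lists must have the same length")
    return [
        superimpose(bg, ar, alpha)
        for bg, ar in zip(bg_images, ar_images)
        for alpha in levels
    ]
```

Under these assumptions, 450 pairs blended at three levels produce 450 × 3 = 1350 images, matching the count reported for SARD.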