Human visual attention is a complex phenomenon that has been studied for decades. Within it, scanpath prediction is particularly challenging, in part because of inter- and intra-observer variability. Moreover, most existing approaches to scanpath prediction have focused on optimizing the prediction of a gaze point given the previous ones. In this work, we present a probabilistic, time-evolving approach to scanpath prediction based on Bayesian deep learning. We optimize our model using a novel spatio-temporal loss function that combines Kullback-Leibler divergence and dynamic time warping, jointly considering the spatial and temporal dimensions of scanpaths. Our scanpath prediction framework yields results that outperform those of current state-of-the-art approaches and are almost on par with the human baseline, suggesting that our model is able to generate scanpaths whose behavior closely resembles that of real ones.
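To make the idea of a combined spatial and temporal objective concrete, the following is a minimal sketch and not the loss used in this work. It assumes scanpaths are sequences of (x, y) gaze points normalized to [0, 1], measures the spatial term as the Kullback-Leibler divergence between coarse fixation histograms, and measures the temporal term with classic (non-differentiable) dynamic time warping. The function names, the histogram binning, and the weight `alpha` are all hypothetical choices made for illustration.

```python
import numpy as np


def dtw_distance(a, b):
    """Classic dynamic time warping between two (T, 2) point sequences."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # acc[i, j] = minimal cumulative cost of aligning a[:i] with b[:j]
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])  # Euclidean point cost
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return float(acc[n, m])


def fixation_histogram(scanpath, bins=8, eps=1e-6):
    """Smoothed, normalized 2D histogram of gaze points in [0, 1] x [0, 1]."""
    pts = np.asarray(scanpath, dtype=float)
    hist, _, _ = np.histogram2d(
        pts[:, 0], pts[:, 1], bins=bins, range=[[0, 1], [0, 1]]
    )
    hist = hist + eps  # avoid zero bins so the logarithm below is defined
    return hist / hist.sum()


def kl_divergence(p, q):
    """KL(p || q) for two strictly positive discrete distributions."""
    return float(np.sum(p * np.log(p / q)))


def spatio_temporal_loss(pred, target, alpha=0.5, bins=8):
    """Hypothetical weighted sum of a spatial (KL) and a temporal (DTW) term."""
    spatial = kl_divergence(
        fixation_histogram(target, bins), fixation_histogram(pred, bins)
    )
    temporal = dtw_distance(pred, target)
    return alpha * spatial + (1.0 - alpha) * temporal
```

In this sketch, a predicted scanpath that visits the same locations as the target but in reverse order has a zero histogram-based KL term and a positive DTW term, which illustrates why an order-aware temporal term is useful alongside a purely spatial one. Training a network with such an objective would require a differentiable alignment cost, which this sketch does not provide.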