3D gaze estimation is the task of predicting a person's line of sight in 3D space. Person-independent models for this task lack precision because of anatomical differences between subjects, whereas person-specific calibrated techniques impose strict constraints on scalability. To overcome these issues, we propose a novel technique, Facial Landmark Heatmap Activated Multimodal Gaze Estimation (FLAME), which incorporates eye anatomical information through eye landmark heatmaps to obtain precise gaze estimation without any person-specific calibration. Our evaluation shows competitive performance, with an improvement of about 10% on the benchmark datasets ColumbiaGaze and EYEDIAP. We also conduct an ablation study to validate our method.
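The abstract does not say how the eye landmark heatmaps are constructed. A common choice in landmark-based models is to render one 2D Gaussian per landmark, and the sketch below illustrates that idea only. The function names, the 36x60 crop size, the landmark coordinates, and the value of sigma are all hypothetical and are not taken from FLAME.

```python
import math

def landmark_heatmap(cx, cy, height, width, sigma=2.0):
    """Render one 2D Gaussian heatmap peaked at landmark (cx, cy), in pixel coordinates."""
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]

def landmarks_to_heatmaps(landmarks, height, width, sigma=2.0):
    """Build one heatmap channel per landmark (assumed layout: list of H x W grids)."""
    return [landmark_heatmap(cx, cy, height, width, sigma) for cx, cy in landmarks]

# Hypothetical eye landmarks (x, y) on a 36x60 eye crop; values are illustrative only.
landmarks = [(10, 18), (30, 12), (50, 18)]
heatmaps = landmarks_to_heatmaps(landmarks, height=36, width=60)
```

In a multimodal setup of this kind, the resulting channels would typically be passed to the network together with the eye image, so that the model sees explicit anatomical cues in addition to raw pixels.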