Head pose estimation is a challenging task that predicts a three-dimensional orientation vector of the head, with applications in human-robot interaction and customer behavior analysis. Previous research has proposed accurate methods for collecting head pose data, but these methods require either expensive devices such as depth cameras or complex laboratory setups. In this work, we introduce a low-cost, easy-to-set-up approach to collecting head pose images and use it to build the UET-Headpose dataset, which includes top-view head pose data. The method uses an absolute orientation sensor instead of a depth camera, so it can be deployed quickly and cheaply while still producing accurate labels. Our experiments show that the distribution of this dataset differs from that of existing datasets such as the CMU Panoptic Dataset \cite{CMU}. Using the UET-Headpose dataset together with other head pose datasets, we also introduce a full-range model called FSANet-Wide, which significantly improves head pose estimation results on the UET-Headpose dataset, especially on top-view images. The model is also very lightweight and operates on small input images.
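As a rough illustration of the sensor-based labelling idea described above (a minimal sketch, not the authors' exact capture pipeline), the snippet below pairs camera frames with yaw/pitch/roll readings streamed from an absolute orientation sensor over a serial link. The serial port name, the comma-separated \texttt{yaw,pitch,roll} line format, and the output file names are assumptions made for illustration only.

\begin{verbatim}
# Hypothetical sketch: label camera frames with Euler angles streamed from an
# absolute orientation sensor over serial. The port name and the
# "yaw,pitch,roll" line format are assumptions, not the UET-Headpose protocol.
import csv
import serial   # pyserial
import cv2

PORT = "/dev/ttyUSB0"   # assumed serial port of the orientation sensor
CAMERA_ID = 0           # default webcam

def read_euler(link):
    """Parse one 'yaw,pitch,roll' line (degrees) from the sensor stream."""
    line = link.readline().decode("ascii", errors="ignore").strip()
    yaw, pitch, roll = (float(v) for v in line.split(","))
    return yaw, pitch, roll

def capture(num_frames=100):
    cam = cv2.VideoCapture(CAMERA_ID)
    link = serial.Serial(PORT, baudrate=115200, timeout=1.0)
    with open("labels.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["image", "yaw", "pitch", "roll"])
        for i in range(num_frames):
            ok, frame = cam.read()
            if not ok:
                break
            # Sensor reading taken approximately synchronously with the frame.
            yaw, pitch, roll = read_euler(link)
            name = f"frame_{i:05d}.jpg"
            cv2.imwrite(name, frame)
            writer.writerow([name, yaw, pitch, roll])
    cam.release()
    link.close()

if __name__ == "__main__":
    capture()
\end{verbatim}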