Human pose estimation is a very active research field, stimulated by its important applications in robotics, entertainment, and health and sports sciences, among others. Advances in convolutional networks have triggered noticeable improvements in 2D pose estimation, bringing modern 3D markerless motion capture techniques to an average error per joint of 20 mm. However, with the proliferation of methods, it is becoming increasingly difficult to make an informed choice. Here, we review the leading human pose estimation methods of the past five years, focusing on metrics, benchmarks and method structures. We propose a taxonomy based on accuracy, speed and robustness, which we use to classify the methods and derive directions for future research.