In this work, we propose an approach for estimating the 3D poses of multiple people from a set of calibrated cameras. Estimating 3D human poses from multiple views has several compelling properties: poses are estimated within a global coordinate space, and multiple cameras provide an extended field of view, which helps resolve ambiguities, occlusions, and motion blur. Our approach builds upon a real-time 2D multi-person pose estimation system and greedily solves the association problem between multiple views. We utilize bipartite matching to track multiple people over multiple frames. This proves to be especially efficient, as problems associated with greedy matching, such as occlusion, can be easily resolved in 3D. Our approach achieves state-of-the-art results on popular benchmarks and may serve as a baseline for future work.
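As an illustration of the tracking step, the following minimal sketch matches 3D poses from the previous frame to poses in the current frame by one-to-one (bipartite) assignment. It is not the authors' implementation: the cost function (mean joint distance), the gating threshold `max_dist`, and the names `pose_distance` and `match_tracks` are assumptions made for this example. For brevity it enumerates all assignments by brute force, which is only practical for a handful of people; a real system would more likely use the Hungarian algorithm.

```python
from itertools import permutations
from math import dist


def pose_distance(pose_a, pose_b):
    """Mean Euclidean distance between corresponding 3D joints (assumed cost)."""
    return sum(dist(a, b) for a, b in zip(pose_a, pose_b)) / len(pose_a)


def match_tracks(tracks, detections, max_dist=0.5):
    """Return {track_index: detection_index} with minimal total pose distance.

    tracks, detections: lists of poses; each pose is a list of (x, y, z) joints.
    max_dist: hypothetical gate; matched pairs farther apart are dropped.
    """
    n_t, n_d = len(tracks), len(detections)
    if n_t == 0 or n_d == 0:
        return {}

    # Cost matrix: rows are existing tracks, columns are new detections.
    cost = [[pose_distance(t, d) for d in detections] for t in tracks]

    best_pairs, best_total = [], float("inf")
    if n_t <= n_d:
        # Assign every track to a distinct detection.
        for perm in permutations(range(n_d), n_t):
            total = sum(cost[i][j] for i, j in enumerate(perm))
            if total < best_total:
                best_pairs, best_total = list(enumerate(perm)), total
    else:
        # Fewer detections than tracks: assign every detection to a distinct track.
        for perm in permutations(range(n_t), n_d):
            total = sum(cost[i][j] for j, i in enumerate(perm))
            if total < best_total:
                best_pairs = [(i, j) for j, i in enumerate(perm)]
                best_total = total

    # Drop implausible matches after the assignment has been chosen.
    return {i: j for i, j in best_pairs if cost[i][j] <= max_dist}


# Two tracked people, three detections in a different order (toy two-joint poses).
tracks = [
    [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)],
    [(2.0, 0.0, 0.0), (2.0, 0.0, 1.0)],
]
detections = [
    [(2.1, 0.0, 0.0), (2.1, 0.0, 1.0)],
    [(0.1, 0.0, 0.0), (0.1, 0.0, 1.0)],
    [(10.0, 0.0, 0.0), (10.0, 0.0, 1.0)],
]
matches = match_tracks(tracks, detections)
```

Here the unmatched third detection would be a candidate for starting a new track. Note that gating after the assignment is a simplification; it does not re-optimize the remaining pairs once a match has been rejected.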