This paper introduces an approach for multi-human 3D pose estimation and tracking from calibrated multi-view input. The main challenge lies in finding the correct cross-view and temporal correspondences even when several of the estimated human poses are noisy. Compared with previous solutions that construct 3D poses from multiple views, our approach takes advantage of temporal consistency to match the 2D poses estimated in every view with previously constructed 3D skeletons. Cross-view and temporal associations are therefore accomplished simultaneously. Since performance suffers from mistaken associations and noisy predictions, we design two strategies to obtain better correspondences and 3D reconstruction. Specifically, we propose a part-aware measurement for 2D-3D association and a filter that can cope with 2D outliers during reconstruction. Our approach is efficient and effective compared with state-of-the-art methods; it achieves competitive results on two benchmarks: 96.8% on Campus and 97.4% on Shelf. Moreover, we extend the Campus evaluation to a longer sequence of frames to make it more challenging, and our approach also performs well on it.
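The 2D-3D association idea can be illustrated with a minimal sketch: project each previously constructed 3D skeleton into a view and match it to the detected 2D pose with the lowest reprojection distance. This is not the paper's implementation. The function names (`project`, `association_cost`, `match_view`), the plain mean-distance cost, the greedy assignment, and the `max_cost` threshold are all assumptions made for illustration; the part-aware measurement and the outlier filter described above are not reproduced here.

```python
import math


def project(P, X):
    """Project a 3D point X = (x, y, z) with a 3x4 camera matrix P (nested lists)."""
    xh = (X[0], X[1], X[2], 1.0)  # homogeneous coordinates
    u, v, w = (sum(row[k] * xh[k] for k in range(4)) for row in P)
    return (u / w, v / w)


def association_cost(pose2d, skeleton3d, P):
    """Mean reprojection distance between a 2D pose and a projected 3D skeleton.

    Assumption: a plain per-joint mean, standing in for a part-aware measure.
    """
    dists = []
    for (u, v), joint3d in zip(pose2d, skeleton3d):
        pu, pv = project(P, joint3d)
        dists.append(math.hypot(u - pu, v - pv))
    return sum(dists) / len(dists)


def match_view(poses2d, skeletons3d, P, max_cost):
    """Greedily match 2D poses in one view to existing 3D skeletons.

    Returns {pose_index: skeleton_index}. Pairs whose cost exceeds the
    hypothetical threshold max_cost are left unmatched.
    """
    pairs = sorted(
        (association_cost(p, s, P), i, j)
        for i, p in enumerate(poses2d)
        for j, s in enumerate(skeletons3d)
    )
    matches, used_skeletons = {}, set()
    for cost, i, j in pairs:
        if cost > max_cost:
            break  # remaining pairs are even more expensive
        if i in matches or j in used_skeletons:
            continue
        matches[i] = j
        used_skeletons.add(j)
    return matches
```

Running `match_view` once per camera associates each view's detections with the same set of 3D skeletons, so detections matched to the same skeleton in different views are implicitly associated across views as well. The greedy loop is only a simple stand-in; an optimal assignment solver could be substituted without changing the interface.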