Probabilistic Monocular 3D Human Pose Estimation with Normalizing Flows

3D human pose estimation from monocular images is a highly ill-posed problem due to depth ambiguities and occlusions. Nonetheless, most existing works ignore these ambiguities and estimate only a single solution. In contrast, we generate a diverse set of hypotheses that represents the full posterior distribution of feasible 3D poses. To this end, we propose a normalizing-flow-based method that exploits the deterministic 3D-to-2D mapping to solve the ambiguous inverse 2D-to-3D problem. Additionally, uncertain detections and occlusions are effectively modeled by incorporating the uncertainty information of the 2D detector as a condition. Further keys to success are a learned 3D pose prior and a generalization of the best-of-M loss. We evaluate our approach on the two benchmark datasets Human3.6M and MPI-INF-3DHP, outperforming all comparable methods in most metrics. The implementation is available on GitHub.
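
To make the best-of-M idea concrete, the following is a minimal sketch of the standard best-of-M loss, not the generalization proposed in this work, whose exact form is not given in the abstract. The function name `best_of_m_loss`, the 17-joint skeleton, and the mean per-joint Euclidean error are all assumptions chosen for illustration.

```python
import math
import random


def best_of_m_loss(hypotheses, target):
    """Return (loss, index) of the hypothesis closest to the target pose.

    hypotheses: list of M poses, each a list of J (x, y, z) tuples.
    target:     one pose, a list of J (x, y, z) tuples.
    The error measure (mean per-joint Euclidean distance) is an assumption.
    """
    def mean_joint_error(pose):
        return sum(math.dist(p, t) for p, t in zip(pose, target)) / len(target)

    errors = [mean_joint_error(pose) for pose in hypotheses]
    # Only the best hypothesis contributes to the loss.
    best = min(range(len(errors)), key=errors.__getitem__)
    return errors[best], best


# Toy data: a hypothetical 17-joint pose and M = 5 noisy hypotheses.
random.seed(0)
target = [tuple(random.gauss(0.0, 1.0) for _ in range(3)) for _ in range(17)]
hypotheses = [
    [tuple(c + random.gauss(0.0, 0.1) for c in joint) for joint in target]
    for _ in range(5)
]
hypotheses[2] = list(target)  # make one hypothesis exact

loss, idx = best_of_m_loss(hypotheses, target)
```

Because only the closest hypothesis is penalized, the remaining hypotheses are free to cover other plausible poses, which is what encourages diversity in the generated set.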