Monocular 3D human pose estimation is challenging because of inherent ambiguity and occlusion, which often lead to high uncertainty and indeterminacy. Diffusion models, meanwhile, have recently emerged as an effective tool for generating high-quality images from noise. Inspired by this capability, we explore a novel pose estimation framework, DiffPose, that formulates 3D pose estimation as a reverse diffusion process. We incorporate three novel designs into DiffPose to facilitate the diffusion process for 3D pose estimation: a pose-specific initialization of pose uncertainty distributions, a Gaussian Mixture Model-based forward diffusion process, and a context-conditioned reverse diffusion process. DiffPose significantly outperforms existing methods on the widely used pose estimation benchmarks Human3.6M and MPI-INF-3DHP.
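To make the notion of a forward diffusion process on poses concrete, the following is a minimal sketch of the standard Gaussian forward step from denoising diffusion models, applied to a list of 3D joint coordinates. It is not DiffPose's Gaussian Mixture Model-based forward process, and it omits the pose-specific initialization and the context-conditioned reverse process entirely. The function names, the linear noise schedule, its parameters, and the three-joint pose are all illustrative assumptions rather than details taken from the paper.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Cumulative products of (1 - beta_t) for a linear beta schedule.

    The schedule and its endpoints are assumed values for illustration.
    """
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_diffuse(pose, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) = N(sqrt(abar_t) * x_0, (1 - abar_t) * I).

    This is the plain Gaussian forward step, not a GMM-based one.
    """
    signal = math.sqrt(alpha_bars[t])
    noise = math.sqrt(1.0 - alpha_bars[t])
    return [
        [signal * coord + noise * rng.gauss(0.0, 1.0) for coord in joint]
        for joint in pose
    ]


rng = random.Random(0)
# Hypothetical 3-joint pose (x, y, z per joint); real skeletons have more joints.
pose = [[0.0, 0.0, 0.0], [0.1, -0.4, 0.05], [0.12, -0.8, 0.1]]
alpha_bars = make_alpha_bars(num_steps=50)
noisy = forward_diffuse(pose, t=49, alpha_bars=alpha_bars, rng=rng)
```

A reverse diffusion process in this setting would start from such a noisy pose and iteratively denoise it toward a plausible 3D pose; in the framework described above, that denoising is additionally conditioned on context.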