In this paper, we propose a novel monocular ray-based 3D (Ray3D) absolute human pose estimation approach with a calibrated camera. Accurate and generalizable absolute 3D human pose estimation from monocular 2D pose input is an ill-posed problem. To address this challenge, we convert the input from pixel space to 3D normalized rays. This conversion makes our approach robust to changes in camera intrinsic parameters. To handle in-the-wild variations in camera extrinsic parameters, Ray3D explicitly takes the camera extrinsic parameters as input and jointly models the distribution between the 3D pose rays and the camera extrinsic parameters. This novel network design is the key to the outstanding generalizability of the Ray3D approach. To gain a comprehensive understanding of how variations in camera intrinsic and extrinsic parameters affect the accuracy of absolute 3D key-point localization, we conduct in-depth systematic experiments on three single-person 3D benchmarks as well as one synthetic benchmark. These experiments demonstrate that our method significantly outperforms existing state-of-the-art models. Our code and the synthetic dataset are available at https://github.com/YxZhxn/Ray3D.
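The pixel-to-ray conversion mentioned above can be sketched as follows. This is a minimal illustration assuming a standard pinhole camera model: each 2D keypoint is back-projected through the inverse intrinsic matrix and normalized to unit length. The function name and example intrinsics are illustrative, not taken from the authors' released code.

```python
import numpy as np

def pixels_to_rays(keypoints_2d, K):
    """Convert 2D keypoints (N, 2) in pixel space to unit-norm 3D rays
    in the camera frame, using the intrinsic matrix K (3, 3)."""
    n = keypoints_2d.shape[0]
    # Lift pixels to homogeneous coordinates: (u, v) -> (u, v, 1)
    homo = np.hstack([keypoints_2d, np.ones((n, 1))])
    # Back-project through the inverse intrinsics
    rays = homo @ np.linalg.inv(K).T
    # Normalize each ray to unit length, making the representation
    # independent of focal length and principal point
    rays /= np.linalg.norm(rays, axis=1, keepdims=True)
    return rays

# Hypothetical intrinsics: focal length 1000 px, principal point (500, 500)
K = np.array([[1000., 0., 500.],
              [0., 1000., 500.],
              [0., 0., 1.]])
rays = pixels_to_rays(np.array([[500., 500.], [700., 500.]]), K)
# A keypoint at the principal point maps to the optical axis [0, 0, 1].
```

Because the rays live in the normalized camera frame rather than pixel space, the same 3D pose produces the same ray representation regardless of the intrinsics used to capture it, which is what makes the input robust to intrinsic-parameter changes.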