Recovering high-quality 3D human motion in complex scenes from monocular videos is important for many applications, ranging from AR/VR to robotics. However, capturing realistic human-scene interactions while dealing with occlusions and partial views is challenging; current approaches are still far from achieving compelling results. We address this problem by proposing LEMO: LEarning human MOtion priors for 4D human body capture. By leveraging the large-scale motion capture dataset AMASS, we introduce a novel motion smoothness prior, which strongly reduces the jitter exhibited by poses recovered over a sequence. Furthermore, to handle the contacts and occlusions that occur frequently in body-scene interactions, we design a contact friction term and a contact-aware motion infiller obtained via per-instance self-supervised training. To demonstrate the effectiveness of the proposed motion priors, we combine them into a novel pipeline for 4D human body capture in 3D scenes. With our pipeline, we demonstrate high-quality 4D human body capture, reconstructing smooth motions and physically plausible body-scene interactions. The code and data are available at https://sanweiliti.github.io/LEMO/LEMO.html.