The synthesis of human grasping has numerous applications, including AR/VR, video games, and robotics. While methods have been proposed to generate realistic hand-object interaction for object grasping and manipulation, these typically consider only the interacting hand in isolation. Our goal is to synthesize whole-body grasping motions. Starting from an arbitrary initial pose, we aim to generate diverse and natural whole-body human motions that approach and grasp a target object in 3D space. This task is challenging, as it requires modeling both whole-body dynamics and dexterous finger movements. To this end, we propose SAGA (StochAstic whole-body Grasping with contAct), a framework consisting of two key components: (a) Static whole-body grasping pose generation. Specifically, we propose a multi-task generative model to jointly learn static whole-body grasping poses and human-object contacts. (b) Grasping motion infilling. Given an initial pose and the generated whole-body grasping pose as the start and end of the motion, respectively, we design a novel contact-aware generative motion infilling module to produce a diverse set of grasp-oriented motions. We demonstrate the effectiveness of our method, a novel generative framework that synthesizes realistic and expressive whole-body motions to approach and grasp randomly placed unseen objects. Code and models are available at https://jiahaoplus.github.io/SAGA/saga.html.
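The two-stage design described above can be sketched at a very high level as follows. This is a minimal illustrative sketch, not the authors' implementation: the class names, latent dimensions, and the linear-interpolation infilling are all simplifying assumptions standing in for the paper's generative models.

```python
# Hypothetical sketch of the two-stage SAGA pipeline described in the abstract.
# Stage (a): sample a whole-body grasping end pose together with contacts.
# Stage (b): infill a motion between the initial pose and that end pose.
# All names, dimensions, and the interpolation-based infilling are
# illustrative assumptions, not the actual generative networks.
import random

class GraspPoseGenerator:
    """Stage (a): jointly predict a grasping end pose and contact labels."""
    def sample(self, object_position):
        # A latent sample provides diversity, mimicking a generative model.
        z = [random.gauss(0.0, 1.0) for _ in range(3)]
        # Place the sampled end pose near the object (toy stand-in).
        pose = [p + 0.1 * n for p, n in zip(object_position, z)]
        # Toy per-marker contact probabilities.
        contacts = [1.0 if abs(n) < 1.0 else 0.0 for n in z]
        return pose, contacts

class MotionInfiller:
    """Stage (b): generate frames between the start pose and the end pose."""
    def infill(self, start_pose, end_pose, contacts, num_frames=5):
        frames = []
        for t in range(num_frames + 1):
            a = t / num_frames
            # Linear interpolation stands in for the contact-aware
            # generative infilling module; contacts would guide refinement.
            frames.append([(1 - a) * s + a * e
                           for s, e in zip(start_pose, end_pose)])
        return frames

def synthesize_grasp(start_pose, object_position):
    end_pose, contacts = GraspPoseGenerator().sample(object_position)
    return MotionInfiller().infill(start_pose, end_pose, contacts)

motion = synthesize_grasp([0.0, 0.0, 0.0], [2.0, 1.0, 0.5])
```

Sampling different latents in stage (a) yields different end poses, and hence diverse approach-and-grasp motions from the same start pose, which is the stochasticity the framework's name refers to.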