Can we make virtual characters in a scene interact with their surrounding objects through simple instructions? Is it possible to synthesize such motion plausibly with a diverse set of objects and instructions? Inspired by these questions, we present the first framework to synthesize the full-body motion of virtual human characters performing specified actions with 3D objects placed within their reach. Our system takes textual instructions specifying the objects and the associated intentions of the virtual characters as input, and outputs diverse sequences of full-body motions. This contrasts with existing works, where full-body action synthesis methods generally do not consider object interactions, and human-object interaction methods focus mainly on synthesizing hand or finger movements for grasping objects. We accomplish our objective by designing an intent-driven full-body motion generator, which uses a pair of decoupled conditional variational auto-regressors to learn the motion of the body parts in an autoregressive manner. We also optimize the 6-DoF pose of the objects such that they plausibly fit within the hands of the synthesized characters. We compare our proposed method with existing methods of motion synthesis and establish a new, stronger state of the art for the task of intent-driven motion synthesis.
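To make the generator concrete, below is a minimal sketch of one conditional variational auto-regressor in PyTorch. This is not the authors' implementation: the network sizes, the intent-embedding interface, and the body-part split are illustrative assumptions; the framework pairs two such decoupled modules (e.g., one for the arm joints and one for the rest of the body) and rolls them out autoregressively.

```python
# Minimal sketch (assumed interface, not the authors' code) of one
# conditional variational auto-regressor. The full generator pairs two
# such decoupled modules for different body-part groups; all dimensions
# and names here are illustrative.
import torch
import torch.nn as nn


class ConditionalVariationalAutoRegressor(nn.Module):
    def __init__(self, pose_dim=66, intent_dim=32, latent_dim=32, hidden=256):
        super().__init__()
        cond_dim = pose_dim + intent_dim              # previous pose + intent code
        self.encoder = nn.Sequential(                 # q(z | x_t, condition)
            nn.Linear(pose_dim + cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * latent_dim),        # outputs (mu, log_var)
        )
        self.decoder = nn.Sequential(                 # p(x_t | z, condition)
            nn.Linear(latent_dim + cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, pose_dim),
        )
        self.latent_dim = latent_dim

    def forward(self, pose_t, pose_prev, intent):
        """Training pass: encode the current pose, decode a reconstruction."""
        cond = torch.cat([pose_prev, intent], dim=-1)
        mu, log_var = self.encoder(torch.cat([pose_t, cond], dim=-1)).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * log_var)  # reparameterization
        return self.decoder(torch.cat([z, cond], dim=-1)), mu, log_var

    @torch.no_grad()
    def rollout(self, pose_0, intent, steps):
        """Generation: sample z per frame; each output conditions the next step."""
        poses, prev = [], pose_0
        for _ in range(steps):
            z = torch.randn(pose_0.shape[0], self.latent_dim)
            prev = self.decoder(torch.cat([z, prev, intent], dim=-1))
            poses.append(prev)
        return torch.stack(poses, dim=1)              # (batch, steps, pose_dim)
```

Trained with the usual CVAE objective (reconstruction loss plus a KL term on `mu` and `log_var`), `rollout` then synthesizes a motion sequence frame by frame; sampling a fresh latent code per step is what yields diverse output sequences for the same instruction.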
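The object-fitting stage can likewise be sketched as a small 6-DoF optimization: gradient descent over an axis-angle rotation and a translation that moves designated grasp points on the object onto the synthesized hand joints. The least-squares objective and the choice of optimizer below are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of 6-DoF object pose fitting (assumed objective, not the
# paper's exact formulation): align object grasp points with hand joints.
import torch


def axis_angle_to_matrix(aa):
    """Rodrigues' formula: (3,) axis-angle vector -> (3, 3) rotation matrix."""
    theta = aa.norm() + 1e-8
    k = aa / theta
    K = torch.zeros(3, 3)
    K[0, 1], K[0, 2] = -k[2], k[1]                  # skew-symmetric cross matrix
    K[1, 0], K[1, 2] = k[2], -k[0]
    K[2, 0], K[2, 1] = -k[1], k[0]
    return torch.eye(3) + torch.sin(theta) * K + (1 - torch.cos(theta)) * (K @ K)


def fit_object_pose(grasp_points, hand_joints, steps=300, lr=1e-2):
    """Optimize rotation and translation so grasp_points (N, 3) land on
    the synthesized hand_joints (N, 3)."""
    aa = (1e-2 * torch.randn(3)).requires_grad_()   # small init avoids zero-norm grad
    t = torch.zeros(3, requires_grad=True)
    opt = torch.optim.Adam([aa, t], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        R = axis_angle_to_matrix(aa)
        loss = ((grasp_points @ R.T + t - hand_joints) ** 2).mean()
        loss.backward()
        opt.step()
    return axis_angle_to_matrix(aa).detach(), t.detach()
```

A full system would also penalize hand-object interpenetration and enforce consistency across frames, but this captures the core idea the abstract describes: the body motion is synthesized first, and the object's 6-DoF pose is then solved so it plausibly fits within the character's hands.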