We propose a markerless performance capture method that computes a temporally coherent 4D representation of an actor deforming over time from a sparsely sampled sequence of untracked 3D point clouds. Our method proceeds by latent optimization with a spatio-temporal motion prior. Recently, task-generic motion priors have been introduced that provide a coherent representation of human motion based on a single latent code, with encouraging results on short sequences for which temporal correspondences are given. Extending these methods to longer sequences without correspondences is anything but straightforward. A single latent code proves insufficient to encode longer-term variability, and latent space optimization is highly susceptible to erroneous local minima caused by inverted pose fittings. We address both problems by learning a motion prior that encodes a 4D human motion sequence into a sequence of latent primitives instead of a single latent code. We also propose an additional mapping encoder that directly projects a sequence of point clouds into the learned latent space, providing a good initialization of the latent representation at inference time. Our temporal decoding from latent space is implicit and continuous in time, providing flexibility with respect to temporal resolution. We show experimentally that our method outperforms state-of-the-art motion priors.