Accurate and temporally consistent modeling of human bodies is essential for a wide range of applications, including character animation, the understanding of human social behavior, and AR/VR interfaces. Capturing human motion accurately from a monocular image sequence remains challenging, and the modeling quality is strongly influenced by the temporal consistency of the captured body motion. Our work presents an elegant solution for integrating temporal constraints into the fitting process. This increases not only temporal consistency but also robustness during the optimization. In detail, we derive the parameters of a sequence of body models representing the shape and motion of a person, including jaw poses, facial expressions, and finger poses. We optimize these parameters over the complete image sequence, fitting one consistent body shape while imposing temporal consistency on the body motion under the assumption that body joint trajectories are linear over short time intervals. Our approach enables the derivation of realistic 3D body models from image sequences, including facial expressions and articulated hands. In extensive experiments, we show that our approach yields accurately estimated body shape and motion, even for challenging movements and poses. Further, we apply it to sign language analysis, where accurate and temporally consistent motion modeling is essential, and show that the approach is well suited to this kind of application.
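To illustrate the assumption of locally linear joint trajectories, the following is a minimal sketch and not the formulation used in this work. It assumes the constraint is expressed as a penalty on second-order finite differences of the 3D joint positions, which vanish exactly when each joint moves along a straight line at constant velocity over three consecutive frames. The function name `temporal_linearity_penalty` and the array layout `(frames, joints, 3)` are hypothetical choices made for this example.

```python
import numpy as np


def temporal_linearity_penalty(joints):
    """Sum of squared second-order differences of joint trajectories.

    joints: array of shape (T, J, 3) holding the 3D positions of J joints
            over T frames (hypothetical layout, assumed for this sketch).
    Returns 0.0 when every joint follows a linear trajectory in time.
    """
    joints = np.asarray(joints, dtype=float)
    if joints.shape[0] < 3:
        # Fewer than three frames: no second difference can be formed.
        return 0.0
    # Discrete acceleration: x[t+1] - 2 * x[t] + x[t-1] for each inner frame.
    accel = joints[2:] - 2.0 * joints[1:-1] + joints[:-2]
    return float(np.sum(accel ** 2))


# Usage: a joint moving at constant velocity incurs no penalty,
# while a single-frame jitter does.
frames = np.arange(5, dtype=float)[:, None, None]      # shape (5, 1, 1)
velocity = np.array([[[0.5, -1.0, 2.0]]])               # shape (1, 1, 3)
linear_motion = frames * velocity                       # shape (5, 1, 3)

jittered_motion = linear_motion.copy()
jittered_motion[2, 0, 0] += 1.0                         # perturb one frame

penalty_linear = temporal_linearity_penalty(linear_motion)
penalty_jittered = temporal_linearity_penalty(jittered_motion)
```

In a fitting pipeline, a term of this kind would typically be added with a weight to the per-frame data terms, while the body shape parameters are shared across all frames; both the weighting and the choice of penalty are assumptions of this sketch.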