Markerless motion capture has become an active field of research in computer vision in recent years. It has applications in a wide variety of fields, including computer animation, human motion analysis, biomedical research, virtual reality, and sports science. Human pose estimation has recently gained increasing attention in the computer vision community, but it remains a challenging task because of depth uncertainty and the lack of synthetic datasets. Various approaches have recently been proposed to solve this problem, many of them based on deep learning. These approaches focus primarily on improving performance on existing benchmarks and have achieved significant advances, especially on 2D images. Building on powerful deep learning techniques and recently collected real-world datasets, we explored a model that can predict the skeleton of an animation based solely on 2D images. The frames are generated from different real-world datasets, with poses synthesized using body shapes that range from simple to complex. The implementation applies DeepLabCut to its own dataset to perform the necessary preparatory steps and then uses the input frames to train the model. The output is an animated skeleton of human movement. The composite dataset and the other results serve as the "ground truth" for the deep model.