We present a novel method for real-time pose and shape reconstruction of two strongly interacting hands. Our approach is the first two-hand tracking solution that combines an extensive list of favorable properties, namely it is marker-less, uses a single consumer-level depth camera, runs in real time, handles inter- and intra-hand collisions, and automatically adjusts to the user's hand shape. In order to achieve this, we embed a recent parametric hand pose and shape model and a dense correspondence predictor based on a deep neural network into a suitable energy minimization framework. For training the correspondence prediction network, we synthesize a two-hand dataset based on physical simulations that includes both hand pose and shape annotations while at the same time avoiding inter-hand penetrations. To achieve real-time rates, we phrase the model fitting in terms of a nonlinear least-squares problem so that the energy can be optimized based on a highly efficient GPU-based Gauss-Newton optimizer. We show state-of-the-art results in scenes that exceed the complexity level demonstrated by previous work, including tight two-hand grasps, significant inter-hand occlusions, and gesture interaction.
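As a rough illustration of the nonlinear least-squares formulation mentioned above, the sketch below shows a generic, damped Gauss-Newton iteration for minimizing an energy of the form E(theta) = ||r(theta)||^2. It is a minimal CPU/NumPy stand-in, not the paper's GPU implementation; the residual function, Jacobian, parameter vector theta, and the toy exponential-fitting example are all hypothetical placeholders for the actual hand-model fitting terms.

```python
import numpy as np

def gauss_newton(residuals, jacobian, theta0, iters=10, damping=1e-6):
    """Minimize E(theta) = ||r(theta)||^2 with damped Gauss-Newton steps:
    solve (J^T J + lambda I) delta = -J^T r and update theta += delta."""
    theta = theta0.copy()
    for _ in range(iters):
        r = residuals(theta)            # residual vector, shape (m,)
        J = jacobian(theta)             # Jacobian of residuals, shape (m, n)
        H = J.T @ J + damping * np.eye(theta.size)  # Gauss-Newton Hessian approximation
        g = J.T @ r                     # gradient of 0.5 * ||r||^2
        delta = np.linalg.solve(H, -g)  # Gauss-Newton update direction
        theta = theta + delta
    return theta

# Toy usage: fit (a, b) of the model a * exp(b * x) to synthetic samples.
if __name__ == "__main__":
    xs = np.linspace(0.0, 1.0, 50)
    ys = 2.0 * np.exp(-1.5 * xs)

    def residuals(theta):
        a, b = theta
        return a * np.exp(b * xs) - ys

    def jacobian(theta):
        a, b = theta
        return np.stack([np.exp(b * xs), a * xs * np.exp(b * xs)], axis=1)

    print(gauss_newton(residuals, jacobian, np.array([1.0, -1.0])))
```

In the paper's setting, the residual vector would stack the data, correspondence, collision, and shape-prior terms of the energy, and each Gauss-Newton solve would run on the GPU; the structure of the iteration, however, is the same as in this sketch.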