We present TOCH, a method for refining incorrect 3D hand-object interaction sequences using a data prior. Existing hand trackers, especially those that rely on very few cameras, often produce visually unrealistic results with hand-object intersections or missing contacts. Although correcting such errors requires reasoning about the temporal aspects of interaction, most previous work focuses on static grasps and contacts. At the core of our method are TOCH fields, a novel spatio-temporal representation for modeling correspondences between hands and objects during interaction. TOCH fields are a point-wise, object-centric representation that encodes the hand position relative to the object. Leveraging this representation, we learn a latent manifold of plausible TOCH fields with a temporal denoising auto-encoder. Experiments demonstrate that TOCH outperforms state-of-the-art 3D hand-object interaction models, which are limited to static grasps and contacts. More importantly, our method produces smooth interactions even before and after contact. Using a single trained TOCH model, we demonstrate quantitatively and qualitatively its usefulness for correcting erroneous sequences from off-the-shelf RGB/RGB-D hand-object reconstruction methods and for transferring grasps across objects.
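To make the idea of a point-wise, object-centric encoding of the hand more concrete, the following is a minimal sketch under simplifying assumptions. It is not the TOCH field definition itself, which the abstract does not spell out. Here each object point is paired with its nearest hand point by Euclidean distance, and the function names (`pointwise_hand_field`, `sequence_field`), the dictionary keys, and the `contact_threshold` value are all hypothetical choices made for illustration.

```python
import math


def pointwise_hand_field(object_points, hand_points, contact_threshold=0.01):
    """Encode the hand relative to the object for one frame.

    Assumption: correspondence is a plain nearest-neighbour search, which is
    an illustrative stand-in and not necessarily how TOCH fields are built.
    Points are (x, y, z) tuples; the threshold uses the same unit as the points.
    """
    field = []
    for obj_pt in object_points:
        # Nearest hand point to this object point.
        nearest = min(hand_points, key=lambda h: math.dist(obj_pt, h))
        distance = math.dist(obj_pt, nearest)
        # Offset from the object point to the hand (object-centric).
        offset = tuple(h - o for h, o in zip(nearest, obj_pt))
        field.append(
            {
                "offset": offset,
                "distance": distance,
                "in_contact": distance <= contact_threshold,
            }
        )
    return field


def sequence_field(object_points, hand_sequence, contact_threshold=0.01):
    """Stack per-frame fields over time (one list entry per frame)."""
    return [
        pointwise_hand_field(object_points, hand_points, contact_threshold)
        for hand_points in hand_sequence
    ]


# Toy usage: two object points, a single hand point approaching over two frames.
object_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
hand_sequence = [
    [(0.0, 0.0, 0.5)],    # frame 0: hand far from the object
    [(0.0, 0.0, 0.005)],  # frame 1: hand within the contact threshold
]
fields = sequence_field(object_points, hand_sequence)
```

Because every entry is indexed by an object point rather than by a hand joint, the encoding has the same shape for any hand pose on a given object. This is one plausible reason an object-centric representation suits a temporal denoising auto-encoder, though the sketch itself involves no learning.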