Reconstructing two hands from monocular RGB images is challenging due to frequent occlusion and mutual confusion. Existing methods mainly learn an entangled representation that encodes the two interacting hands jointly, which makes them extremely fragile to impaired interaction, such as truncated hands, separated hands, or external occlusion. This paper presents ACR (Attention Collaboration-based Regressor), which makes the first attempt to reconstruct hands in arbitrary scenarios. To achieve this, ACR explicitly mitigates interdependencies between hands and between parts by leveraging center- and part-based attention for feature extraction. However, reducing interdependence relaxes the input constraints at the cost of weakening the mutual reasoning needed to reconstruct interacting hands. Thus, on top of center attention, ACR also learns a cross-hand prior that handles interacting hands better. We evaluate our method on various types of hand reconstruction datasets. Our method significantly outperforms the best interacting-hand approaches on the InterHand2.6M dataset while yielding comparable performance to state-of-the-art single-hand methods on the FreiHand dataset. Further qualitative results on in-the-wild and hand-object interaction datasets, as well as web images/videos, demonstrate the effectiveness of our approach for arbitrary hand reconstruction. Our code is available at https://github.com/ZhengdiYu/Arbitrary-Hands-3D-Reconstruction.
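To make the core idea concrete, below is a minimal, illustrative PyTorch sketch (not the authors' released code) of how center- and part-based attention maps could be used to pool per-hand and per-part features from a shared backbone feature map, so that each hand is represented without entangling it with the other. All names and dimensions (e.g., `feat_dim`, `num_parts`, the spatial-softmax pooling) are assumptions for illustration only.

```python
# Illustrative sketch, NOT the official ACR implementation.
# Assumption: attention maps act as spatial soft-pooling weights over a shared
# feature map, giving one feature vector per hand center and per hand part.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionFeatureAggregator(nn.Module):
    def __init__(self, feat_dim=256, num_parts=16, num_hands=2):
        super().__init__()
        # One center attention map per hand, one attention map per part per hand.
        self.center_head = nn.Conv2d(feat_dim, num_hands, kernel_size=1)
        self.part_head = nn.Conv2d(feat_dim, num_hands * num_parts, kernel_size=1)

    def forward(self, feats):                        # feats: (B, C, H, W)
        B, C, H, W = feats.shape
        center_att = self.center_head(feats)         # (B, 2, H, W)
        part_att = self.part_head(feats)             # (B, 2*P, H, W)
        att = torch.cat([center_att, part_att], 1)   # (B, 2 + 2*P, H, W)
        # Spatial softmax: each map becomes a distribution over image locations.
        att = F.softmax(att.flatten(2), dim=-1).view(B, -1, H, W)
        # Attention-weighted pooling: each map selects its own feature vector,
        # so left/right hands (and their parts) are decoupled at this stage.
        pooled = torch.einsum('bnhw,bchw->bnc', att, feats)
        return pooled                                # (B, 2 + 2*P, C)
```

In this reading, the per-hand pooled features would feed separate parameter regressors, while a cross-hand module (the learned prior mentioned above) could reintroduce interaction reasoning only when both hands are present.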