Class-agnostic Reconstruction of Dynamic Objects from Videos

We introduce REDO, a class-agnostic framework to REconstruct the Dynamic Objects from RGBD or calibrated videos. Compared to prior work, our problem setting is more realistic yet more challenging for three reasons: 1) due to occlusion or camera settings an object of interest may never be entirely visible, but we aim to reconstruct the complete shape; 2) we aim to handle different object dynamics including rigid motion, non-rigid motion, and articulation; 3) we aim to reconstruct different categories of objects with one unified framework. To address these challenges, we develop two novel modules. First, we introduce a canonical 4D implicit function which is pixel-aligned with aggregated temporal visual cues. Second, we develop a 4D transformation module which captures object dynamics to support temporal propagation and aggregation. We study the efficacy of REDO in extensive experiments on synthetic RGBD video datasets SAIL-VOS 3D and DeformingThings4D++, and on real-world video data 3DPW. We find REDO outperforms state-of-the-art dynamic reconstruction methods by a margin. In ablation studies we validate each developed component.
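
To make the phrase "pixel-aligned with aggregated temporal visual cues" concrete, here is a minimal sketch of the general idea, not REDO's actual implementation. It assumes a pinhole camera, a single-channel feature grid per frame, and that the position of one tracked 3D point in each frame's camera space is already known (in the abstract's terms, that would be the job of the transformation module). All function and parameter names are hypothetical.

```python
import math

def project(point, fx, fy, cx, cy):
    """Pinhole projection of a camera-space point (x, y, z) to pixel coords (u, v)."""
    x, y, z = point
    return fx * x / z + cx, fy * y / z + cy

def bilinear(feat, u, v):
    """Bilinearly sample a 2D grid feat[row][col] at continuous pixel (u, v)."""
    h, w = len(feat), len(feat[0])
    # Clamp to the grid so out-of-view points reuse the border value.
    u = min(max(u, 0.0), w - 1.0)
    v = min(max(v, 0.0), h - 1.0)
    c0, r0 = int(math.floor(u)), int(math.floor(v))
    c1, r1 = min(c0 + 1, w - 1), min(r0 + 1, h - 1)
    wu, wv = u - c0, v - r0
    top = feat[r0][c0] * (1 - wu) + feat[r0][c1] * wu
    bot = feat[r1][c0] * (1 - wu) + feat[r1][c1] * wu
    return top * (1 - wv) + bot * wv

def aggregate(points, feats, intrinsics):
    """Average the pixel-aligned feature of one tracked point over all frames.

    points: the point's camera-space position in each frame (assumed given).
    feats: one 2D feature grid per frame.
    intrinsics: (fx, fy, cx, cy), assumed shared by all frames.
    """
    fx, fy, cx, cy = intrinsics
    samples = [bilinear(f, *project(p, fx, fy, cx, cy))
               for p, f in zip(points, feats)]
    return sum(samples) / len(samples)
```

Plain averaging is only the simplest possible aggregation; a learned implicit function would typically consume such aggregated features together with the query point to predict occupancy.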
