We present StrobeNet, a method for category-level 3D reconstruction of articulating objects from one or more unposed RGB images. Reconstructing general articulating object categories has important applications, but is challenging since objects can vary widely in shape, articulation, appearance, and topology. We address this by building on the idea of category-level articulation canonicalization -- mapping observations to a canonical articulation that enables correspondence-free multiview aggregation. Our end-to-end trainable neural network estimates feature-enriched canonical 3D point clouds, articulation joints, and part segmentation from one or more unposed images of an object. These intermediate estimates are used to generate a final implicit 3D reconstruction. Our approach reconstructs objects even when they are observed in different articulations in images with large baselines, and enables the animation of reconstructed shapes. Quantitative and qualitative evaluations on different object categories show that our method achieves high reconstruction accuracy, especially as more views are added.
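To make the described pipeline concrete, below is a minimal PyTorch sketch of the architecture the abstract outlines: per-view prediction of a canonical, feature-enriched point cloud, correspondence-free aggregation by unioning the canonicalized clouds and pooling features, and an implicit decoder queried at 3D points. All names and shapes here (StrobeNetSketch, the MLP encoder, feat_dim, n_pts, the 32x32 input size, the mean-pooled global feature) are illustrative assumptions rather than the paper's implementation, and the articulation-joint and part-segmentation heads are omitted for brevity.

    import torch
    import torch.nn as nn

    class StrobeNetSketch(nn.Module):
        # Illustrative pipeline: per-view canonical point clouds -> union
        # (correspondence-free aggregation) -> implicit occupancy decoder.
        def __init__(self, feat_dim=64, n_pts=128):
            super().__init__()
            self.feat_dim, self.n_pts = feat_dim, n_pts
            # Per-view encoder: one RGB image -> n_pts canonical 3D points,
            # each with a feature vector (a real model would use a CNN;
            # a flat MLP stands in here).
            self.encoder = nn.Sequential(
                nn.Flatten(),
                nn.Linear(3 * 32 * 32, 256), nn.ReLU(),
                nn.Linear(256, n_pts * (3 + feat_dim)),
            )
            # Implicit decoder: 3D query point + aggregated feature -> occupancy.
            self.decoder = nn.Sequential(
                nn.Linear(3 + feat_dim, 128), nn.ReLU(), nn.Linear(128, 1),
            )

        def forward(self, views, queries):
            # views:   (V, 3, 32, 32) unposed RGB images of the same object
            # queries: (Q, 3) points where the implicit shape is evaluated
            out = self.encoder(views).view(-1, self.n_pts, 3 + self.feat_dim)
            pts, feats = out[..., :3], out[..., 3:]
            # Because every view is mapped to the same canonical articulation,
            # the per-view clouds can simply be unioned and their features
            # pooled -- no cross-view correspondences are needed.
            cloud = pts.reshape(-1, 3)                      # (V * n_pts, 3)
            pooled = feats.mean(dim=(0, 1))                 # (feat_dim,)
            q = torch.cat([queries, pooled.expand(len(queries), -1)], dim=1)
            occupancy = torch.sigmoid(self.decoder(q)).squeeze(-1)  # (Q,)
            return cloud, occupancy

    # Example: two unposed views of one object, queried at 64 random points.
    net = StrobeNetSketch()
    cloud, occ = net(torch.rand(2, 3, 32, 32), torch.rand(64, 3))

Pooling a single global feature is the simplest choice that keeps the sketch correspondence-free; a faithful implementation would instead attach features locally to the canonical points so that the implicit decoder conditions on nearby geometry.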