Hi-LASSIE: High-Fidelity Articulated Shape and Skeleton Discovery from Sparse Image Ensemble

Automatically estimating 3D skeleton, shape, camera viewpoints, and part articulation from sparse in-the-wild image ensembles is a severely under-constrained and challenging problem. Most prior methods rely on large-scale image datasets, dense temporal correspondence, or human annotations such as camera pose, 2D keypoints, and shape templates. We propose Hi-LASSIE, which performs 3D articulated reconstruction from only 20-30 online images in the wild without any user-defined shape or skeleton templates. We follow the recent work of LASSIE, which tackles a similar problem setting, and make two significant advances. First, instead of relying on a manually annotated 3D skeleton, we automatically estimate a class-specific skeleton from the selected reference image. Second, we improve the shape reconstructions with novel instance-specific optimization strategies that allow reconstructions to faithfully fit each instance while preserving the class-specific priors learned across all images. Experiments on in-the-wild image ensembles show that Hi-LASSIE obtains higher-fidelity, state-of-the-art 3D reconstructions despite requiring minimal user input.
