Much progress has been made in the supervised 3D reconstruction of rigid objects from multi-view images or video. However, reconstructing severely deformed objects from a single-view RGB image in an unsupervised manner remains far more challenging. Although training-based methods, such as category-specific training, have been shown to successfully reconstruct rigid objects and slightly deformed objects such as birds from a single-view image, they cannot effectively handle severely deformed objects, nor can they be applied to some real-world downstream tasks, because the semantic meaning of the vertices that define the adopted 3D object templates becomes inconsistent. In this work, we introduce a template-based method that infers 3D shapes from a single-view image and applies the reconstructed mesh to a downstream task, i.e., absolute length measurement. Without using 3D ground truth, our method faithfully reconstructs 3D meshes and achieves state-of-the-art accuracy in a length measurement task on a severely deformed fish dataset.
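The downstream task exploits the consistent vertex semantics of the template mesh: because corresponding vertices keep the same meaning across reconstructions, absolute length can be read directly off the mesh. Below is a minimal sketch of this idea; the vertex-chain indices, the `measure_length` function, and the scale factor are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

# Hypothetical illustration: with a template mesh, a fixed chain of vertex
# indices (e.g., snout -> tail on a fish template) keeps its semantic meaning
# for every reconstruction, so length is a simple sum of segment lengths.
# The index list and scale factor below are placeholders.
SNOUT_TO_TAIL_PATH = [0, 17, 42, 88, 133]  # assumed semantic vertex chain


def measure_length(vertices: np.ndarray, scale: float = 1.0) -> float:
    """Sum Euclidean segment lengths along the semantic vertex chain.

    vertices: (V, 3) array of reconstructed mesh vertex positions.
    scale:    metric scale factor (e.g., recovered from camera calibration).
    """
    path = vertices[SNOUT_TO_TAIL_PATH]      # (K, 3) ordered keypoints
    segments = np.diff(path, axis=0)         # (K-1, 3) consecutive offsets
    return float(scale * np.linalg.norm(segments, axis=1).sum())


if __name__ == "__main__":
    verts = np.random.rand(200, 3)           # stand-in for a reconstructed mesh
    print(f"estimated length: {measure_length(verts, scale=0.35):.3f} m")
```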