Monocular 3D reconstruction aims to recover the shape and other properties of an object from a single RGB image. In 3D reconstruction, the polygon mesh, which provides detailed surface information at low computational cost, is the most prevalent representation produced by deep learning models. However, state-of-the-art schemes fail to directly generate well-structured meshes, and most generated meshes suffer from two severe problems: Vertices Clustering (VC) and Illegal Twist (IT). By diving into the mesh deformation process, we pinpoint that the inappropriate usage of the Chamfer Distance (CD) loss is the root cause of the VC and IT problems in the training of deep learning models. In this paper, we first demonstrate these two problems induced by the CD loss with visual examples and quantitative analyses. Then, we propose a fine-grained reconstruction method, CD$^2$, which employs the Chamfer distance twice to perform a plausible and adaptive deformation. Extensive experiments on two 3D datasets and comparisons with five recent schemes demonstrate that our CD$^2$ directly generates well-structured meshes and outperforms the others by alleviating the VC and IT problems.
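For reference, the standard symmetric Chamfer Distance that the abstract refers to measures, for each point in one set, the squared distance to its nearest neighbour in the other set, averaged in both directions. The sketch below is a minimal NumPy illustration of this generic formulation only; it is not the paper's CD$^2$ loss, and the function name `chamfer_distance` is chosen here for illustration.

```python
import numpy as np

def chamfer_distance(p1, p2):
    """Symmetric Chamfer Distance between two point sets.

    p1: (N, 3) array, p2: (M, 3) array.
    Returns the mean squared nearest-neighbour distance in both directions.
    """
    # Pairwise squared distances, shape (N, M).
    diff = p1[:, None, :] - p2[None, :, :]
    dist = np.sum(diff ** 2, axis=-1)
    # Average nearest-neighbour distance from p1 to p2 and from p2 to p1.
    return dist.min(axis=1).mean() + dist.min(axis=0).mean()

# Example: distance between two random point clouds.
a = np.random.rand(100, 3)
b = np.random.rand(120, 3)
print(chamfer_distance(a, b))
```

Note that this brute-force version builds an N-by-M distance matrix; practical training pipelines typically use batched GPU implementations of the same definition.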