We present To The Point (TTP), a method for reconstructing 3D objects from a single image using 2D-to-3D correspondences learned from weak supervision. We recover a 3D shape from a 2D image by first regressing the 2D positions corresponding to the 3D template vertices, and then jointly estimating a rigid camera transform and a non-rigid template deformation that optimally explain those 2D positions through the projection of the 3D shape. By relying on 3D-2D correspondences, we replace CNN-based regression of camera pose and non-rigid deformation with a simple per-sample optimization problem, and thereby obtain substantially more accurate 3D reconstructions. We treat this optimization as a differentiable layer and train the whole system in an end-to-end manner. We report systematic quantitative improvements on multiple categories and provide qualitative results comprising diverse shape, pose and texture prediction examples. Project website: https://fkokkinos.github.io/to_the_point/.
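The per-sample fitting step described above can be sketched as a small reprojection-error minimization. The following is a minimal illustration, not the paper's implementation: it assumes a weak-perspective camera and a linear deformation basis (both common modeling choices, not confirmed by the abstract), and all function and variable names here are hypothetical.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation


def reproject(params, template, basis):
    """Project a deformed template to 2D under a weak-perspective camera.

    params layout (hypothetical): [rotvec(3), scale(1), trans(2), alphas(K)]
    template: (N, 3) mean template vertices
    basis:    (K, N, 3) linear deformation basis (an assumed shape model)
    """
    rotvec, s, t = params[:3], params[3], params[4:6]
    alphas = params[6:]
    # Non-rigid deformation: template plus a weighted sum of basis shapes.
    shape = template + (basis * alphas[:, None, None]).sum(axis=0)
    R = Rotation.from_rotvec(rotvec).as_matrix()
    # Weak-perspective projection: rotate, keep x/y, scale, translate.
    return s * (shape @ R.T)[:, :2] + t


def fit_pose_and_deformation(points2d, template, basis):
    """Jointly fit camera and deformation to predicted 2D vertex positions."""
    x0 = np.zeros(6 + len(basis))
    x0[3] = 1.0  # start from identity rotation, unit scale, zero deformation
    residual = lambda p: (reproject(p, template, basis) - points2d).ravel()
    return least_squares(residual, x0).x
```

In the paper this optimization is additionally treated as a differentiable layer so gradients flow back to the 2D regressor; the sketch above only shows the forward per-sample fit.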