Humans can easily infer the underlying 3D geometry and texture of an object from only a single 2D image. Current computer vision methods can do this, too, but suffer from view generalization problems: the inferred models tend to make poor predictions of appearance in novel views. As with generalization problems in machine learning, the difficulty is balancing single-view accuracy (cf. training error; bias) with novel view accuracy (cf. test error; variance). We describe a class of models whose geometric rigidity is easily controlled to manage this tradeoff. We describe a cycle consistency loss that improves view generalization (roughly, a model inferred from a generated view should predict the original view well). View generalization of textures requires that models share texture information, so that a car seen from the back still has headlights because other cars have headlights. We describe a second cycle consistency loss that encourages model textures to be aligned, so as to encourage this sharing. We compare our method against the state-of-the-art method and show both qualitative and quantitative improvements.
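As a rough illustration of the first cycle consistency idea (not the paper's actual implementation), the sketch below assumes a hypothetical inference network `infer` that maps an image to a (mesh, texture) model and a hypothetical differentiable renderer `render` that produces an image of such a model from a given camera; both names and signatures are placeholders introduced here for exposition.

```python
# Minimal sketch of a view-generalization cycle consistency loss,
# under the assumptions stated above. `infer` and `render` are
# hypothetical callables, not an API from the paper.
import torch
import torch.nn.functional as F

def cycle_consistency_loss(image, camera, novel_camera, infer, render):
    """Roughly: a model inferred from a generated novel view
    should predict the original view well."""
    # Infer a 3D model (mesh + texture) from the input image.
    mesh, texture = infer(image)
    # Render that model from a sampled novel viewpoint.
    novel_view = render(mesh, texture, novel_camera)
    # Re-infer a model from the generated novel view.
    mesh_cycle, texture_cycle = infer(novel_view)
    # Render the re-inferred model back in the original camera
    # and compare against the original image.
    reconstruction = render(mesh_cycle, texture_cycle, camera)
    return F.l1_loss(reconstruction, image)
```

In this sketch the loss is differentiable end to end as long as `infer` and `render` are, so it can simply be added to a single-view reconstruction objective; the choice of L1 as the image comparison is an assumption, not taken from the paper.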