Inverse rendering of an object under entirely unknown capture conditions is a fundamental challenge in computer vision and graphics. Neural approaches such as NeRF have achieved photorealistic results on novel view synthesis, but they require known camera poses. Solving this problem with unknown camera poses is highly challenging, as it requires joint optimization over shape, radiance, and pose. The problem is exacerbated when the input images are captured in the wild with varying backgrounds and illuminations. Standard pose estimation techniques fail on such in-the-wild image collections because very few correspondences can be estimated across images. Furthermore, NeRF cannot relight a scene under arbitrary illumination, as it operates on radiance (the product of reflectance and illumination). We propose a joint optimization framework to estimate the shape, BRDF, and per-image camera pose and illumination. Our method works on in-the-wild online image collections of an object and produces relightable 3D assets for several use cases such as AR/VR. To our knowledge, our method is the first to tackle this severely unconstrained task with minimal user interaction.

Project page: https://markboss.me/publication/2022-samurai/
Video: https://youtu.be/LlYuGDjXp-8