Understanding the 3D world from 2D projected natural images is a fundamental challenge in computer vision and graphics. Recently, unsupervised learning approaches have garnered considerable attention owing to their advantages in data collection. However, to mitigate training limitations, typical methods need to impose assumptions on the viewpoint distribution (e.g., a dataset containing images from various viewpoints) or the object shape (e.g., symmetric objects). These assumptions often restrict applications; for instance, applying such methods to non-rigid objects or to images captured from similar viewpoints (e.g., flower or bird images) remains a challenge. To complement these approaches, we propose aperture rendering generative adversarial networks (AR-GANs), which equip aperture rendering on top of GANs and adopt focus cues to learn the depth and depth-of-field (DoF) effect of unlabeled natural images. To address the ambiguities triggered by the unsupervised setting (i.e., the ambiguity between smooth textures and out-of-focus blur, and between foreground and background blur), we develop DoF mixture learning, which enables the generator to learn the real image distribution while generating diverse DoF images. In addition, we devise a center focus prior to guide the learning direction. In the experiments, we demonstrate the effectiveness of AR-GANs on various datasets, such as flower, bird, and face images, show their portability by incorporating them into other 3D representation learning GANs, and validate their applicability to shallow DoF rendering.
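The focus cue exploited above can be illustrated with the thin-lens circle-of-confusion model: the defocus blur radius of a pixel grows with the aperture size and with the pixel's distance from the focal plane. The sketch below is a minimal NumPy illustration of this relationship, not the paper's differentiable aperture renderer; the function name and toy depth map are our own.

```python
import numpy as np

def circle_of_confusion(depth, focus_depth, aperture):
    """Thin-lens defocus cue (illustrative): the blur radius is
    proportional to the aperture size and to the pixel's distance
    from the focal plane, measured in inverse-depth (diopter) space."""
    return aperture * np.abs(1.0 / depth - 1.0 / focus_depth)

# Toy depth map: a near object (depth 1) in the center of a far
# background (depth 5).
depth = np.full((8, 8), 5.0)
depth[2:6, 2:6] = 1.0

# Focusing on the near object leaves it sharp (zero blur radius)
# and defocuses the background; widening the aperture scales the
# background blur up, which is the cue AR-GANs learn depth from.
coc_narrow = circle_of_confusion(depth, focus_depth=1.0, aperture=2.0)
coc_wide = circle_of_confusion(depth, focus_depth=1.0, aperture=4.0)
```

Here `coc_narrow` is zero on the in-focus foreground and positive on the background, and doubling the aperture doubles the background blur radius, mirroring how the DoF effect varies with aperture in shallow DoF rendering.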