Geometric camera calibration is often required for applications that need to reason about the perspective of an image. We propose Perspective Fields as a representation that models the local perspective properties of an image. Perspective Fields contain per-pixel information about the camera view, parameterized as an up-vector and a latitude value. This representation has several advantages: it makes minimal assumptions about the camera model, and it is invariant or equivariant to common image editing operations such as cropping, warping, and rotation. It is also more interpretable and better aligned with human perception. We train a neural network to predict Perspective Fields, and the predicted fields can be easily converted to calibration parameters. We demonstrate the robustness of our approach under various scenarios compared with camera calibration-based methods and show example applications in image compositing.
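To make the per-pixel parameterization concrete, the following is a minimal sketch of how an up-vector and a latitude value could be computed for every pixel of an ideal pinhole camera. It is an illustration under stated assumptions, not the method or code of this work: the function name `perspective_field`, the centered principal point, the camera axes (x right, y down, z forward), and the sign conventions for pitch and roll are all hypothetical choices made for this example.

```python
import numpy as np


def perspective_field(height, width, focal_px, pitch_rad=0.0, roll_rad=0.0):
    """Per-pixel up-vectors and latitudes for an ideal pinhole camera.

    Assumptions (illustrative only): principal point at the image center,
    camera axes x right / y down / z forward, positive pitch tilts the
    camera upward, and roll rotates the world up direction about the
    optical axis.
    """
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    u, v = np.meshgrid(np.arange(width), np.arange(height))

    # Normalized image coordinates; each pixel's ray is (x, y, 1).
    x = (u - cx) / focal_px
    y = (v - cy) / focal_px

    # World up direction expressed in the camera frame.
    up = np.array([
        np.cos(pitch_rad) * np.sin(roll_rad),
        -np.cos(pitch_rad) * np.cos(roll_rad),
        np.sin(pitch_rad),
    ])

    # Latitude: angle between each viewing ray and the horizontal plane.
    rays = np.stack([x, y, np.ones_like(x)], axis=-1)
    rays /= np.linalg.norm(rays, axis=-1, keepdims=True)
    latitude = np.arcsin(np.clip(rays @ up, -1.0, 1.0))

    # Up-vector: image-plane direction in which a point on the ray moves
    # when it is displaced along the world up direction (derivative of the
    # pinhole projection, up to a positive scale).
    up_img = np.stack([up[0] - x * up[2], up[1] - y * up[2]], axis=-1)
    norm = np.linalg.norm(up_img, axis=-1, keepdims=True)
    up_img = up_img / np.maximum(norm, 1e-12)  # guard the vanishing point

    return up_img, latitude


up_vectors, latitudes = perspective_field(5, 7, focal_px=10.0, pitch_rad=0.3)
```

Under these assumptions, the latitude at the principal point equals the camera pitch and the up-vector there encodes the roll, which hints at why a predicted field can be mapped back to calibration parameters.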