While frame-independent predictions with deep neural networks have become the dominant solution to many computer vision tasks, the potential benefits of exploiting correlations between frames have received less attention. Even though probabilistic machine learning provides the ability to encode correlation as prior knowledge for inference, there is a tangible gap between the theory and practice of applying probabilistic methods to modern vision problems. To bridge this gap, we derive a principled framework that combines information coupling between camera poses (translation and orientation) with deep models. We propose a novel view kernel that generalizes the standard periodic kernel to $\mathrm{SO}(3)$. We show how this soft-prior knowledge can aid several pose-related vision tasks, such as novel view synthesis, and can be used to predict arbitrary points in the latent space of generative models, pointing towards a range of new applications for inter-frame reasoning.
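
To make the idea of extending a periodic kernel to $\mathrm{SO}(3)$ concrete, here is a minimal sketch of one possible construction. The abstract does not define the view kernel, so the form below is an assumption and not necessarily the authors' definition: a squared-exponential kernel on the Frobenius distance between rotation matrices, $k(R, R') = \exp\left(-\operatorname{tr}(I - R^\top R') / (2\ell^2)\right)$. The function names `rot_z`, `periodic_kernel`, and `rotation_kernel` and the single length-scale parameter are hypothetical and chosen for illustration. For rotations about a single axis, this form reduces to the standard periodic kernel with period $2\pi$.

```python
import numpy as np


def rot_z(theta):
    """Rotation matrix for a rotation by `theta` radians about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])


def periodic_kernel(t1, t2, lengthscale=1.0):
    """Standard periodic kernel on angles, with period 2*pi and unit variance."""
    return np.exp(-2.0 * np.sin((t1 - t2) / 2.0) ** 2 / lengthscale ** 2)


def rotation_kernel(R1, R2, lengthscale=1.0):
    """Assumed SO(3) kernel (illustrative, not taken from the source).

    Uses tr(I - R1^T R2), which equals half the squared Frobenius
    distance between the two rotation matrices.
    """
    return np.exp(-np.trace(np.eye(3) - R1.T @ R2) / (2.0 * lengthscale ** 2))


# Compare the two kernels on a pair of rotations about the same axis.
a, b = 0.3, 1.7
k_periodic = periodic_kernel(a, b, lengthscale=0.8)
k_rotation = rotation_kernel(rot_z(a), rot_z(b), lengthscale=0.8)
print(k_periodic, k_rotation)
```

The two printed values agree because, for a relative rotation by angle $\theta$ about a fixed axis, $\operatorname{tr}(I - R^\top R') = 2(1 - \cos\theta) = 4\sin^2(\theta/2)$. A kernel of this kind could serve as a Gaussian process covariance over camera orientations, although how the authors combine it with translation and with deep models is not specified in the abstract.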