Perpetual view generation -- the task of generating long-range novel views by flying into a given image -- is a novel yet promising direction. We introduce DiffDreamer, an unsupervised framework capable of synthesizing novel views along a long camera trajectory while training solely on internet-collected images of nature scenes. We demonstrate that image-conditioned diffusion models can effectively perform long-range scene extrapolation while preserving both local and global consistency significantly better than prior GAN-based methods. Project page: https://primecai.github.io/diffdreamer .