This paper presents JAWS, an optimization-driven approach that achieves robust transfer of visual cinematic features from a reference in-the-wild video clip to a newly generated clip. To this end, we rely on an implicit neural representation (INR) to compute a clip that shares the same cinematic features as the reference clip. We propose a general formulation of the camera optimization problem in an INR that computes extrinsic and intrinsic camera parameters as well as timing. By leveraging the differentiability of neural representations, we back-propagate cinematic losses, measured by proxy estimators, through a NeRF network directly to the proposed cinematic parameters. We also introduce specific enhancements, such as guidance maps, to improve overall quality and efficiency. Results demonstrate the capacity of our system to replicate well-known camera sequences from movies, adapting the framing, camera parameters, and timing of the generated video clip to maximize similarity with the reference clip.
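To make the optimization concrete, below is a minimal sketch (not the authors' implementation) of such a loop in PyTorch. Here `toy_renderer` and `proxy_loss` are hypothetical placeholders standing in for a trained differentiable NeRF and the paper's proxy-based cinematic losses, and the optimized tensors stand in for the extrinsic, intrinsic (focal), and timing parameters named in the abstract.

```python
# Hedged sketch of back-propagating a cinematic loss through a
# differentiable renderer to camera extrinsics, intrinsics, and timing.
# All names below are illustrative assumptions, not the JAWS codebase.

import torch

def toy_renderer(pose6d, focal, t):
    """Placeholder for a differentiable NeRF render: maps camera
    parameters and a time value to an image. Any differentiable
    function of the inputs works for this sketch."""
    h = torch.tanh(pose6d.sum() + focal + t)
    return h.expand(3, 8, 8)  # fake 3x8x8 "image"

def proxy_loss(rendered, reference):
    """Placeholder cinematic loss: in JAWS this would compare features
    extracted by proxy estimators from both clips (e.g., framing)."""
    return torch.nn.functional.mse_loss(rendered, reference)

# Cinematic parameters to optimize: 6-DoF extrinsics, focal length, timing.
pose6d = torch.zeros(6, requires_grad=True)    # rotation + translation
focal = torch.tensor(1.0, requires_grad=True)  # intrinsic (focal length)
t = torch.tensor(0.5, requires_grad=True)      # time within the scene

reference = torch.rand(3, 8, 8)                # stand-in reference frame
opt = torch.optim.Adam([pose6d, focal, t], lr=1e-2)

for step in range(200):
    opt.zero_grad()
    frame = toy_renderer(pose6d, focal, t)
    loss = proxy_loss(frame, reference)
    loss.backward()  # gradients flow through the renderer to all parameters
    opt.step()
```

Because every step from camera parameters to rendered frame is differentiable, a single optimizer can jointly adjust pose, focal length, and time; the same pattern extends frame by frame to fit an entire reference clip.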