Video Exploration via Video-Specific Autoencoders

We present simple video-specific autoencoders that enable human-controllable video exploration. This includes a wide variety of analytic tasks such as (but not limited to) spatial and temporal super-resolution, spatial and temporal editing, object removal, video textures, average video exploration, and correspondence estimation within and across videos. Prior work has looked at each of these problems independently and proposed different formulations. In this work, we observe that a simple autoencoder trained (from scratch) on multiple frames of a specific video enables one to perform a large variety of video processing and editing tasks. Our tasks are enabled by two key observations: (1) latent codes learned by the autoencoder capture spatial and temporal properties of that video and (2) autoencoders can project out-of-sample inputs onto the video-specific manifold. For example, (1) interpolating latent codes enables temporal super-resolution and user-controllable video textures; (2) manifold reprojection enables spatial super-resolution, object removal, and denoising without training for any of these tasks. Importantly, a two-dimensional visualization of latent codes via principal component analysis acts as a tool for users to both visualize and intuitively control video edits. Finally, we quantitatively contrast our approach with the prior art and find that, without any supervision or task-specific knowledge, our approach can perform comparably to supervised approaches specifically trained for a task.
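The two latent-space operations named in the abstract, interpolating latent codes and viewing them in two dimensions via principal component analysis, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the helper names `interpolate_codes` and `pca_2d`, the code dimensions, and the random stand-in codes are all assumptions made for illustration. In an actual pipeline the codes would come from the trained video-specific encoder, and the interpolated codes would be passed through the decoder to produce in-between frames.

```python
import numpy as np

def interpolate_codes(z_a, z_b, num_steps):
    """Return num_steps latent codes evenly spaced strictly between z_a and z_b."""
    # Interior blend weights only, e.g. num_steps=3 gives 0.25, 0.5, 0.75.
    weights = np.linspace(0.0, 1.0, num_steps + 2)[1:-1]
    return np.stack([(1.0 - w) * z_a + w * z_b for w in weights])

def pca_2d(codes):
    """Project per-frame codes of shape (num_frames, dim) onto their top two principal components."""
    centered = codes - codes.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

# Stand-in data (assumption): random vectors in place of codes from a trained encoder.
rng = np.random.default_rng(0)
codes = rng.normal(size=(8, 16))  # 8 frames, 16-dim flattened codes (hypothetical sizes)

# Codes for three hypothetical in-between frames from frame 0 to frame 1.
between = interpolate_codes(codes[0], codes[1], num_steps=3)

# One 2D point per frame, suitable for a scatter plot of the video's trajectory.
points = pca_2d(codes)
print(between.shape, points.shape)
```

Linear interpolation is the simplest choice for the sketch; whether the decoded in-between frames look plausible depends on how well the learned codes capture the video's temporal structure, which is the first observation the abstract makes.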

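Autoencoders

An autoencoder is a type of artificial neural network used to learn efficient data encodings in an unsupervised manner. Its aim is to learn a representation (encoding) of a set of data, typically for dimensionality reduction, by training the network to ignore signal "noise". Alongside the reduction side, a reconstruction side is learned, in which the autoencoder tries to generate from the reduced encoding a representation as close as possible to its original input, hence its name. Several variants of the basic model exist, each aiming to force the learned representation of the input to take on useful properties. Autoencoders are applied effectively to many problems, from facial recognition to acquiring the semantic meaning of words.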
