HNeRV: A Hybrid Neural Representation for Videos - Zhuanzhi Papers

Video · Embedding · Content-adaptive · Representation · Decoding

April 5, 2023

HNeRV: A Hybrid Neural Representation for Videos


Hao Chen, Matt Gwilliam, Ser-Nam Lim, Abhinav Shrivastava

From arXiv; CVPR 2023. Project page: https://haochen-rye.github.io/HNeRV. Code: https://github.com/haochen-rye/HNeRV

Implicit neural representations store videos as neural networks and have performed well on vision tasks such as video compression and denoising. Taking a frame index or positional index as input, implicit representations (NeRV, E-NeRV, etc.) reconstruct video from fixed, content-agnostic embeddings. Such embeddings largely limit regression capacity and internal generalization for video interpolation. In this paper, we propose a Hybrid Neural Representation for Videos (HNeRV), in which a learnable encoder generates content-adaptive embeddings that act as the decoder input. Besides the input embedding, we introduce HNeRV blocks, which ensure that model parameters are evenly distributed across the entire network, so that higher layers (layers near the output) have more capacity to store high-resolution content and video details. With content-adaptive embeddings and a redesigned architecture, HNeRV outperforms implicit methods on video regression tasks in both reconstruction quality (+4.7 PSNR) and convergence speed (16× faster), and shows better internal generalization. As a simple and efficient video representation, HNeRV also shows decoding advantages in speed, flexibility, and deployment compared to traditional codecs (H.264, H.265) and learning-based compression methods. Finally, we explore the effectiveness of HNeRV on downstream tasks such as video compression and video inpainting. The project page is at https://haochen-rye.github.io/HNeRV, and the code is at https://github.com/haochen-rye/HNeRV
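The difference between content-agnostic and content-adaptive embeddings can be illustrated with a small toy sketch. This is not the authors' implementation: `positional_embedding` is a sinusoidal encoding of the frame index in the style of index-based methods (the `base` and `levels` values are illustrative assumptions), and `toy_content_embedding` is a hypothetical stand-in that average-pools a frame, where HNeRV would use a learnable encoder.

```python
import math


def positional_embedding(t, base=1.25, levels=4):
    """Content-agnostic embedding: depends only on the normalized frame index t.

    base and levels are illustrative assumptions, not values from the abstract.
    """
    emb = []
    for i in range(levels):
        emb.append(math.sin(base ** i * math.pi * t))
        emb.append(math.cos(base ** i * math.pi * t))
    return emb


def toy_content_embedding(frame, grid=2):
    """Hypothetical stand-in for a learnable encoder.

    Average-pools a 2D frame (list of rows) down to grid x grid values,
    so the embedding changes whenever the frame content changes.
    """
    height, width = len(frame), len(frame[0])
    block_h, block_w = height // grid, width // grid
    emb = []
    for gy in range(grid):
        for gx in range(grid):
            block = [
                frame[y][x]
                for y in range(gy * block_h, (gy + 1) * block_h)
                for x in range(gx * block_w, (gx + 1) * block_w)
            ]
            emb.append(sum(block) / len(block))
    return emb


# Two 4x4 frames taken at the same index t from two different toy "videos".
frame_a = [[0.0, 0.0, 0.0, 0.0] for _ in range(4)]
frame_b = [[1.0, 1.0, 0.0, 0.0] for _ in range(4)]
t = 0.5

# The index-based embedding never sees the frame, so it is identical for both.
pos_a = positional_embedding(t)
pos_b = positional_embedding(t)

# The content-based embedding differs because the frames differ.
content_a = toy_content_embedding(frame_a)
content_b = toy_content_embedding(frame_b)

print("positional embeddings equal:", pos_a == pos_b)
print("content embeddings equal:", content_a == content_b)
```

In this toy setting the decoder of an index-based model receives the same input for both videos and must store all content in its weights, whereas a content-based embedding already carries some information about the frame.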

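To put the reported +4.7 PSNR gain in perspective, the short sketch below converts a PSNR difference into the corresponding ratio of mean squared errors. It assumes the gain is expressed in decibels, as is standard for PSNR, and that pixel values are normalized to [0, 1]; the function names are illustrative and not taken from the HNeRV code.

```python
import math


def psnr(mse, max_val=1.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10.0 * math.log10(max_val ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks for a PSNR gain of gain_db decibels."""
    return 10.0 ** (gain_db / 10.0)


ratio = mse_ratio_for_gain(4.7)

# Sanity check: dividing the MSE by this ratio raises PSNR by 4.7 dB.
baseline_mse = 1e-3  # arbitrary illustrative value
gain = psnr(baseline_mse / ratio) - psnr(baseline_mse)

print(f"MSE reduction factor for +4.7 dB: {ratio:.2f}")
print(f"PSNR gain recovered: {gain:.2f} dB")
```

Under these assumptions, a 4.7 dB improvement corresponds to roughly a threefold reduction in mean squared reconstruction error.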
