The existing state-of-the-art method for audio-visual conditioned video prediction uses the latent codes of the audio-visual frames from a multimodal stochastic network and a frame encoder to predict the next visual frame. However, directly inferring per-pixel intensity for the next visual frame from the latent codes is extremely challenging because of the high dimensionality of the image space. To this end, we propose to decouple audio-visual conditioned video prediction into motion and appearance modeling. The first part is a multimodal motion estimation module that learns motion information as optical flow from the given audio-visual clip. The second part is a context-aware refinement module that uses the predicted optical flow to warp the current visual frame into the next visual frame and refines it based on the given audio-visual context. Experimental results show that our method achieves competitive results on existing benchmarks.
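To make the decoupling concrete, the sketch below illustrates the flow-warping step at the heart of the second module: backward-warping the current frame by a predicted optical flow field. This is a minimal PyTorch illustration, assuming standard bilinear warping via `grid_sample`; the function name `warp_with_flow` and all tensor shapes are illustrative conventions, not the authors' implementation, and the motion estimation and refinement networks themselves are omitted.

```python
# Minimal sketch of flow-based frame warping (assumed convention:
# frame is (B, C, H, W), flow is (B, 2, H, W) in pixel units, where
# flow[:, 0] is horizontal and flow[:, 1] is vertical displacement).
import torch
import torch.nn.functional as F

def warp_with_flow(frame: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Backward-warp `frame` by the predicted optical flow."""
    b, _, h, w = frame.shape
    # Base sampling grid of absolute pixel coordinates.
    ys, xs = torch.meshgrid(
        torch.arange(h, device=frame.device, dtype=frame.dtype),
        torch.arange(w, device=frame.device, dtype=frame.dtype),
        indexing="ij",
    )
    grid = torch.stack((xs, ys), dim=0).unsqueeze(0) + flow  # (B, 2, H, W)
    # Normalize coordinates to [-1, 1], as required by grid_sample.
    grid_x = 2.0 * grid[:, 0] / max(w - 1, 1) - 1.0
    grid_y = 2.0 * grid[:, 1] / max(h - 1, 1) - 1.0
    grid = torch.stack((grid_x, grid_y), dim=-1)  # (B, H, W, 2)
    return F.grid_sample(frame, grid, align_corners=True)

# Sanity check: zero flow should reproduce the input frame, after which a
# refinement network (not shown) would correct occlusions and appearance.
frame = torch.randn(1, 3, 64, 64)
flow = torch.zeros(1, 2, 64, 64)
assert torch.allclose(warp_with_flow(frame, flow), frame, atol=1e-5)
```

The design rationale mirrors the abstract: the flow field carries the motion, so the network only needs to predict a low-dimensional displacement per pixel rather than regress raw intensities, and the refinement module then fixes warping artifacts using the audio-visual context.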