SSGVS: 语义极地图形合成合成 (SSGVS: Semantic Scene Graph-to-Video Synthesis) - 专知论文

会员服务 ·

0

图 · Guidance · Extensibility · 类标记 · state-of-the-art ·

2022 年 11 月 17 日

SSGVS: Semantic Scene Graph-to-Video Synthesis

翻译：SSGVS: 语义极地图形合成合成

Yuren Cong,Jinhui Yi,Bodo Rosenhahn,Michael Ying Yang

As a natural extension of the image synthesis task, video synthesis has attracted a lot of interest recently. Many image synthesis works utilize class labels or text as guidance. However, neither labels nor text can provide explicit temporal guidance, such as when an action starts or ends. To overcome this limitation, we introduce semantic video scene graphs as input for video synthesis, as they represent the spatial and temporal relationships between objects in the scene. Since video scene graphs are usually temporally discrete annotations, we propose a video scene graph (VSG) encoder that not only encodes the existing video scene graphs but also predicts the graph representations for unlabeled frames. The VSG encoder is pre-trained with different contrastive multi-modal losses. A semantic scene graph-to-video synthesis framework (SSGVS), based on the pre-trained VSG encoder, VQ-VAE, and auto-regressive Transformer, is proposed to synthesize a video given an initial scene image and a non-fixed number of semantic scene graphs. We evaluate SSGVS and other state-of-the-art video synthesis models on the Action Genome dataset and demonstrate the positive significance of video scene graphs in video synthesis. The source code will be released.

翻译：作为图像合成任务的自然延伸,视频合成最近吸引了许多人的兴趣。许多图像合成工作使用类标签或文本作为指导。但是, 标签和文本都不能提供明确的时间指导, 如动作开始或结束时。为了克服这一限制, 我们引入了语义视频场景图作为视频合成的输入, 因为它们代表了现场天体之间的空间和时间关系。由于视频场景图通常是时间上不相连的附加说明, 我们提议了一个视频场景图( VSGG) 编码器, 不仅将现有的视频场景图解码, 而且还预测了未标框的图形示意图。 VSGS 编码器和文本的图像显示器事先经过训练, 具有不同的对比性多模式损失。一个语义图像场景图- 图像合成框架( SSGVVS), 以经过预先训练的 VSGG coder, VQ- VAE, 和自动递增变变变变变器为基础, 提议对视频场景图进行合成, 将SSGVVS 和其他状态图像源集合成模型展示。

0

相关内容

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

【深度学习视频分析/多模态学习资源大列表】

【深度学习视频分析/多模态学习资源大列表】

专知会员服务

92+阅读 · 2019年10月16日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【论文推荐】最新7篇条件随机场（CRF）相关论文—图像标注、对抗学习、端到端、注意力机制、三维人体姿态、图像分割、行为分割和识别

【论文推荐】最新7篇条件随机场（CRF）相关论文—图像标注、对抗学习、端到端、注意力机制、三维人体姿态、图像分割、行为分割和识别

专知

15+阅读 · 2018年2月13日

【推荐】全卷积语义分割综述

【推荐】全卷积语义分割综述

机器学习研究会

19+阅读 · 2017年8月31日

TGM2在活化肝星状细胞诱导肝癌细胞糖代谢重构中的作用及机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

褐藻新品种厚叶海带岩藻聚糖硫酸酯的结构分析及其对脂质代谢的影响

国家自然科学基金

0+阅读 · 2014年12月31日

基于局部不变性特征和几何结构相似性的异源遥感影像自动配准

国家自然科学基金

1+阅读 · 2013年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

膜结合血红蛋白在日本沼虾低氧应激中的作用及其调控机制

国家自然科学基金

0+阅读 · 2013年12月31日

荧光-磁双模态纳米载体装载Survivin siRNA 对胶质瘤干细胞增殖的影响及作用机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

自底向上的静态图像显著性检测

国家自然科学基金

1+阅读 · 2012年12月31日

低温基质隔离红外光谱研究硅-过渡金属氢桥键

国家自然科学基金

0+阅读 · 2012年12月31日

PPARγ拮抗Egr-1对增生性瘢痕TGF-β1促纤维化信号的作用及机制

国家自然科学基金

0+阅读 · 2012年12月31日

钙钛矿结构钒基氧化物光诱导效应研究

国家自然科学基金

0+阅读 · 2009年12月31日

USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval

Arxiv

0+阅读 · 2023年1月17日

Multi-source Pseudo-label Learning of Semantic Segmentation for the Scene Recognition of Agricultural Mobile Robots

Arxiv

0+阅读 · 2023年1月13日

Multi-Task Learning for Visual Scene Understanding

Arxiv

29+阅读 · 2022年3月28日

Scene Graph Generation: A Comprehensive Survey

Arxiv

26+阅读 · 2022年1月3日

Image-to-Image Retrieval by Learning Similarity between Scene Graphs

Arxiv

21+阅读 · 2020年12月29日

ERNIE-ViL: Knowledge Enhanced Vision-Language Representations Through Scene Graph

Arxiv

11+阅读 · 2020年7月31日

Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text

Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text

Arxiv

10+阅读 · 2020年3月31日

Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image

Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image

Arxiv

12+阅读 · 2020年2月27日

Exploring Visual Relationship for Image Captioning

Exploring Visual Relationship for Image Captioning

Arxiv

15+阅读 · 2018年9月19日

Deep Representation Learning for Domain Adaptation of Semantic Image Segmentation

Arxiv

10+阅读 · 2018年5月10日

VIP会员

文章信息

相关主题

state-of-the-art

相关VIP内容

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

【深度学习视频分析/多模态学习资源大列表】

【深度学习视频分析/多模态学习资源大列表】

专知会员服务

92+阅读 · 2019年10月16日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

从社会学实验到行为仿真：理解基于Agent的观点动力学建模思维

中英文版《GPT-5 System Card速览》报告

ACL 2025 | 大模型结构化知识提示的泛化能力研究

【普林斯顿博士论文】大型模型的高效推理

相关资讯

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【论文推荐】最新7篇条件随机场（CRF）相关论文—图像标注、对抗学习、端到端、注意力机制、三维人体姿态、图像分割、行为分割和识别

【论文推荐】最新7篇条件随机场（CRF）相关论文—图像标注、对抗学习、端到端、注意力机制、三维人体姿态、图像分割、行为分割和识别

专知

15+阅读 · 2018年2月13日

【推荐】全卷积语义分割综述

【推荐】全卷积语义分割综述

机器学习研究会

19+阅读 · 2017年8月31日

相关论文

USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval

Arxiv

0+阅读 · 2023年1月17日

Multi-source Pseudo-label Learning of Semantic Segmentation for the Scene Recognition of Agricultural Mobile Robots

Arxiv

0+阅读 · 2023年1月13日

Multi-Task Learning for Visual Scene Understanding

Arxiv

29+阅读 · 2022年3月28日

Scene Graph Generation: A Comprehensive Survey

Arxiv

26+阅读 · 2022年1月3日

Image-to-Image Retrieval by Learning Similarity between Scene Graphs

Arxiv

21+阅读 · 2020年12月29日

ERNIE-ViL: Knowledge Enhanced Vision-Language Representations Through Scene Graph

Arxiv

11+阅读 · 2020年7月31日

Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text

Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text

Arxiv

10+阅读 · 2020年3月31日

Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image

Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image

Arxiv

12+阅读 · 2020年2月27日

Exploring Visual Relationship for Image Captioning

Exploring Visual Relationship for Image Captioning

Arxiv

15+阅读 · 2018年9月19日

Deep Representation Learning for Domain Adaptation of Semantic Image Segmentation

Arxiv

10+阅读 · 2018年5月10日

相关基金

TGM2在活化肝星状细胞诱导肝癌细胞糖代谢重构中的作用及机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

褐藻新品种厚叶海带岩藻聚糖硫酸酯的结构分析及其对脂质代谢的影响

国家自然科学基金

0+阅读 · 2014年12月31日

基于局部不变性特征和几何结构相似性的异源遥感影像自动配准

国家自然科学基金

1+阅读 · 2013年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

膜结合血红蛋白在日本沼虾低氧应激中的作用及其调控机制

国家自然科学基金

0+阅读 · 2013年12月31日

荧光-磁双模态纳米载体装载Survivin siRNA 对胶质瘤干细胞增殖的影响及作用机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

自底向上的静态图像显著性检测

国家自然科学基金

1+阅读 · 2012年12月31日

低温基质隔离红外光谱研究硅-过渡金属氢桥键

国家自然科学基金

0+阅读 · 2012年12月31日

PPARγ拮抗Egr-1对增生性瘢痕TGF-β1促纤维化信号的作用及机制

国家自然科学基金

0+阅读 · 2012年12月31日

钙钛矿结构钒基氧化物光诱导效应研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员