SIMONe: 通过不受监督的视频分解查看变量、临时抽取对象表达式 (SIMONe: View-Invariant, Temporally-Abstracted Object Representations via Unsupervised Video Decomposition) - 专知论文

会员服务 ·

0

无监督 · MoDELS · 推断 · Performer · 估计/估计量 ·

2021 年 6 月 7 日

SIMONe: View-Invariant, Temporally-Abstracted Object Representations via Unsupervised Video Decomposition

翻译：SIMONe: 通过不受监督的视频分解查看变量、临时抽取对象表达式

Rishabh Kabra,Daniel Zoran,Goker Erdogan,Loic Matthey,Antonia Creswell,Matthew Botvinick,Alexander Lerchner,Christopher P. Burgess

from arxiv, Animated figures are available at https://sites.google.com/view/simone-scene-understanding/

To help agents reason about scenes in terms of their building blocks, we wish to extract the compositional structure of any given scene (in particular, the configuration and characteristics of objects comprising the scene). This problem is especially difficult when scene structure needs to be inferred while also estimating the agent's location/viewpoint, as the two variables jointly give rise to the agent's observations. We present an unsupervised variational approach to this problem. Leveraging the shared structure that exists across different scenes, our model learns to infer two sets of latent representations from RGB video input alone: a set of "object" latents, corresponding to the time-invariant, object-level contents of the scene, as well as a set of "frame" latents, corresponding to global time-varying elements such as viewpoint. This factorization of latents allows our model, SIMONe, to represent object attributes in an allocentric manner which does not depend on viewpoint. Moreover, it allows us to disentangle object dynamics and summarize their trajectories as time-abstracted, view-invariant, per-object properties. We demonstrate these capabilities, as well as the model's performance in terms of view synthesis and instance segmentation, across three procedurally generated video datasets.

翻译：为了帮助代理商从构件的角度理解场景,我们希望从任何特定场景的构成结构(特别是构成场景的物体的配置和特征)中提取出一个图案结构。当需要推断场景结构时,这一问题特别困难,同时需要估算代理商的位置/视图点,因为这两个变量共同导致代理商的观察。我们对这一问题提出了一种未经监督的变异方法。利用不同场景的共享结构,我们的模型从 RGB 视频输入中单独推断出两组潜在代表:一组“对象”潜伏,与场景的时间变量、对象级内容相对应,以及一套“框架”潜伏,与全球时间变化要素如观点相对应。这种潜伏因素使得我们的模型SIMONTE能够以不依赖视角的偏向方式代表对象属性。此外,它允许我们从 RGB 视频输入中分离出两组潜在表达的图象:一组“对象”潜伏性,与场景中的时间变量、视图、对象级、对象级内容相对,以及一系列“框架”潜值,与视角相对。我们用这些功能展示了模拟合成过程段的功能。

0

相关内容

无监督

“CVPR 2021 接受论文列表 1663篇论文都在这了

专知会员服务

32+阅读 · 2021年6月12日

最新《自监督表示学习》报告，70页ppt

最新《自监督表示学习》报告，70页ppt

专知会员服务

86+阅读 · 2020年12月22日

【CVPR2020】在线深度聚类的无监督表示学习, Online Deep Clustering for Unsupervised Representation Learning

【CVPR2020】在线深度聚类的无监督表示学习, Online Deep Clustering for Unsupervised Representation Learning

专知会员服务

69+阅读 · 2020年6月19日

【CVPR2020】自监督的深度视觉测程与在线适应，Self-Supervised Deep Visual Odometry

【CVPR2020】自监督的深度视觉测程与在线适应，Self-Supervised Deep Visual Odometry

专知会员服务

32+阅读 · 2020年5月14日

【CVPR2020】通过潦草注释的弱监督显著目标检测，Weakly-Supervised Salient Object Detection via Scribble Annotations

【CVPR2020】通过潦草注释的弱监督显著目标检测，Weakly-Supervised Salient Object Detection via Scribble Annotations

专知会员服务

39+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

【CVPR2020】从未标记的视频中学习视频对象分割，Learning Video Object Segmentation from Unlabeled Videos

【CVPR2020】从未标记的视频中学习视频对象分割，Learning Video Object Segmentation from Unlabeled Videos

专知会员服务

36+阅读 · 2020年3月12日

【Google】视频诱导视觉不变性的自监督学习（Self-Supervised Learning of Video-Induced Visual Invariances），谷歌博士后研究员| Michael Tschannen等

【Google】视频诱导视觉不变性的自监督学习（Self-Supervised Learning of Video-Induced Visual Invariances），谷歌博士后研究员| Michael Tschannen等

专知会员服务

12+阅读 · 2019年12月8日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知

133+阅读 · 2020年3月18日

已删除

将门创投

3+阅读 · 2019年6月12日

无人机视觉挑战赛 | ICCV 2019 Workshop—VisDrone2019

无人机视觉挑战赛 | ICCV 2019 Workshop—VisDrone2019

PaperWeekly

7+阅读 · 2019年5月5日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

MoCoGAN 分解运动和内容的视频生成

MoCoGAN 分解运动和内容的视频生成

CreateAMind

18+阅读 · 2017年10月21日

【ICCV 2017论文集】计算机视觉顶级会议ICCV2017 Open Access Repository

【ICCV 2017论文集】计算机视觉顶级会议ICCV2017 Open Access Repository

专知

6+阅读 · 2017年10月14日

Visual Object Recognition in Indoor Environments Using Topologically Persistent Features

Arxiv

0+阅读 · 2021年7月28日

A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning

Arxiv

11+阅读 · 2021年4月29日

Self-supervised Video Representation Learning by Context and Motion Decoupling

Arxiv

6+阅读 · 2021年4月2日

Rethinking Self-supervised Correspondence Learning: A Video Frame-level Similarity Perspective

Arxiv

4+阅读 · 2021年3月31日

Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion

Arxiv

4+阅读 · 2020年12月4日

Evolving Losses for Unsupervised Video Representation Learning

Arxiv

23+阅读 · 2020年2月26日

GraphTER: Unsupervised Learning of Graph Transformation Equivariant Representations via Auto-Encoding Node-wise Transformations

GraphTER: Unsupervised Learning of Graph Transformation Equivariant Representations via Auto-Encoding Node-wise Transformations

Arxiv

7+阅读 · 2019年11月19日

Physical Primitive Decomposition

Physical Primitive Decomposition

Arxiv

4+阅读 · 2018年9月13日

End-to-end Active Object Tracking via Reinforcement Learning

Arxiv

3+阅读 · 2018年6月1日

Saliency-Enhanced Robust Visual Tracking

Arxiv

6+阅读 · 2018年2月8日

VIP会员

文章信息

相关主题

估计/估计量

相关VIP内容

“CVPR 2021 接受论文列表 1663篇论文都在这了

专知会员服务

32+阅读 · 2021年6月12日

最新《自监督表示学习》报告，70页ppt

最新《自监督表示学习》报告，70页ppt

专知会员服务

86+阅读 · 2020年12月22日

【CVPR2020】在线深度聚类的无监督表示学习, Online Deep Clustering for Unsupervised Representation Learning

【CVPR2020】在线深度聚类的无监督表示学习, Online Deep Clustering for Unsupervised Representation Learning

专知会员服务

69+阅读 · 2020年6月19日

【CVPR2020】自监督的深度视觉测程与在线适应，Self-Supervised Deep Visual Odometry

【CVPR2020】自监督的深度视觉测程与在线适应，Self-Supervised Deep Visual Odometry

专知会员服务

32+阅读 · 2020年5月14日

【CVPR2020】通过潦草注释的弱监督显著目标检测，Weakly-Supervised Salient Object Detection via Scribble Annotations

【CVPR2020】通过潦草注释的弱监督显著目标检测，Weakly-Supervised Salient Object Detection via Scribble Annotations

专知会员服务

39+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

【CVPR2020】从未标记的视频中学习视频对象分割，Learning Video Object Segmentation from Unlabeled Videos

【CVPR2020】从未标记的视频中学习视频对象分割，Learning Video Object Segmentation from Unlabeled Videos

专知会员服务

36+阅读 · 2020年3月12日

【Google】视频诱导视觉不变性的自监督学习（Self-Supervised Learning of Video-Induced Visual Invariances），谷歌博士后研究员| Michael Tschannen等

【Google】视频诱导视觉不变性的自监督学习（Self-Supervised Learning of Video-Induced Visual Invariances），谷歌博士后研究员| Michael Tschannen等

专知会员服务

12+阅读 · 2019年12月8日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《美陆军特种作战条令》最新102页

《洛克希德SR-71“黑鸟”侦察机动力系统》21页slides

美空军作战实验室通过人工智能和指挥控制技术创新推进杀伤链

《指挥控制能力分析方法论》最新报告

相关资讯

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知

133+阅读 · 2020年3月18日

已删除

将门创投

3+阅读 · 2019年6月12日

无人机视觉挑战赛 | ICCV 2019 Workshop—VisDrone2019

无人机视觉挑战赛 | ICCV 2019 Workshop—VisDrone2019

PaperWeekly

7+阅读 · 2019年5月5日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

MoCoGAN 分解运动和内容的视频生成

MoCoGAN 分解运动和内容的视频生成

CreateAMind

18+阅读 · 2017年10月21日

【ICCV 2017论文集】计算机视觉顶级会议ICCV2017 Open Access Repository

【ICCV 2017论文集】计算机视觉顶级会议ICCV2017 Open Access Repository

专知

6+阅读 · 2017年10月14日

相关论文

Visual Object Recognition in Indoor Environments Using Topologically Persistent Features

Arxiv

0+阅读 · 2021年7月28日

A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning

Arxiv

11+阅读 · 2021年4月29日

Self-supervised Video Representation Learning by Context and Motion Decoupling

Arxiv

6+阅读 · 2021年4月2日

Rethinking Self-supervised Correspondence Learning: A Video Frame-level Similarity Perspective

Arxiv

4+阅读 · 2021年3月31日

Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion

Arxiv

4+阅读 · 2020年12月4日

Evolving Losses for Unsupervised Video Representation Learning

Arxiv

23+阅读 · 2020年2月26日

GraphTER: Unsupervised Learning of Graph Transformation Equivariant Representations via Auto-Encoding Node-wise Transformations

GraphTER: Unsupervised Learning of Graph Transformation Equivariant Representations via Auto-Encoding Node-wise Transformations

Arxiv

7+阅读 · 2019年11月19日

Physical Primitive Decomposition

Physical Primitive Decomposition

Arxiv

4+阅读 · 2018年9月13日

End-to-end Active Object Tracking via Reinforcement Learning

Arxiv

3+阅读 · 2018年6月1日

Saliency-Enhanced Robust Visual Tracking

Arxiv

6+阅读 · 2018年2月8日

微信扫码咨询专知VIP会员