从未见视图点的视频中确认行动 (Recognizing Actions in Videos from Unseen Viewpoints) - 专知论文

会员服务 ·

0

训练数据 · 视频分类 · 卷积 · MoDELS · 学成 ·

2021 年 3 月 30 日

Recognizing Actions in Videos from Unseen Viewpoints

翻译：从未见视图点的视频中确认行动

AJ Piergiovanni,Michael S. Ryoo

Standard methods for video recognition use large CNNs designed to capture spatio-temporal data. However, training these models requires a large amount of labeled training data, containing a wide variety of actions, scenes, settings and camera viewpoints. In this paper, we show that current convolutional neural network models are unable to recognize actions from camera viewpoints not present in their training data (i.e., unseen view action recognition). To address this, we develop approaches based on 3D representations and introduce a new geometric convolutional layer that can learn viewpoint invariant representations. Further, we introduce a new, challenging dataset for unseen view recognition and show the approaches ability to learn viewpoint invariant representations.

翻译：视频识别的标准方法使用大型CNN, 旨在捕捉时空数据。但是,培训这些模型需要大量的标签培训数据,包含各种各样的行动、场景、设置和相机观点。在本文中,我们表明,当前的进化神经网络模型无法从培训数据中不存在的相机观点(即无形视图动作识别)中识别行动。为了解决这个问题,我们制定了基于3D表示法的方法,并引入了一个新的几何相向层,可以学习变量表达法中的观点。此外,我们引入了一套新的、具有挑战性的无形视图识别数据集,并展示了学习变量表达法的能力。

0

相关内容

训练数据

最新《自监督表示学习》报告，70页ppt

最新《自监督表示学习》报告，70页ppt

专知会员服务

86+阅读 · 2020年12月22日

【文献综述】Text Detection and Recognition in the Wild: A Review 自然文本检测与识别

【文献综述】Text Detection and Recognition in the Wild: A Review 自然文本检测与识别

专知会员服务

46+阅读 · 2020年6月11日

【CVPR2020】视频符号语言识别中跨领域知识的传递, Transferring Cross-domain Knowledge for Video Sign Language Recognition

【CVPR2020】视频符号语言识别中跨领域知识的传递, Transferring Cross-domain Knowledge for Video Sign Language Recognition

专知会员服务

9+阅读 · 2020年4月17日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【ICCV 2019 Tutorial】Holistic 3D Reconstruction: Learning to Reconstruct Holistic 3D Structures from Sensorial Data（整体3D重建：学习从感官数据重建整体3D结构），宾夕法尼亚州立大学 Zihan Zhou，西蒙弗雷泽大学计算机科学系 Yasutaka Furukawa，UCB 马毅

【ICCV 2019 Tutorial】Holistic 3D Reconstruction: Learning to Reconstruct Holistic 3D Structures from Sensorial Data（整体3D重建：学习从感官数据重建整体3D结构），宾夕法尼亚州立大学 Zihan Zhou，西蒙弗雷泽大学计算机科学系 Yasutaka Furukawa，UCB 马毅

专知会员服务

29+阅读 · 2019年10月31日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【视频中的零样本动作识别：综述】Zero-Shot Action Recognition in Videos: A Survey

【视频中的零样本动作识别：综述】Zero-Shot Action Recognition in Videos: A Survey

专知会员服务

39+阅读 · 2019年10月12日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

已删除

将门创投

6+阅读 · 2019年9月3日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

Synthetic Humans for Action Recognition from Unseen Viewpoints

Arxiv

0+阅读 · 2021年5月23日

M4Depth: A motion-based approach for monocular depth estimation on video sequences

Arxiv

0+阅读 · 2021年5月21日

Visual Object Recognition in Indoor Environments Using Topologically Persistent Features

Arxiv

0+阅读 · 2021年5月20日

Egocentric Activity Recognition and Localization on a 3D Map

Arxiv

0+阅读 · 2021年5月20日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

Video-to-Video Synthesis

Video-to-Video Synthesis

Arxiv

9+阅读 · 2018年8月20日

Learning Human Pose Models from Synthesized Data for Robust RGB-D Action Recognition

Arxiv

3+阅读 · 2018年5月1日

Fine-grained Activity Recognition in Baseball Videos

Arxiv

6+阅读 · 2018年4月9日

Learning Representative Temporal Features for Action Recognition

Arxiv

4+阅读 · 2018年3月14日

Comparative Study of ECO and CFNet Trackers in Noisy Environment

Arxiv

5+阅读 · 2018年1月29日

VIP会员

文章信息

相关主题

相关VIP内容

最新《自监督表示学习》报告，70页ppt

最新《自监督表示学习》报告，70页ppt

专知会员服务

86+阅读 · 2020年12月22日

【文献综述】Text Detection and Recognition in the Wild: A Review 自然文本检测与识别

【文献综述】Text Detection and Recognition in the Wild: A Review 自然文本检测与识别

专知会员服务

46+阅读 · 2020年6月11日

【CVPR2020】视频符号语言识别中跨领域知识的传递, Transferring Cross-domain Knowledge for Video Sign Language Recognition

【CVPR2020】视频符号语言识别中跨领域知识的传递, Transferring Cross-domain Knowledge for Video Sign Language Recognition

专知会员服务

9+阅读 · 2020年4月17日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【ICCV 2019 Tutorial】Holistic 3D Reconstruction: Learning to Reconstruct Holistic 3D Structures from Sensorial Data（整体3D重建：学习从感官数据重建整体3D结构），宾夕法尼亚州立大学 Zihan Zhou，西蒙弗雷泽大学计算机科学系 Yasutaka Furukawa，UCB 马毅

【ICCV 2019 Tutorial】Holistic 3D Reconstruction: Learning to Reconstruct Holistic 3D Structures from Sensorial Data（整体3D重建：学习从感官数据重建整体3D结构），宾夕法尼亚州立大学 Zihan Zhou，西蒙弗雷泽大学计算机科学系 Yasutaka Furukawa，UCB 马毅

专知会员服务

29+阅读 · 2019年10月31日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【视频中的零样本动作识别：综述】Zero-Shot Action Recognition in Videos: A Survey

【视频中的零样本动作识别：综述】Zero-Shot Action Recognition in Videos: A Survey

专知会员服务

39+阅读 · 2019年10月12日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【NeurIPS2025】面向时间序列基础模型的合成序列符号数据生成方法

军事通信市场七大趋势概述

【CMU博士论文】深度学习中泛化的量化、理解与改进

面向低光照图像增强的扩散模型

相关资讯

已删除

将门创投

6+阅读 · 2019年9月3日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

相关论文

Synthetic Humans for Action Recognition from Unseen Viewpoints

Arxiv

0+阅读 · 2021年5月23日

M4Depth: A motion-based approach for monocular depth estimation on video sequences

Arxiv

0+阅读 · 2021年5月21日

Visual Object Recognition in Indoor Environments Using Topologically Persistent Features

Arxiv

0+阅读 · 2021年5月20日

Egocentric Activity Recognition and Localization on a 3D Map

Arxiv

0+阅读 · 2021年5月20日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

Video-to-Video Synthesis

Video-to-Video Synthesis

Arxiv

9+阅读 · 2018年8月20日

Learning Human Pose Models from Synthesized Data for Robust RGB-D Action Recognition

Arxiv

3+阅读 · 2018年5月1日

Fine-grained Activity Recognition in Baseball Videos

Arxiv

6+阅读 · 2018年4月9日

Learning Representative Temporal Features for Action Recognition

Arxiv

4+阅读 · 2018年3月14日

Comparative Study of ECO and CFNet Trackers in Noisy Environment

Arxiv

5+阅读 · 2018年1月29日

微信扫码咨询专知VIP会员