Video Action Understanding - 专知 (Zhuanzhi) Papers

October 3, 2021

Video Action Understanding

Matthew Hutchinson, Vijay Gadepally

From arXiv. Accepted for publication in IEEE Access.

Many believe that the successes of deep learning on image understanding problems can be replicated in the realm of video understanding. However, due to the scale and temporal nature of video, the span of video understanding problems and the set of proposed deep learning solutions are arguably wider and more diverse than those of their 2D image siblings. Finding, identifying, and predicting actions are a few of the most salient tasks in this emerging and rapidly evolving field. With a pedagogical emphasis, this tutorial introduces and systematizes fundamental topics, basic concepts, and notable examples in supervised video action understanding. Specifically, we clarify a taxonomy of action problems, catalog and highlight video datasets, describe common video data preparation methods, present the building blocks of state-of-the-art deep learning model architectures, and formalize domain-specific metrics to baseline proposed solutions. This tutorial is intended to be accessible to a general computer science audience and assumes a conceptual understanding of supervised learning.
