噪音行动2M:一个多媒体数据集,用于从噪音标签获取视频理解 (NoisyActions2M: A Multimedia Dataset for Video Understanding from Noisy Labels)

Deep learning has shown remarkable progress in a wide range of problems. However, efficient training of such models requires large-scale datasets, and getting annotations for such datasets can be challenging and costly. In this work, we explore the use of user-generated freely available labels from web videos for video understanding. We create a benchmark dataset consisting of around 2 million videos with associated user-generated annotations and other meta information. We utilize the collected dataset for action classification and demonstrate its usefulness with existing small-scale annotated datasets, UCF101 and HMDB51. We study different loss functions and two pretraining strategies, simple and self-supervised learning. We also show how a network pretrained on the proposed dataset can help against video corruption and label noise in downstream datasets. We present this as a benchmark dataset in noisy learning for video understanding. The dataset, code, and trained models will be publicly available for future research.

翻译：深层学习显示,在一系列广泛问题上取得了显著进展,然而,高效培训此类模型需要大规模数据集,获取此类数据集的说明可能具有挑战性和成本。在这项工作中,我们探索如何使用用户从网络视频中自由生成的标签,以了解视频。我们创建了一个由约200万个视频组成的基准数据集,其中含有相关用户生成的注释和其他元信息。我们利用所收集的数据集进行行动分类,并通过现有的小规模附加说明数据集(UCF101和HMDB51)展示其有用性。我们研究了不同的损失功能和两个预培训策略,简单和自监督的学习。我们还展示了在拟议数据集上预先培训的网络如何有助于防止视频腐败和下游数据集中的标签噪音。我们将此作为为了解视频而进行吵闹学习的基准数据集。数据集、代码和经过培训的模型将公开供今后研究使用。

相关内容

数据集

关注 88

数据集，又称为资料集、数据集合或资料集合，是一种由数据所组成的集合。
Data set（或dataset）是一个数据的集合，通常以表格形式出现。每一列代表一个特定变量。每一行都对应于某一成员的数据集的问题。它列出的价值观为每一个变量，如身高和体重的一个物体或价值的随机数。每个数值被称为数据资料。对应于行数，该数据集的数据可能包括一个或多个成员。

最新《自监督表示学习》报告，70页ppt

专知会员服务

86+阅读 · 2020年12月22日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

45+阅读 · 2020年10月31日

神经常微分方程教程，50页ppt，A brief tutorial on Neural ODEs

专知会员服务

74+阅读 · 2020年8月2日

【视频预测深度学习综述论文】A Review on Deep Learning Techniques for Video Prediction

专知会员服务

52+阅读 · 2020年4月15日