Global Prototype Encoding for Incremental Video Highlights Detection

Video highlights detection is a long-studied computer vision task that extracts user-appealing clips from raw video inputs. However, most mainstream methods in this line of research are built on the closed-world assumption: a fixed number of highlight categories is defined in advance, and all training data must be available at the same time. As a result, they scale poorly with respect to both the number of highlight categories and the size of the dataset. To tackle this problem, we propose a video highlights detector that is able to learn incrementally, namely \textbf{G}lobal \textbf{P}rototype \textbf{E}ncoding (GPE), which captures newly defined video highlights in the extended dataset via their corresponding prototypes. Alongside, we present a well-annotated and costly dataset termed \emph{ByteFood}, comprising more than 5.1k gourmet videos belonging to four domains: \emph{cooking}, \emph{eating}, \emph{food material}, and \emph{presentation}. To the best of our knowledge, this is the first time the incremental learning setting has been introduced to video highlights detection, which in turn relieves the burden of training video inputs and improves the scalability of conventional neural networks with respect to both the size of the dataset and the number of domains. Moreover, the proposed GPE surpasses current incremental learning methods on \emph{ByteFood}, reporting an improvement of at least 1.57\% mAP. The code and dataset will be made available soon.
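
As a rough illustration of the general prototype idea only, the sketch below keeps one prototype per highlight category and adds new categories without revisiting earlier data. It is not the GPE method: the abstract does not say how prototypes are learned or encoded, so the mean-of-features prototype, the squared Euclidean distance, the toy 2-D features, and the `PrototypeStore` class with its method names are all assumptions made for this example.

```python
from math import fsum


class PrototypeStore:
    """Toy nearest-prototype classifier that grows one category at a time.

    Hypothetical helper for illustration; not part of the GPE code base.
    """

    def __init__(self):
        self.prototypes = {}  # category name -> mean feature vector

    def add_category(self, name, features):
        # Assumption: the prototype is the mean of the new category's features.
        # Prototypes of earlier categories are left untouched.
        dim = len(features[0])
        self.prototypes[name] = [
            fsum(f[i] for f in features) / len(features) for i in range(dim)
        ]

    def classify(self, feature):
        # Return the category whose prototype is nearest (squared Euclidean).
        def sq_dist(proto):
            return fsum((a - b) ** 2 for a, b in zip(feature, proto))

        return min(self.prototypes, key=lambda name: sq_dist(self.prototypes[name]))


store = PrototypeStore()
store.add_category("cooking", [[1.0, 0.0], [0.8, 0.2]])
store.add_category("eating", [[0.0, 1.0], [0.2, 0.8]])
# A later stage introduces a new domain without needing the earlier data.
store.add_category("presentation", [[1.0, 1.0], [0.9, 1.1]])
label = store.classify([0.95, 0.9])
```

In this toy setup, the query feature lies closest to the prototype added last, so a category introduced in a later stage is usable immediately while the earlier prototypes stay unchanged.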

