Existing video summarization approaches mainly concentrate on the sequential or structural characteristics of video data, but pay too little attention to the video summarization task itself. In this paper, we propose a meta-learning method for task-driven video summarization, denoted MetaL-TDVS, which explicitly explores the summarization mechanism shared across the processes of summarizing different videos. Specifically, MetaL-TDVS reformulates video summarization as a meta-learning problem in order to uncover the latent mechanism of summarizing a video and to improve the generalization ability of the trained model. It treats summarizing each video as a single task, so that the experience and knowledge gained from summarizing other videos can be better used to summarize new ones. Furthermore, MetaL-TDVS updates the model via a two-fold back-propagation that, in every training step, forces the model optimized on one video to also achieve high accuracy on another video. Extensive experiments on benchmark datasets demonstrate the superiority and better generalization ability of MetaL-TDVS over several state-of-the-art methods.
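The two-fold update described above can be illustrated with a minimal sketch. It is not the MetaL-TDVS implementation: it assumes a linear frame-importance scorer, a mean-squared-error loss, and synthetic data, and every name in it (`meta_gradient`, `make_videos`, `train`, `inner_lr`, `meta_lr`) is hypothetical. The first pass adapts the weights on one video; the second pass evaluates the adapted weights on another video and back-propagates through the first pass to update the original weights.

```python
import numpy as np


def mse_loss(w, X, y):
    """Mean squared error between predicted and annotated frame scores."""
    return float(np.mean((X @ w - y) ** 2))


def mse_grad(w, X, y):
    """Gradient of mse_loss with respect to w."""
    return 2.0 * X.T @ (X @ w - y) / len(y)


def mse_hessian(X):
    """Hessian of mse_loss with respect to w (constant for a linear scorer)."""
    return 2.0 * X.T @ X / X.shape[0]


def meta_gradient(w, video_a, video_b, inner_lr):
    """Gradient of the loss on video_b after one adaptation step on video_a."""
    Xa, ya = video_a
    Xb, yb = video_b
    # First pass: adapt the weights on video A.
    w_adapted = w - inner_lr * mse_grad(w, Xa, ya)
    # Second pass: evaluate the adapted weights on video B ...
    g_b = mse_grad(w_adapted, Xb, yb)
    # ... and apply the chain rule through the first pass:
    # d w_adapted / d w = I - inner_lr * H_A, with H_A symmetric.
    grad = g_b - inner_lr * mse_hessian(Xa) @ g_b
    return grad, mse_loss(w_adapted, Xb, yb)


def make_videos(num_videos=8, num_frames=30, dim=4, seed=0):
    """Synthetic stand-in for videos: frame features X and frame scores y."""
    rng = np.random.default_rng(seed)
    shared = np.linspace(1.0, 2.0, dim)  # assumed shared scoring mechanism
    videos = []
    for _ in range(num_videos):
        X = rng.normal(size=(num_frames, dim))
        w_video = shared + 0.1 * rng.normal(size=dim)  # per-video variation
        videos.append((X, X @ w_video))
    return videos


def train(videos, steps=300, inner_lr=0.05, meta_lr=0.05, seed=1):
    """Each step samples two distinct videos and applies the two-fold update."""
    rng = np.random.default_rng(seed)
    w = np.zeros(videos[0][0].shape[1])
    losses = []
    for _ in range(steps):
        i, j = rng.choice(len(videos), size=2, replace=False)
        grad, loss = meta_gradient(w, videos[i], videos[j], inner_lr)
        w = w - meta_lr * grad
        losses.append(loss)
    return w, losses


if __name__ == "__main__":
    w, losses = train(make_videos())
    print("loss on the held-out video, first step:", losses[0])
    print("loss on the held-out video, last step:", losses[-1])
```

Because the scorer is linear, the second-order term can be written in closed form. With a deep summarizer, an automatic differentiation framework would compute the same quantity by differentiating through the inner update.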