In this paper, we propose a video summarization method using temporal interest detection and key frame prediction for supervised video summarization, in which video summarization is formulated as a combination of sequence labeling and temporal interest detection problems. We first build a flexible, universal network framework that simultaneously predicts frame-level importance scores and temporal interest segments, and then combine the two components with different weights to obtain a more detailed video summary. Extensive experiments and analysis on two benchmark datasets demonstrate the effectiveness of our method. Specifically, compared with other state-of-the-art methods, its performance is higher by at least 2.6% and 4.2% on TVSum and SumMe, respectively.
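As an illustration of the weighted combination described above, the following is a minimal sketch, not the network or fusion rule used in the paper. It assumes that frame-level importance scores and detected temporal interest segments are already available, and it fuses them with a simple weighted sum. The function name `fuse_scores`, the `(start, end, confidence)` segment layout, and the weights `w_frame` and `w_segment` are all hypothetical choices made for this example.

```python
from typing import List, Tuple

# Hypothetical segment layout: (start_frame, end_frame_exclusive, confidence).
Segment = Tuple[int, int, float]


def fuse_scores(frame_scores: List[float],
                segments: List[Segment],
                w_frame: float = 0.5,
                w_segment: float = 0.5) -> List[float]:
    """Weighted sum of each frame's score and its best covering segment score.

    This fusion rule is an assumption for illustration only.
    """
    n = len(frame_scores)
    # Each frame inherits the highest confidence among segments that cover it;
    # frames outside every segment keep a segment score of 0.0.
    segment_scores = [0.0] * n
    for start, end, conf in segments:
        for t in range(max(start, 0), min(end, n)):
            segment_scores[t] = max(segment_scores[t], conf)
    return [w_frame * f + w_segment * s
            for f, s in zip(frame_scores, segment_scores)]


# Toy input: four frames and one interest segment covering frames 1 and 2.
frame_scores = [0.2, 0.8, 0.4, 0.6]
segments = [(1, 3, 1.0)]
fused = fuse_scores(frame_scores, segments, w_frame=0.5, w_segment=0.5)
```

In this toy input, frames 1 and 2 receive a boost from the covering segment, while frames 0 and 3 keep only their weighted frame-level score. Changing `w_frame` and `w_segment` shifts the summary toward isolated key frames or toward contiguous segments.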