A Large-scale Comprehensive Dataset and Copy-overlap Aware Evaluation Protocol for Segment-level Video Copy Detection

Sifeng He, Xudong Yang, Chen Jiang, Gang Liang, Wei Zhang, Tan Pan, Qing Wang, Furong Xu, Chunguang Li, Jingxiong Liu, Hui Xu, Kaiming Huang, Yuan Cheng, Feng Qian, Xiaobo Zhang, Lei Yang

From arXiv. Accepted by CVPR 2022. Code is publicly available at https://github.com/alipay/VCSL

In this paper, we introduce VCSL (Video Copy Segment Localization), a new comprehensive segment-level annotated video copy dataset. Existing copy detection datasets are limited either by video-level-only annotation or by small scale. In comparison, VCSL has two orders of magnitude more segment-level labelled data, with 160k realistic video copy pairs containing more than 280k localized copied segment pairs, and it also covers a variety of video categories and a wide range of video durations. All the copied segments inside each collected video pair are manually extracted and accompanied by precisely annotated starting and ending timestamps. Alongside the dataset, we also propose a novel evaluation protocol that better measures the prediction accuracy of copy overlapping segments between a video pair and shows improved adaptability in different scenarios. By benchmarking several baseline and state-of-the-art segment-level video copy detection methods with the proposed dataset and evaluation metric, we provide a comprehensive analysis that uncovers the strengths and weaknesses of current approaches, hoping to open up promising directions for future work. The VCSL dataset, metric and benchmark code are all publicly available at https://github.com/alipay/VCSL.
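
To make the idea of segment-level annotation concrete, here is a minimal sketch of how one copied segment pair (a time span in one video matched to a time span in the other) might be represented, together with a generic overlap score between a predicted pair and an annotated pair. This is an illustration only: the names `SegmentPair`, `interval_overlap` and `box_iou` are hypothetical, and the plain intersection-over-union used here is a common baseline measure, not the evaluation protocol proposed in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentPair:
    """Hypothetical record: a span in video A copied to a span in video B (seconds)."""
    a_start: float
    a_end: float
    b_start: float
    b_end: float


def interval_overlap(s1: float, e1: float, s2: float, e2: float) -> float:
    """Length of the intersection of [s1, e1] and [s2, e2], or 0 if disjoint."""
    return max(0.0, min(e1, e2) - max(s1, s2))


def box_iou(pred: SegmentPair, gt: SegmentPair) -> float:
    """Generic IoU, treating each pair as a rectangle in the (time A, time B) plane.

    Assumption: this is NOT the paper's metric, only a familiar overlap measure.
    """
    inter = (interval_overlap(pred.a_start, pred.a_end, gt.a_start, gt.a_end)
             * interval_overlap(pred.b_start, pred.b_end, gt.b_start, gt.b_end))
    area_pred = (pred.a_end - pred.a_start) * (pred.b_end - pred.b_start)
    area_gt = (gt.a_end - gt.a_start) * (gt.b_end - gt.b_start)
    union = area_pred + area_gt - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    # Made-up timestamps, purely for illustration.
    annotated = SegmentPair(a_start=10.0, a_end=20.0, b_start=30.0, b_end=40.0)
    predicted = SegmentPair(a_start=15.0, a_end=25.0, b_start=35.0, b_end=45.0)
    print(f"IoU = {box_iou(predicted, annotated):.4f}")
```

A score of 1.0 means the predicted pair matches the annotation exactly, and 0.0 means the two do not overlap on at least one of the two timelines.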
