In this paper, we introduce VCSL (Video Copy Segment Localization), a new, comprehensive video copy dataset with segment-level annotations. Existing copy detection datasets are limited either by video-level annotation or by small scale. In contrast, VCSL contains two orders of magnitude more segment-level labelled data, with 160k realistic video copy pairs containing more than 280k localized copied segment pairs, and it covers a variety of video categories and a wide range of video durations. All copied segments within each collected video pair are manually extracted and annotated with precise starting and ending timestamps. Alongside the dataset, we propose a novel evaluation protocol that better measures the prediction accuracy of overlapping copied segments between a video pair and adapts better to different scenarios. By benchmarking several baseline and state-of-the-art segment-level video copy detection methods with the proposed dataset and evaluation metric, we provide a comprehensive analysis that uncovers the strengths and weaknesses of current approaches, and we hope it opens up promising directions for future work. The VCSL dataset, metric, and benchmark code are all publicly available at https://github.com/alipay/VCSL.