Generic Event Boundary Detection: A Benchmark for Event Segmentation

This paper presents a novel task together with a new benchmark for detecting generic, taxonomy-free event boundaries that segment a whole video into chunks. Conventional work in temporal video segmentation and action detection focuses on localizing pre-defined action categories and thus does not scale to generic videos. Cognitive science has known since the last century that humans consistently segment videos into meaningful temporal chunks. This segmentation happens naturally, without pre-defined event categories and without explicit instruction to segment. Here, we repeat these cognitive experiments on mainstream CV datasets. Using a novel annotation guideline that addresses the complexities of taxonomy-free event boundary annotation, we introduce the task of Generic Event Boundary Detection (GEBD) and the new benchmark Kinetics-GEBD. Kinetics-GEBD has the largest number of boundaries (e.g. 32x that of ActivityNet, 8x that of EPIC-Kitchens-100), and these boundaries are in-the-wild, taxonomy-free, cover generic event change, and respect the diversity of human perception. We view GEBD as an important stepping stone towards understanding the video as a whole, and believe it has previously been neglected for lack of a proper task definition and annotations. Through experiments and a human study we demonstrate the value of the annotations. Further, we benchmark supervised and unsupervised GEBD approaches on the TAPOS dataset and on Kinetics-GEBD, together with method design explorations that suggest future directions. We release our annotations and baseline code at the CVPR'21 LOVEU Challenge: https://sites.google.com/view/loveucvpr21.

