Generic Event Boundary Detection (GEBD) is a newly introduced task that aims to detect "general" event boundaries corresponding to natural human perception. In this paper, we introduce a novel contrastive-learning-based approach to GEBD. Our intuition is that the feature similarity between video snippets varies significantly near event boundaries while remaining relatively constant in the rest of the video. In our model, a Temporal Self-similarity Matrix (TSM) is used as an intermediate representation that acts as an information bottleneck. With our model, we achieve a significant performance boost over the given baselines. Our code is available at https://github.com/hello-jinwoo/LOVEU-CVPR2021.
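To make the intuition about the Temporal Self-similarity Matrix concrete, the following is a minimal sketch, not the authors' implementation. It assumes cosine similarity as the similarity measure and uses synthetic per-snippet features; the abstract does not specify the similarity function, the encoder, or the contrastive objective. The function names `temporal_self_similarity` and `boundary_scores` are hypothetical, and the adjacent-snippet dissimilarity heuristic is only an illustration of why similarity changes near a boundary.

```python
import numpy as np


def temporal_self_similarity(features):
    """Return a (T, T) cosine-similarity matrix for (T, D) snippet features.

    Assumption: cosine similarity; the source does not name the measure.
    """
    features = np.asarray(features, dtype=np.float64)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)  # guard against zero vectors
    return unit @ unit.T


def boundary_scores(tsm):
    """Dissimilarity between consecutive snippets: 1 - TSM[t, t + 1]."""
    return 1.0 - np.diagonal(tsm, offset=1)


# Synthetic data: two "events" of four snippets each, with small noise.
rng = np.random.default_rng(0)
event_a = np.tile([1.0, 0.0, 0.0], (4, 1))
event_b = np.tile([0.0, 1.0, 0.0], (4, 1))
features = np.vstack([event_a, event_b]) + 0.01 * rng.standard_normal((8, 3))

tsm = temporal_self_similarity(features)
scores = boundary_scores(tsm)

# Index t such that the largest change lies between snippets t and t + 1.
predicted_boundary = int(np.argmax(scores))
print(predicted_boundary)
```

In this toy setting the matrix shows two bright diagonal blocks, one per event, and the adjacent-snippet score peaks where the blocks meet. A learned model would replace both the hand-made features and the argmax heuristic.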