Generic Event Boundary Detection (GEBD) aims to detect generic, taxonomy-free event boundaries that segment a whole video into chunks. In this paper, we apply Masked Autoencoders to improve performance on the GEBD task. Our approach mainly adopts an ensemble of Masked Autoencoders, fine-tuned on the GEBD task as self-supervised learners, together with other base models. We also use a semi-supervised pseudo-label method to take full advantage of the abundant unlabeled Kinetics-400 data during training. In addition, we propose a soft-label method that partially balances the positive and negative samples and alleviates the problem of ambiguous labeling in this task. Lastly, a segmentation alignment policy is implemented to refine the boundaries predicted by our models to more accurate locations. With our approach, we achieve an F1-score of 85.94% on the Kinetics-GEBD test set, improving the F1-score by 2.31% over the winner of the 2021 Kinetics-GEBD Challenge. Our code is available at https://github.com/ContentAndMaterialPortrait/MAE-GEBD.
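To illustrate the idea behind soft labels, the sketch below spreads each annotated boundary over its neighboring frames, so that frames near a boundary receive a partial positive label instead of a hard 0. The abstract does not specify the exact form of the soft labels; the Gaussian weighting, the function name `soft_boundary_labels`, and the `sigma` parameter are assumptions made for illustration only and may differ from the method in the released code.

```python
import math


def soft_boundary_labels(num_frames, boundaries, sigma=1.0):
    """Return one soft label in [0, 1] per frame.

    Assumption (not from the paper): each annotated boundary frame gets
    label 1.0, and nearby frames get a Gaussian-decayed weight. Where two
    boundaries overlap, the larger weight is kept.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    labels = [0.0] * num_frames
    for b in boundaries:
        for t in range(num_frames):
            weight = math.exp(-((t - b) ** 2) / (2.0 * sigma ** 2))
            labels[t] = max(labels[t], weight)
    return labels


# Hypothetical 10-frame clip with annotated boundaries at frames 3 and 7.
labels = soft_boundary_labels(num_frames=10, boundaries=[3, 7], sigma=1.0)
```

Under this assumption, more frames carry a nonzero positive signal than with hard labels, which is one way the positive/negative imbalance and the annotation ambiguity around a boundary could be softened.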