This report describes our submission to the Ego4D Moment Queries Challenge 2022. Our submission builds on ActionFormer, the state-of-the-art backbone for temporal action localization, and a trio of strong video features from SlowFast, Omnivore and EgoVLP. Our solution is ranked 2nd on the public leaderboard with 21.76% average mAP on the test set, which is nearly three times higher than the official baseline. Further, we obtain 42.54% Recall@1x at tIoU=0.5 on the test set, outperforming the top-ranked solution by a significant margin of 1.41 absolute percentage points. Our code is available at https://github.com/happyharrycn/actionformer_release.
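For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of temporal IoU (tIoU) and Recall@kx for a single video and a single query category. It is not the official evaluation code. It assumes Recall@kx counts the fraction of ground-truth moments matched by at least one of the top k·x scored predictions, where x is the number of ground-truth moments, and that a match requires tIoU at or above the threshold. The names `temporal_iou` and `recall_at_kx` are illustrative and are not taken from the released repository.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start, end) in seconds


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection over union of two 1D temporal segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_kx(
    gt: List[Segment],
    preds: List[Tuple[Segment, float]],
    k: int = 1,
    tiou: float = 0.5,
) -> float:
    """Fraction of ground-truth segments matched by the top k * len(gt) predictions.

    Assumed reading of Recall@kx: x is the number of ground-truth segments,
    and a match means tIoU >= tiou (the threshold convention is an assumption).
    """
    if not gt:
        return 0.0
    # Keep only the k * x highest-scoring predictions.
    top = sorted(preds, key=lambda p: p[1], reverse=True)[: k * len(gt)]
    hits = sum(any(temporal_iou(g, seg) >= tiou for seg, _ in top) for g in gt)
    return hits / len(gt)


# Toy example with made-up segments and scores (not challenge data).
gt_segments = [(0.0, 10.0), (20.0, 30.0)]
predictions = [((1.0, 9.0), 0.9), ((50.0, 60.0), 0.8), ((20.0, 30.0), 0.7)]
r1 = recall_at_kx(gt_segments, predictions, k=1, tiou=0.5)
r2 = recall_at_kx(gt_segments, predictions, k=2, tiou=0.5)
```

In this toy case only the top two predictions count at k=1, so one of the two ground-truth moments is recovered; raising k to 2 admits the third prediction and recovers both. A leaderboard figure would additionally aggregate over videos and categories, which this sketch omits.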