In this report, we describe the technical details of our submission to the EPIC-Kitchens-100 action anticipation challenge. Our two models, the higher-order recurrent space-time transformer and the message-passing neural network with edge learning, are both recurrent architectures that observe only 2.5 seconds of context at inference time to predict the upcoming action. By averaging the prediction scores of a set of models produced with our proposed training pipeline, we achieved 19.61% overall mean top-5 recall on the test set, which ranked second on the public leaderboard.
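The score averaging and the recall metric mentioned above can be illustrated with a minimal NumPy sketch. It is not the submission's code: the function names `average_scores` and `mean_topk_recall` are hypothetical, the scores are assumed to be per-class arrays of shape `(num_samples, num_classes)` on a comparable scale across models, and the metric is assumed to be recall computed per class and then averaged over the classes present in the labels.

```python
import numpy as np


def average_scores(score_list):
    """Average per-class scores from several models.

    Each element of score_list is assumed to have shape
    (num_samples, num_classes) and to be on a comparable scale
    (e.g. all probabilities), which the report does not specify.
    """
    stacked = np.stack(score_list, axis=0)  # (num_models, num_samples, num_classes)
    return stacked.mean(axis=0)


def mean_topk_recall(scores, labels, k=5):
    """Class-mean top-k recall (assumed definition).

    A sample counts as a hit if its true label is among the k
    highest-scoring classes; recall is computed per class and then
    averaged over the classes that occur in labels.
    """
    labels = np.asarray(labels)
    topk = np.argsort(-scores, axis=1)[:, :k]       # indices of the k best classes
    hit = (topk == labels[:, None]).any(axis=1)     # True where the label is in the top k
    recalls = [hit[labels == c].mean() for c in np.unique(labels)]
    return float(np.mean(recalls))
```

Averaging per class before averaging over classes keeps frequent actions from dominating the metric, which matters for long-tailed label distributions.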