We propose a novel approach to few-shot action recognition, finding temporally-corresponding frame tuples between the query and videos in the support set. Distinct from previous few-shot action recognition works, we construct class prototypes using the CrossTransformer attention mechanism to observe relevant sub-sequences of all support videos, rather than using class averages or single best matches. Video representations are formed from ordered tuples of varying numbers of frames, which allows sub-sequences of actions at different speeds and temporal offsets to be compared. Our proposed Temporal-Relational CrossTransformers achieve state-of-the-art results on both Kinetics and Something-Something V2 (SSv2), outperforming prior work on SSv2 by a wide margin (6.8%) due to the method's ability to model temporal relations. A detailed ablation showcases the importance of matching to multiple support set videos and learning higher-order relational CrossTransformers. Code is available at https://github.com/tobyperrett/trx
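To make the core idea concrete, below is a minimal sketch (not the authors' implementation; see the repository linked above) of how query-specific class prototypes could be built with cross-attention over ordered frame tuples, shown for a single tuple cardinality (frame pairs). All names, dimensions, and the plain linear projections are illustrative assumptions; the actual TRX model combines multiple cardinalities and trained projection layers.

```python
# Minimal sketch, assuming PyTorch. Hypothetical names (frame_pairs, class_prototype,
# w_q/w_k/w_v) are for illustration only, not the released TRX code.
import itertools
import torch
import torch.nn.functional as F

def frame_pairs(features):
    # features: (num_frames, dim) per-frame embeddings of one video.
    # Returns (num_pairs, 2*dim): concatenated ordered frame pairs (i < j).
    idx = list(itertools.combinations(range(features.shape[0]), 2))
    return torch.stack([torch.cat([features[i], features[j]]) for i, j in idx])

def class_prototype(query_frames, support_frames_list, w_q, w_k, w_v):
    # query_frames: (T, dim); support_frames_list: list of (T, dim), one per support video.
    # Each query tuple attends over the tuples of ALL support videos of a class,
    # rather than a class average or a single best-matching video.
    q = frame_pairs(query_frames) @ w_q                        # (Nq, d_att)
    support_tuples = torch.cat([frame_pairs(s) for s in support_frames_list])
    k = support_tuples @ w_k                                   # (Ns_total, d_att)
    v = support_tuples @ w_v                                   # (Ns_total, d_att)
    attn = F.softmax(q @ k.t() / k.shape[-1] ** 0.5, dim=-1)   # (Nq, Ns_total)
    return attn @ v                                            # one prototype row per query tuple

# Toy usage: 8-frame videos, 2048-d frame features, a 2-shot support set.
dim, d_att = 2048, 128
w_q = torch.randn(2 * dim, d_att) * 0.01
w_k = torch.randn(2 * dim, d_att) * 0.01
w_v = torch.randn(2 * dim, d_att) * 0.01
proto = class_prototype(torch.randn(8, dim),
                        [torch.randn(8, dim) for _ in range(2)],
                        w_q, w_k, w_v)
print(proto.shape)  # torch.Size([28, 128]) -- C(8, 2) = 28 query frame pairs
```

In the paper's formulation this construction is repeated for tuples of several lengths (the temporal-relational part), so that sub-sequences at different speeds and temporal offsets can be matched; the sketch above shows only the pairwise case.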