Existing temporal action localization (TAL) works rely on a large number of training videos with exhaustive segment-level annotation, preventing them from scaling to new classes. As a solution to this problem, few-shot TAL (FS-TAL) aims to adapt a model to a new class represented by as few as a single video. Existing FS-TAL methods assume trimmed training videos for new classes. However, this setting is not only unnatural, since actions are typically captured in untrimmed videos, but it also ignores background video segments containing vital contextual cues for foreground action segmentation. In this work, we first propose a new FS-TAL setting that uses untrimmed training videos. Further, we propose a novel FS-TAL model that maximizes knowledge transfer from the training classes while enabling the model to dynamically adapt to both the new class and each video of that class simultaneously. This is achieved by introducing a query-adaptive Transformer in the model. Extensive experiments on two action localization benchmarks demonstrate that our method significantly outperforms all state-of-the-art alternatives in both single-domain and cross-domain scenarios. The source code is available at https://github.com/sauradip/fewshotQAT
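To make the core idea of query-video adaptation concrete, below is a minimal, hypothetical PyTorch sketch of a cross-attention block in the spirit of a query-adaptive Transformer: features of the untrimmed query video attend to few-shot support-video features, so the representation is conditioned on both the new class and the specific query video. All names, dimensions, and design choices here are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class QueryAdaptiveAttention(nn.Module):
    """Illustrative sketch (not the paper's model): snippet features of an
    untrimmed query video cross-attend to support-video features, adapting
    the query representation to the few-shot class."""

    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, query_feats: torch.Tensor, support_feats: torch.Tensor) -> torch.Tensor:
        # query_feats:   (B, T_q, dim) snippet features of the untrimmed query video
        # support_feats: (B, T_s, dim) snippet features of the few-shot support video(s)
        adapted, _ = self.attn(query_feats, support_feats, support_feats)
        # Residual connection + layer norm, Transformer-style
        return self.norm(query_feats + adapted)

# Toy usage: one query video of 128 snippets, one support video of 64 snippets.
block = QueryAdaptiveAttention()
q = torch.randn(1, 128, 256)
s = torch.randn(1, 64, 256)
out = block(q, s)  # (1, 128, 256) query features conditioned on the support class
```

Because attention is computed per query video at inference time, such a block adapts to each individual video of the new class without any weight updates, which is one plausible way to realize the "dynamic adaptation" described above.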