Current few-shot action recognition methods reach impressive performance by learning discriminative features for each video via episodic training and by designing various temporal alignment strategies. Nevertheless, they are limited in that (a) learning individual features without considering the entire task may lose the information most relevant to the current episode, and (b) these alignment strategies may fail on misaligned instances. To overcome these two limitations, we propose a novel Hybrid Relation guided Set Matching (HyRSM) approach that incorporates two key components: a hybrid relation module and a set matching metric. The purpose of the hybrid relation module is to learn task-specific embeddings by fully exploiting the associated relations within and across videos in an episode. Building upon the task-specific features, we reformulate the distance measure between query and support videos as a set matching problem and further design a bidirectional Mean Hausdorff Metric to improve resilience to misaligned instances. In this way, the proposed HyRSM is highly informative and flexible when predicting query categories under few-shot settings. We evaluate HyRSM on six challenging benchmarks, and the experimental results show its superiority over state-of-the-art methods by a convincing margin. Project page: https://hyrsm-cvpr2022.github.io/.
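To make the set matching idea concrete, the following is a minimal sketch of a bidirectional mean Hausdorff distance between two sets of frame features. It is an illustration under stated assumptions, not the authors' implementation: the function name `bidirectional_mean_hausdorff`, the use of cosine distance between frames, and the unweighted sum of the two directions are all choices made here, since the abstract does not specify them.

```python
import numpy as np


def bidirectional_mean_hausdorff(query, support):
    """Distance between two videos treated as unordered sets of frame features.

    query:   array of shape (Tq, D), one feature vector per query frame
    support: array of shape (Ts, D), one feature vector per support frame
    """
    # Assumption: frames are compared with cosine distance (1 - cosine similarity).
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    s = support / np.linalg.norm(support, axis=1, keepdims=True)
    dist = 1.0 - q @ s.T  # pairwise frame distances, shape (Tq, Ts)

    # Each query frame is matched to its nearest support frame, then averaged.
    query_to_support = dist.min(axis=1).mean()
    # Each support frame is matched to its nearest query frame, then averaged.
    support_to_query = dist.min(axis=0).mean()

    # Assumption: the two directions are combined by a plain sum.
    return query_to_support + support_to_query


# Hypothetical features: 8 frames per video, 16 dimensions per frame.
rng = np.random.default_rng(0)
query_feats = rng.normal(size=(8, 16))
support_feats = rng.normal(size=(8, 16))
print(bidirectional_mean_hausdorff(query_feats, support_feats))
```

Because each frame is matched to its nearest neighbour in the other set regardless of position, the value does not change when the frames of either video are reordered, which is the property that makes a set-based measure tolerant of temporal misalignment. Averaging the nearest-neighbour distances, rather than taking their maximum as in the classical Hausdorff distance, keeps a single outlier frame from dominating the result.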