SAFCAR: Structured Attention Fusion for Compositional Action Recognition

We present a general framework for compositional action recognition, i.e., action recognition where the labels are composed of simpler components such as subjects, atomic actions, and objects. The main challenge in compositional action recognition is that there is a combinatorially large set of possible actions that can be composed from these basic components. However, compositionality also provides a structure that can be exploited. To do so, we develop and test a novel Structured Attention Fusion (SAF) self-attention mechanism to combine information from object detections, which capture the time-series structure of an action, with visual cues that capture contextual information. We show that our approach recognizes novel verb-noun compositions more effectively than current state-of-the-art systems, and that it generalizes to unseen action categories efficiently from only a few labeled examples. We validate our approach on the challenging Something-Else tasks from the Something-Something-V2 dataset. We further show that our framework is flexible and can generalize to a new domain by showing competitive results on the Charades-Fewshot dataset.
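As a rough illustration of the general idea of fusing two feature streams with self-attention, the sketch below runs plain scaled dot-product self-attention over the concatenation of object-detection tokens and visual tokens. This is not the SAF mechanism from the paper, whose details the abstract does not give. The function names, token shapes, and random projection matrices are all hypothetical, chosen only so the example runs with the Python standard library.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(s - m) for s in row]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_with_self_attention(object_tokens, visual_tokens, w_q, w_k, w_v):
    """Generic self-attention over both token sets (an assumption, not SAF).

    object_tokens: hypothetical per-detection features, shape (n_obj, d)
    visual_tokens: hypothetical appearance features, shape (n_vis, d)
    w_q, w_k, w_v: (d, d) projection matrices
    Returns the fused tokens and the attention weights.
    """
    tokens = object_tokens + visual_tokens      # concatenate along the token axis
    q = matmul(tokens, w_q)
    k = matmul(tokens, w_k)
    v = matmul(tokens, w_v)
    scale = math.sqrt(len(k[0]))                # sqrt of the key dimension
    k_t = [list(col) for col in zip(*k)]        # transpose of k
    scores = matmul(q, k_t)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights

def random_matrix(rows, cols, rng):
    """Stand-in for learned features or parameters."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
d = 4
object_tokens = random_matrix(3, d, rng)        # e.g. 3 detected boxes
visual_tokens = random_matrix(2, d, rng)        # e.g. 2 appearance features
w_q, w_k, w_v = (random_matrix(d, d, rng) for _ in range(3))

fused, weights = fuse_with_self_attention(object_tokens, visual_tokens, w_q, w_k, w_v)
print(len(fused), len(fused[0]))  # 5 4
```

Each row of `weights` sums to 1, so every fused token is a convex combination of the projected object and visual tokens. A real model would learn the projections and would likely add structure that this sketch leaves out.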


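Related content

Attention mechanism

The attention mechanism was first proposed in the field of computer vision, but it arguably took off with the Google DeepMind paper "Recurrent Models of Visual Attention" [14], which applied attention to an RNN model for image classification. Subsequently, Bahdanau et al., in "Neural Machine Translation by Jointly Learning to Align and Translate" [1], used an attention-like mechanism to perform translation and alignment jointly on a machine translation task; their work is regarded as the first to apply attention to NLP. Attention-based extensions of RNN models were then applied to a wide range of NLP tasks. More recently, how to use attention in CNNs has also become an active research topic. The figure below shows the general trend of attention research.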
