Causal Attention for Vision-Language Tasks

We present a novel attention mechanism: Causal Attention (CATT), to remove the ever-elusive confounding effect in existing attention-based vision-language models. This effect causes harmful bias that misleads the attention module into focusing on spurious correlations in the training data, damaging model generalization. As the confounder is unobserved in general, we use the front-door adjustment to realize the causal intervention, which does not require any knowledge of the confounder. Specifically, CATT is implemented as a combination of 1) In-Sample Attention (IS-ATT) and 2) Cross-Sample Attention (CS-ATT), where the latter forcibly brings other samples into every IS-ATT, mimicking the causal intervention. CATT abides by the Q-K-V convention and hence can replace any attention module, such as top-down attention and self-attention in Transformers. CATT improves various popular attention-based vision-language models by considerable margins. In particular, we show that CATT has great potential in large-scale pre-training, e.g., it can make the lighter LXMERT~\cite{tan2019lxmert}, which uses less data and less computational power, comparable to the heavier UNITER~\cite{chen2020uniter}. Code is published at \url{https://github.com/yangxuntu/catt}.
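The combination of IS-ATT and CS-ATT described in the abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`qkv_attention`, `causal_attention_sketch`), the `global_dict` list standing in for features drawn from other samples, and the choice to concatenate the two outputs are all hypothetical. The abstract only states that both branches follow the Q-K-V convention and that CS-ATT brings other samples into every IS-ATT.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def qkv_attention(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    query:  list of d floats
    keys:   list of n vectors, each of length d
    values: list of n vectors, each of length d_v
    Returns the attended vector and the attention weights.
    """
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    weights = softmax(scores)
    d_v = len(values[0])
    out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(d_v)]
    return out, weights


def causal_attention_sketch(query, sample_feats, global_dict):
    """Hypothetical CATT-style block: IS-ATT plus CS-ATT.

    sample_feats: features of the current sample (keys/values for IS-ATT).
    global_dict:  ASSUMPTION, a set of vectors standing in for features
                  from other training samples (keys/values for CS-ATT).
    """
    # In-Sample Attention: the query attends within the current sample.
    is_out, _ = qkv_attention(query, sample_feats, sample_feats)
    # Cross-Sample Attention: the same query attends over other samples.
    cs_out, _ = qkv_attention(query, global_dict, global_dict)
    # ASSUMPTION: the two outputs are fused by concatenation.
    return is_out + cs_out


# Toy usage with random features (dimension 4).
random.seed(0)
d = 4
query = [random.gauss(0, 1) for _ in range(d)]
sample_feats = [[random.gauss(0, 1) for _ in range(d)] for _ in range(3)]
global_dict = [[random.gauss(0, 1) for _ in range(d)] for _ in range(5)]
fused = causal_attention_sketch(query, sample_feats, global_dict)
```

Because both branches take a query, keys, and values and return an attended vector, a block like this could in principle be dropped in wherever an ordinary attention module is used, which is the property the abstract emphasizes.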

Related content

Attention mechanism

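The attention mechanism was first proposed in the field of computer vision, but it arguably took off with the Google DeepMind paper "Recurrent Models of Visual Attention" [14], which applied attention to an RNN model for image classification. Bahdanau et al. then used an attention-like mechanism in "Neural Machine Translation by Jointly Learning to Align and Translate" [1] to perform translation and alignment jointly on a machine translation task; their work is generally regarded as the first to apply attention to NLP. Attention-based extensions of RNN models were subsequently applied to a wide range of NLP tasks. More recently, how to use attention in CNNs has also become an active research topic. The figure below shows the rough trend of progress in attention research.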