BAANet: Learning Bi-directional Adaptive Attention Gates for Multispectral Pedestrian Detection

Thermal infrared (TIR) images have proven effective at supplying temperature cues that complement RGB features in multispectral pedestrian detection. Most existing methods either inject the TIR modality directly into an RGB-based framework or simply ensemble the results of the two modalities. This can lead to inferior detection performance, because RGB and TIR features generally carry modality-specific noise that may degrade the features as they propagate through the network. This work therefore proposes an effective and efficient cross-modality fusion module called the Bi-directional Adaptive Attention Gate (BAA-Gate). Built on the attention mechanism, the BAA-Gate is designed to distill the informative features and recalibrate the representations asymptotically. Concretely, a bi-directional multi-stage fusion strategy progressively optimizes the features of both modalities while retaining their specificity during propagation. In addition, an illumination-based weighting strategy makes the interaction within the BAA-Gate adaptive: it adjusts the recalibrating and aggregating strength of the gate and improves robustness to illumination changes. Extensive experiments on the challenging KAIST dataset demonstrate the superior performance of the method at a satisfactory speed.
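The abstract does not give the gate's equations, so the following is only a minimal sketch of what one bi-directional, illumination-weighted fusion stage could look like. It assumes a squeeze-and-excitation-style channel gate and a scalar illumination score `w_day` in [0, 1]; the function names, the gate form, and the weighting rule are all illustrative assumptions, not the authors' implementation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_gate(feat):
    """Squeeze each channel to its mean, then map it to a gate in (0, 1).

    feat is a list of channels; each channel is a flat list of floats.
    (Assumed gate form; a real module would learn this mapping.)
    """
    return [sigmoid(sum(ch) / len(ch)) for ch in feat]


def baa_gate_sketch(rgb, tir, w_day):
    """One hypothetical fusion stage with two separate output streams.

    w_day: assumed illumination score (1.0 = bright daytime, 0.0 = night).
    The RGB stream receives gated TIR features scaled by (1 - w_day);
    the TIR stream receives gated RGB features scaled by w_day.
    """
    gate_rgb = channel_gate(rgb)
    gate_tir = channel_gate(tir)
    rgb_out = [
        [x + (1.0 - w_day) * g * y for x, y in zip(cx, cy)]
        for cx, cy, g in zip(rgb, tir, gate_tir)
    ]
    tir_out = [
        [y + w_day * g * x for x, y in zip(cx, cy)]
        for cx, cy, g in zip(rgb, tir, gate_rgb)
    ]
    return rgb_out, tir_out


# Toy features: 2 channels, 2 spatial positions each.
rgb = [[1.0, 2.0], [0.0, 0.0]]
tir = [[0.5, 0.5], [1.0, 3.0]]

# Multi-stage use: apply the gate repeatedly, keeping both streams alive.
for _ in range(2):
    rgb, tir = baa_gate_sketch(rgb, tir, w_day=0.3)
```

Keeping two output streams, rather than merging into one, mirrors the abstract's point that each modality should retain its specificity while being progressively refined.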


Related content

Attention mechanism


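The attention mechanism was first proposed in the field of computer vision, but it arguably took off with the Google DeepMind paper "Recurrent Models of Visual Attention" [14], which applied attention to an RNN model for image classification. Subsequently, Bahdanau et al., in "Neural Machine Translation by Jointly Learning to Align and Translate" [1], used an attention-like mechanism to perform translation and alignment jointly in machine translation; their work is generally regarded as the first to apply attention to NLP. Attention-based extensions of RNN models were then applied to a wide range of NLP tasks. More recently, how to use attention in CNNs has also become an active research topic. The figure below shows the rough trend of attention research.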

Bag of Tricks for Image Classification (a collection of image classification tricks, 17-page slide deck) · March 12, 2020

[Google Research] Wavesplit: End-to-End Speech Separation by Speaker Clustering · February 26, 2020

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation · October 17, 2019

Stabilizing Transformers for Reinforcement Learning · October 17, 2019