Thermal infrared (TIR) images have proven effective in providing temperature cues that complement RGB features for multispectral pedestrian detection. Most existing methods either inject the TIR modality directly into an RGB-based framework or simply ensemble the results of the two modalities. This, however, can lead to inferior detection performance, because RGB and TIR features generally carry modality-specific noise, which may degrade the features as they propagate through the network. Therefore, this work proposes an effective and efficient cross-modality fusion module called the Bi-directional Adaptive Attention Gate (BAA-Gate). Based on the attention mechanism, the BAA-Gate is devised to distill the informative features and recalibrate the representations asymptotically. Concretely, a bi-directional multi-stage fusion strategy is adopted to progressively optimize the features of the two modalities while retaining their specificity during propagation. Moreover, an illumination-based weighting strategy makes the interaction within the BAA-Gate adaptive: it adjusts the recalibrating and aggregating strength and enhances robustness to illumination changes. Extensive experiments on the challenging KAIST dataset demonstrate the superior performance of our method with satisfactory speed.
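To make the gating idea concrete, the following is a minimal NumPy sketch of one possible reading of the abstract, not the authors' implementation. The function names (`channel_attention`, `bidirectional_gate`, `multi_stage_fusion`), the parameter-free sigmoid attention, the residual form of the recalibration, and the scalar day/night weight `w_day` are all assumptions introduced for illustration; the abstract does not specify any of them.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat):
    """Per-channel weights in (0, 1) for a (C, H, W) feature map.

    Assumption: a parameter-free squeeze (global average pooling followed
    by a sigmoid) stands in for whatever learned attention the paper uses.
    """
    pooled = feat.mean(axis=(1, 2))          # shape (C,)
    return sigmoid(pooled)[:, None, None]    # shape (C, 1, 1), broadcastable


def bidirectional_gate(rgb, tir, w_day):
    """One hypothetical fusion stage acting in both directions.

    w_day in [0, 1] is an assumed illumination score (1 = bright daytime).
    Assumption: the darker the scene, the more the RGB stream borrows from
    TIR, and the brighter the scene, the more TIR borrows from RGB.
    """
    w_night = 1.0 - w_day
    # Distill: keep the channels each modality's attention marks as informative.
    rgb_distilled = channel_attention(rgb) * rgb
    tir_distilled = channel_attention(tir) * tir
    # Recalibrate: add the other modality's distilled features as a residual,
    # so each stream keeps its own features (modality specificity).
    rgb_out = rgb + w_night * tir_distilled
    tir_out = tir + w_day * rgb_distilled
    return rgb_out, tir_out


def multi_stage_fusion(rgb, tir, w_day, num_stages=3):
    """Apply the gate after each (here omitted) backbone stage, then aggregate."""
    for _ in range(num_stages):
        rgb, tir = bidirectional_gate(rgb, tir, w_day)
    # Illumination-weighted aggregation of the two streams (assumed form).
    return w_day * rgb + (1.0 - w_day) * tir
```

Under these assumptions, calling `multi_stage_fusion(rgb, tir, w_day=0.2)` on two arrays of shape `(C, H, W)` returns a fused map of the same shape in which the TIR stream dominates, as one would want for a dark scene. A learned gate would replace `channel_attention` with trainable layers and predict `w_day` from the image rather than take it as an input.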