Camera traps, unmanned observation devices, and deep learning-based image recognition systems have greatly reduced the human effort needed to collect and analyze wildlife images. However, data collected with these devices exhibit 1) long-tailed and 2) open-ended distributions. To tackle this open-set long-tailed recognition problem, we propose the Temporal Flow Mask Attention Network, which comprises three key building blocks: 1) an optical flow module, 2) an attention residual module, and 3) a meta-embedding classifier. We extract temporal features from sequential frames using the optical flow module and learn informative representations using attention residual blocks. Moreover, we show that applying the meta-embedding technique boosts the method's performance in open-set long-tailed recognition. We apply this method to a Korean Demilitarized Zone (DMZ) dataset. Through extensive experiments and quantitative and qualitative analyses, we show that our method effectively tackles the open-set long-tailed recognition problem while remaining robust to unknown classes.