We present a novel approach for action recognition in UAV videos. Our formulation is designed to handle occlusion and viewpoint changes caused by the movement of a UAV. We use the concept of mutual information to compute and align the regions corresponding to human action or motion in the temporal domain. This enables our recognition model to learn from the key features associated with the motion. We also propose a novel frame sampling method that uses joint mutual information to acquire the most informative frame sequence in UAV videos. We have integrated our approach with X3D and evaluated the performance on multiple datasets. In practice, we achieve an 18.9% improvement in Top-1 accuracy over current state-of-the-art methods on UAV-Human (Li et al., 2021), a 7.3% improvement on Drone-Action (Perera et al., 2019), and a 7.16% improvement on NEC Drones (Choi et al., 2020). We will release the code at the time of publication.
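The joint-mutual-information frame sampling mentioned above can be illustrated with a toy sketch: estimate pairwise mutual information between frames from a joint intensity histogram, then greedily select the frames that share the most information with the chosen set. This is an assumption-laden illustration only, not the paper's actual algorithm; `mutual_information`, `sample_frames`, the bin count, and the greedy criterion are all hypothetical choices.

```python
import numpy as np

def mutual_information(a, b, bins=16):
    # Histogram-based MI estimate between two grayscale frames with values in [0, 1].
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins, range=[[0, 1], [0, 1]])
    pxy = joint / joint.sum()                 # joint distribution p(x, y)
    px = pxy.sum(axis=1, keepdims=True)       # marginal p(x)
    py = pxy.sum(axis=0, keepdims=True)       # marginal p(y)
    nz = pxy > 0                              # avoid log(0)
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def sample_frames(frames, k):
    # Greedy selection (hypothetical criterion): seed with the frame whose total
    # MI to all others is largest, then repeatedly add the frame with the
    # largest summed MI to the already-selected set.
    n = len(frames)
    mi = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            mi[i, j] = mi[j, i] = mutual_information(frames[i], frames[j])
    selected = [int(mi.sum(axis=1).argmax())]
    while len(selected) < k:
        remaining = [i for i in range(n) if i not in selected]
        gains = [mi[i, selected].sum() for i in remaining]
        selected.append(remaining[int(np.argmax(gains))])
    return sorted(selected)  # keep temporal order of the sampled frames
```

In a real pipeline the selected indices would pick the clip fed to the recognition backbone (X3D in the paper); here the greedy objective is only one plausible way to operationalize "most informative frame sequence."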