Deep-learning-based video object detection has recently attracted growing attention. Compared with object detection in static images, video object detection is more challenging because objects move, but video also provides rich temporal information. RNN-based algorithms are an effective way to exploit this temporal information and improve detection performance in videos. However, most studies in this area focus only on accuracy and ignore computational cost and parameter count. In this paper, we propose SGE-Net, an efficient video object detection method that combines a channel-reduced convolutional GRU (Squeezed GRU) with an information entropy map. The experimental results validate the accuracy improvement and computational savings of the Squeezed GRU, as well as the benefit of the information entropy attention mechanism for classification performance. The mAP increases by 3.7 over the baseline, and the number of parameters decreases from 6.33 million to 0.67 million compared with the standard GRU.
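To give a feel for why reducing channels shrinks a convolutional GRU so sharply, the following is a minimal parameter-counting sketch. It is not the architecture from the paper: the layout (a 1x1 convolution that squeezes the channels, a convolutional GRU at the reduced width, and a 1x1 convolution that restores them), the function names, and the channel counts (256 channels, reduction factor 4, 3x3 kernels) are all assumptions chosen for illustration, so the totals are not expected to match the 6.33 million and 0.67 million reported above.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights plus biases of one k x k convolution layer."""
    return k * k * c_in * c_out + c_out


def conv_gru_params(c_in: int, c_hidden: int, k: int = 3) -> int:
    """A convolutional GRU has three convolutions (update gate, reset gate,
    candidate state), each applied to the input concatenated with the
    hidden state."""
    return 3 * conv_params(c_in + c_hidden, c_hidden, k)


def squeezed_gru_params(c_in: int, reduction: int, k: int = 3) -> int:
    """Hypothetical channel-reduced variant (an assumption, not the paper's
    design): 1x1 squeeze -> ConvGRU at reduced width -> 1x1 expand."""
    c_red = c_in // reduction
    squeeze = conv_params(c_in, c_red, 1)
    gru = conv_gru_params(c_red, c_red, k)
    expand = conv_params(c_red, c_in, 1)
    return squeeze + gru + expand


if __name__ == "__main__":
    # Illustrative sizes only; these are not taken from the paper.
    standard = conv_gru_params(256, 256, k=3)
    squeezed = squeezed_gru_params(256, reduction=4, k=3)
    print(f"standard ConvGRU: {standard:,} parameters")
    print(f"squeezed ConvGRU: {squeezed:,} parameters")
    print(f"ratio: {standard / squeezed:.1f}x")
```

Because the gate convolutions scale with the product of input and output channels, cutting the channel width by a factor of r cuts their parameter count by roughly r squared, which is why the two added 1x1 layers cost far less than they save in this sketch.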