Detection of pedestrians on embedded devices, such as those on board robots and drones, has many applications, including road-intersection monitoring, security, crowd monitoring, and surveillance, to name a few. However, the problem can be challenging due to the continuously changing camera viewpoint and varying object appearance, as well as the need for lightweight algorithms suitable for embedded systems. This paper proposes a robust framework for pedestrian detection in video footage. The framework performs fine and coarse detections on different image regions and exploits temporal and spatial characteristics to attain enhanced accuracy and real-time performance on embedded boards. The framework uses the YOLOv3 object detector [1] as its backbone and runs on the Nvidia Jetson TX2 embedded board; however, other detectors and/or boards can be used as well. The performance of the framework is demonstrated on two established datasets, and the framework achieved second place in the CVPR 2019 Embedded Real-Time Inference (ERTI) Challenge.
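To make the fine/coarse idea concrete, the following is a minimal sketch, not the paper's implementation: it assumes a hypothetical `detect_stub` stand-in for the backbone detector (e.g. YOLOv3), runs a full-resolution "fine" pass on a region of interest and a downscaled "coarse" pass on the whole frame, and merges the resulting boxes.

```python
import numpy as np

def detect_stub(image, stride=32):
    """Stand-in for a real detector (e.g. YOLOv3): scans the image on a
    coarse grid and emits (x, y, w, h, score) boxes where a patch looks
    'bright'. Purely illustrative, not pedestrian detection."""
    h, w = image.shape[:2]
    boxes = []
    for y in range(0, h, stride):
        for x in range(0, w, stride):
            patch = image[y:y + stride, x:x + stride]
            if patch.mean() > 128:  # fake detection evidence
                boxes.append((x, y, stride, stride, float(patch.mean()) / 255.0))
    return boxes

def fine_coarse_detect(image, fine_region):
    """Run a 'fine' pass (full resolution, small stride) on a region of
    interest, and a 'coarse' pass (2x-downscaled) on the whole frame,
    then merge detections in original image coordinates."""
    x0, y0, x1, y1 = fine_region
    # Fine pass on the ROI only; shift boxes back to frame coordinates.
    fine = [(x + x0, y + y0, w, h, s)
            for (x, y, w, h, s) in detect_stub(image[y0:y1, x0:x1], stride=16)]
    # Coarse pass on a 2x-downscaled frame; rescale boxes back up.
    coarse_img = image[::2, ::2]
    coarse = [(2 * x, 2 * y, 2 * w, 2 * h, s)
              for (x, y, w, h, s) in detect_stub(coarse_img, stride=16)]
    return fine + coarse
```

The split lets the expensive high-resolution pass cover only the regions where small, distant pedestrians are expected, while the cheap downscaled pass covers the rest of the frame; which regions get the fine pass, and how temporal information is exploited, are specific to the paper's framework and not modeled here.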