A guiding robot aims to effectively bring people to and from specific places within environments that may be unknown to them. During this operation, the robot should be able to detect and track the accompanied person, trying never to lose sight of her/him. A solution that minimizes this risk is to use an omnidirectional camera: its 360{\deg} Field of View (FoV) guarantees that any framed object can leave the FoV only if occluded or very far from the sensor. However, the acquired panoramic videos introduce new challenges in perception tasks such as people detection and tracking, including the large size of the images to be processed, the distortion effects introduced by the cylindrical projection, and the periodic nature of panoramic images. In this paper, we propose a set of targeted methods that allow a standard people detection and tracking pipeline, originally designed for perspective cameras, to be effectively adapted to panoramic videos. Our methods have been implemented and tested within a deep learning-based people detection and tracking framework using a commercial 360{\deg} camera. Experiments performed on datasets specifically acquired for guiding robot applications and on a real service robot show the effectiveness of the proposed approach over other state-of-the-art systems. We release with this paper the acquired and annotated datasets and the open-source implementation of our method.
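To make the "periodic nature of panoramic images" challenge concrete, the sketch below shows one common way such periodicity can be handled when associating detections across frames: horizontal offsets on a cylindrical panorama are taken modulo the image width, so a person exiting the right edge and re-entering on the left stays close to their previous position. This is a minimal illustration under assumed conventions, not the paper's actual method; all function names and the `(cx, cy, w, h)` box format are hypothetical.

```python
def wraparound_dx(x_a: float, x_b: float, image_width: float) -> float:
    """Shortest signed horizontal offset from x_a to x_b on a 360-degree
    panorama of the given width, accounting for edge wrap-around."""
    dx = (x_b - x_a) % image_width      # raw offset in [0, image_width)
    if dx > image_width / 2:            # going the other way is shorter
        dx -= image_width
    return dx

def center_distance(box_a, box_b, image_width: float) -> float:
    """Euclidean distance between two (cx, cy, w, h) box centers,
    using the wrap-around-aware horizontal offset. The vertical
    axis of a cylindrical panorama is not periodic."""
    dx = wraparound_dx(box_a[0], box_b[0], image_width)
    dy = box_b[1] - box_a[1]
    return (dx * dx + dy * dy) ** 0.5

# Example: on a 1920-px-wide panorama, x = 10 and x = 1900 are only
# 30 px apart across the seam, not 1890 px.
assert abs(wraparound_dx(10.0, 1900.0, 1920.0)) == 30.0
```

A tracker using a distance like this (e.g., in its data-association cost) avoids spuriously terminating a track whenever the followed person crosses the image seam.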