Knowing the exact 3D location of workers and robots in a collaborative environment enables several practical applications, such as the detection of unsafe situations or the study of human-robot interactions for statistical and social purposes. In this paper, we propose a non-invasive and light-invariant framework based on depth devices and deep neural networks to estimate the 3D pose of robots from an external camera. The method can be applied to any robot without requiring hardware access to its internal state. We introduce a novel representation of the predicted pose, namely Semi-Perspective Decoupled Heatmaps (SPDH), to accurately compute 3D joint locations in world coordinates by adapting efficient deep networks designed for 2D Human Pose Estimation. The proposed approach, which takes as input a depth representation based on XYZ coordinates, can be trained on synthetic depth data and applied to real-world settings without the need for domain adaptation techniques. To this end, we present the SimBa dataset, which contains both synthetic and real depth images, and use it for the experimental evaluation. Results show that the proposed approach, which combines a specific depth map representation with SPDH, outperforms the current state of the art.
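As an illustration of what a depth representation based on XYZ coordinates can look like, the sketch below back-projects a depth map into a three-channel XYZ image with the standard pinhole camera model. The abstract does not say how the representation is built, so this is an assumption rather than the authors' procedure; the function name `depth_to_xyz` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical.

```python
import numpy as np


def depth_to_xyz(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map into an (H, W, 3) XYZ image.

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy) in pixels; depth is measured along the optical axis.
    """
    depth = np.asarray(depth, dtype=np.float64)
    h, w = depth.shape
    # u holds the column index and v the row index of every pixel.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    # Channels are ordered X, Y, Z in the camera reference frame.
    return np.stack([x, y, depth], axis=-1)


# Hypothetical example: a flat surface 2 m in front of the camera.
depth = np.full((4, 6), 2.0)
xyz = depth_to_xyz(depth, fx=500.0, fy=500.0, cx=3.0, cy=2.0)
```

Each pixel of the result stores a metric 3D point in the camera frame, so a 2D network can consume it like a three-channel image.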