A popular and affordable option for room-scale human behaviour tracking is to rely on commodity RGB-D sensors, such as the Microsoft Kinect family of devices, as they offer body tracking capabilities at a reasonable price point. While their capabilities may be sufficient for applications such as entertainment systems, where a person plays in front of a television, RGB-D sensors are sensitive to occlusions caused by objects or other persons in more complex room-scale setups. To alleviate the occlusion issue, but also to extend the tracking range and improve accuracy, it is possible to rely on multiple RGB-D sensors and perform data fusion. Unfortunately, fusing the data in a meaningful manner raises additional challenges: the sensors must be calibrated relative to each other to provide a common frame of reference, and the skeletons they report must be matched and merged when the data are actually combined. In this paper, we discuss our approach to tackling these challenges and present the results we achieved, in the form of aligned point clouds and combined skeleton lists. These results enable unobtrusive, occlusion-resilient human behaviour tracking at room scale, which may be used as input for interactive applications as well as (possibly remote) collaborative systems.
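To make the two fusion steps named above concrete, the sketch below illustrates, in Python with NumPy and purely as an assumption-laden example rather than the paper's actual implementation, how per-sensor skeleton data could be brought into a common frame via calibrated extrinsics, matched across sensors by mean per-joint distance, and merged by confidence weighting. The function names, the 0.3 m matching threshold, and the weighting scheme are all hypothetical choices made for illustration.

```python
import numpy as np

def to_common_frame(joints_local: np.ndarray, extrinsic: np.ndarray) -> np.ndarray:
    """Map Nx3 joint positions from one sensor's local frame into the shared
    room frame, using that sensor's 4x4 extrinsic (rotation + translation)
    as obtained from an offline calibration step (hypothetical setup)."""
    homogeneous = np.hstack([joints_local, np.ones((joints_local.shape[0], 1))])
    return (extrinsic @ homogeneous.T).T[:, :3]

def match_skeletons(skels_a, skels_b, threshold=0.3):
    """Greedily pair skeletons (Nx3 joint arrays, same joint ordering) seen
    by two sensors whose mean per-joint distance in metres falls below
    `threshold`; unmatched skeletons stay single-sensor. Returns (i, j) pairs."""
    pairs, used_b = [], set()
    for i, sa in enumerate(skels_a):
        best_j, best_d = None, threshold
        for j, sb in enumerate(skels_b):
            if j in used_b:
                continue
            d = np.linalg.norm(sa - sb, axis=1).mean()
            if d < best_d:
                best_j, best_d = j, d
        if best_j is not None:
            pairs.append((i, best_j))
            used_b.add(best_j)
    return pairs

def merge_joints(joints_a, joints_b, conf_a, conf_b):
    """Confidence-weighted average of matched joints: a joint that is
    occluded for one sensor (low per-joint confidence) contributes less
    to the fused estimate, which is what makes the fusion occlusion-resilient."""
    w_a, w_b = conf_a[:, None], conf_b[:, None]
    return (w_a * joints_a + w_b * joints_b) / np.maximum(w_a + w_b, 1e-6)
```

The confidence-weighted merge reflects the design rationale sketched above: when one sensor's view of a joint is blocked, its tracking confidence drops, so the fused skeleton is dominated by the sensor that still sees the joint clearly.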