Gait disabilities are among the most frequent disabilities worldwide. Their treatment relies on rehabilitation therapies, in which smart walkers are being introduced to empower the user's recovery and autonomy, while reducing the clinicians' effort. To do so, these devices should be able to decode human motion and needs as early as possible. Current walkers decode motion intention using information from wearable or embedded sensors, namely inertial units, force and Hall sensors, and lasers, whose main limitations either make the solution expensive or hinder the perception of human movement. Smart walkers also commonly lack a seamless human-robot interaction that intuitively understands human motions. This work proposes a contactless approach that addresses human motion decoding as an early action recognition/detection problem, using RGB-D cameras. We studied different deep learning-based algorithms, organised into three different approaches, to process lower-body RGB-D video sequences, recorded from a camera embedded in a smart walker, and classify them into 4 classes (stop, walk, turn right/left). A custom dataset involving 15 healthy participants walking with the device was acquired and prepared, resulting in 28800 balanced RGB-D frames, to train and evaluate the deep networks. The best results were attained by a convolutional neural network with a channel attention mechanism, reaching accuracy values of 99.61% and above 93% for offline early detection/recognition and trial simulations, respectively. Following the hypothesis that human lower-body features encode prominent information, fostering a more robust prediction towards real-time applications, the focus of the algorithms was also evaluated using the Dice metric, leading to values slightly higher than 30%. Promising results were attained for early action detection as a human motion decoding strategy, with enhancements needed in the focus of the proposed architectures.
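To make the two key components concrete, the sketch below (PyTorch) shows a minimal channel-attention CNN for 4-class intention classification from RGB-D frames, together with a Dice score between a binarised saliency map and a lower-body mask. The abstract only names a "convolutional neural network with a channel attention mechanism" and the Dice metric; the squeeze-and-excitation-style attention block, layer sizes, and all identifiers here are illustrative assumptions, not the paper's exact architecture.

```python
# Minimal sketch, assuming an SE-style channel attention block; layer sizes,
# class names, and the network layout are hypothetical, not the paper's model.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention (assumed variant)."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)      # squeeze: global average pool
        self.fc = nn.Sequential(                 # excitation: per-channel weights
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                             # re-weight feature channels


class GaitIntentNet(nn.Module):
    """Toy CNN: RGB-D input (4 channels) -> {stop, walk, turn right, turn left}."""

    def __init__(self, num_classes: int = 4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(4, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            ChannelAttention(32),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            ChannelAttention(64),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x).flatten(1))


def dice_score(pred_mask: torch.Tensor, target_mask: torch.Tensor,
               eps: float = 1e-6) -> torch.Tensor:
    """Dice = 2|A∩B| / (|A| + |B|) between a binarised saliency map and a
    lower-body ground-truth mask (both 0/1 tensors of the same shape)."""
    inter = (pred_mask * target_mask).sum()
    return (2 * inter + eps) / (pred_mask.sum() + target_mask.sum() + eps)


if __name__ == "__main__":
    net = GaitIntentNet()
    frames = torch.randn(2, 4, 120, 160)         # dummy batch of RGB-D frames
    print(net(frames).shape)                     # -> torch.Size([2, 4])
```

In this reading, a Dice value around 30% would mean the network's saliency overlaps only modestly with the lower-body region, which is consistent with the abstract's call for enhancing the focus of the proposed architectures.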