Controlling hand exoskeletons to assist impaired patients in grasping tasks is challenging because user intent is difficult to infer. We hypothesize that the majority of daily grasping tasks fall into a small set of categories, or modes, which can be inferred through real-time analysis of environmental geometry from 3D point clouds. This paper presents a low-cost, real-time system for semantic image labeling of household scenes, with the aim of informing and assisting activities of daily living. The system consists of a miniature depth camera, an inertial measurement unit, and a microprocessor. It classifies the predefined modes with 85% or higher accuracy while processing complex 3D scenes at over 30 frames per second. Within each mode, it can detect and localize graspable objects. For simple object geometries, grasping points are estimated to within 1 cm on average. The system has potential applications in robot-assisted rehabilitation as well as manual task assistance.