We consider the task of object grasping with a prosthetic hand capable of multiple grasp types. In this setting, communicating the intended grasp type often imposes a high cognitive load on the user, which can be reduced by adopting shared autonomy frameworks. Among these, so-called eye-in-hand systems automatically control the hand pre-shaping before the grasp, based on visual input from a camera mounted on the wrist. In this paper, we present an eye-in-hand learning-based approach for hand pre-shape classification from RGB sequences. Unlike previous work, we design the system to support grasping each considered object part with a different grasp type. To overcome the lack of data of this kind and reduce the need for tedious data collection sessions to train the system, we devise a pipeline for rendering synthetic visual sequences of hand trajectories. We develop a sensorized setup to acquire real human grasping sequences for benchmarking and show that, when evaluated on practical use cases, models trained on our synthetic dataset achieve better generalization performance than models trained on real data. We finally deploy our model on the Hannes prosthetic hand and show its practical effectiveness. We publicly release the code and dataset to reproduce the presented results.