The game of Jenga is an inspiring benchmark for developing innovative manipulation solutions for complex tasks, and it has encouraged the study of novel robotics methods for extracting blocks from the tower. A round of Jenga embeds many traits of complex industrial or surgical manipulation tasks: a single block extraction requires a multi-step strategy, the combination of visual and tactile data, and highly precise motion of the robotic arm. In this work, we propose a novel, cost-effective architecture for playing Jenga with e.DO, a 6-DOF anthropomorphic manipulator manufactured by Comau, a standard depth camera, and an inexpensive monodirectional force sensor. Our solution focuses on a visual-based control strategy that accurately aligns the end-effector with the desired block, enabling block extraction by pushing. To this end, we train an instance segmentation deep learning model on a custom synthetic dataset to segment each piece of the Jenga tower, which allows visual tracking of the desired block's pose while the manipulator moves. We integrate the visual-based strategy with the 1D force sensor to detect whether the block can be safely removed by identifying a force threshold value. Our experiments show that this low-cost solution allows e.DO to precisely reach removable blocks and perform up to 14 consecutive extractions.
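The force-based removability check can be illustrated with a minimal sketch. It assumes that the 1D sensor yields a stream of scalar force readings during the push and that a block counts as removable when no reading exceeds the threshold. The function name, the sample readings, and the threshold value are hypothetical and are not taken from this work.

```python
from typing import Iterable


def is_block_removable(force_readings: Iterable[float], threshold: float) -> bool:
    """Return True if no reading sampled during the push exceeds the threshold.

    Both the readings and the threshold are illustrative placeholders; any
    real value would have to be identified experimentally.
    """
    for force in force_readings:
        if force > threshold:
            # Resistance is too high: treat the block as load-bearing and stop.
            return False
    return True


# Hypothetical readings (arbitrary units) from two push attempts.
loose_block = [0.05, 0.10, 0.12, 0.11]
stuck_block = [0.05, 0.30, 0.80]

print(is_block_removable(loose_block, threshold=0.5))
print(is_block_removable(stuck_block, threshold=0.5))
```

Returning as soon as one reading crosses the threshold mirrors the idea of aborting a push early, before the tower is disturbed; a real controller would likely also filter sensor noise before comparing.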