Continuum robots are promising candidates for interactive tasks in various applications due to their unique shape, compliance, and miniaturization capability. Accurate and real-time shape sensing is essential for such tasks yet remains a challenge. Embedded shape sensing has high hardware complexity and cost, while vision-based methods require a stereo setup and struggle to achieve real-time performance. This paper proposes the first eye-to-hand monocular approach to continuum robot shape sensing. Utilizing a deep encoder-decoder network, our method, MoSSNet, eliminates the computational cost of stereo matching and reduces requirements on sensing hardware. In particular, MoSSNet comprises an encoder and three parallel decoders to uncover spatial, length, and contour information from a single RGB image, and then obtains the 3D shape through curve fitting. A two-segment tendon-driven continuum robot is used for data collection and testing, demonstrating accurate (mean shape error of 0.91 mm, or 0.36% of robot length) and real-time (70 fps) shape sensing on real-world data. Additionally, the method is optimized end-to-end and does not require fiducial markers, manual segmentation, or camera calibration. Code and datasets will be made available at https://github.com/ContinuumRoboticsLab/MoSSNet.
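
As a rough illustration only, not the paper's actual architecture, the single-image, encoder-plus-three-parallel-decoders design described above could be sketched in PyTorch as follows; all layer sizes, depths, and output channel counts are assumptions chosen for brevity.

```python
# Hypothetical sketch (NOT the authors' implementation): a shared CNN encoder
# with three parallel decoders producing dense spatial, length, and contour
# predictions from a single RGB image, mirroring the abstract's description.
import torch
import torch.nn as nn


class ConvBlock(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)


class MoSSNetSketch(nn.Module):
    def __init__(self):
        super().__init__()
        # Shared encoder: downsample the RGB frame into a feature map.
        self.encoder = nn.Sequential(
            ConvBlock(3, 32), nn.MaxPool2d(2),
            ConvBlock(32, 64), nn.MaxPool2d(2),
        )

        # Three structurally identical decoders; output channels are
        # illustrative (e.g., 3 for an xyz spatial map, 1 each for length
        # and contour maps).
        def decoder(out_ch):
            return nn.Sequential(
                nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
                ConvBlock(64, 32),
                nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
                nn.Conv2d(32, out_ch, kernel_size=1),
            )

        self.spatial_head = decoder(3)
        self.length_head = decoder(1)
        self.contour_head = decoder(1)

    def forward(self, rgb):
        feats = self.encoder(rgb)
        return (
            self.spatial_head(feats),
            self.length_head(feats),
            self.contour_head(feats),
        )


# Usage: a single RGB frame in, three dense prediction maps out; a 3D curve
# would then be fit to the predicted spatial points in a downstream step.
net = MoSSNetSketch()
spatial, length, contour = net(torch.rand(1, 3, 256, 256))
print(spatial.shape, length.shape, contour.shape)
```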