We present SeeThruFinger, a Vision-Based Tactile Sensing (VBTS) architecture using a markerless See-Thru-Network. It achieves simultaneous visual perception and tactile sensing while providing omni-directional, adaptive grasping for manipulation. Multi-modal perception of intrinsic and extrinsic interactions is critical in building intelligent robots that learn. Instead of adding separate sensors for each modality, a preferred solution is to integrate them into one coherent design, which is a challenging task. This study leverages in-finger vision to inpaint occluded regions of the external environment, achieving coherent scene reconstruction for visual perception. By tracking real-time segmentation of the Soft Polyhedral Network's large-scale deformation, we achieve real-time markerless tactile sensing of 6D forces and torques. We further demonstrate SeeThruFinger in reactive grasping without external cameras or dedicated force and torque sensors. As a result, the proposed SeeThruFinger architecture enables multi-modal perception via a single in-finger vision camera in a markerless way, including scene inpainting, object detection, segmentation tracking, and tactile sensing.
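To make the tactile-sensing idea concrete, the sketch below frames markerless sensing as regressing a 6D wrench from deformation features derived from in-finger segmentation masks. This is an illustrative stand-in, not the paper's implementation: the feature count, the synthetic data, and the linear ridge-regression model are all assumptions made for the example.

```python
import numpy as np

# Hypothetical setup: each frame yields a fixed-length feature vector
# describing the segmented finger's deformation (e.g., mask moments or
# contour displacements). We regress the 6D wrench
# [Fx, Fy, Fz, Tx, Ty, Tz] from these features.
rng = np.random.default_rng(0)

n_samples, n_features = 500, 12            # 12 mask-derived features (assumed)
X = rng.normal(size=(n_samples, n_features))   # deformation features
W_true = rng.normal(size=(n_features, 6))      # unknown feature-to-wrench map
wrench = X @ W_true + 0.01 * rng.normal(size=(n_samples, 6))  # noisy labels

# Fit the map with regularized least squares (ridge regression).
lam = 1e-3
W_hat = np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ wrench)

# Predict the 6D force/torque for a new deformation observation.
x_new = rng.normal(size=n_features)
pred = x_new @ W_hat
print(pred.shape)  # (6,)
```

In practice a learned nonlinear model would replace the linear map, and the features would come from real-time segmentation of the soft finger rather than random vectors; the point here is only the feature-to-wrench regression structure.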