In a future with autonomous robots, visual and spatial perception will be of utmost importance for robotic systems. Particularly in aerial robotics, many applications require visual perception before real-world deployment becomes feasible. Robotic aerial grasping using drones promises fast pick-and-place operations with a significant increase in mobility over other robotic systems. Utilizing Mask R-CNN scene segmentation (detectron2), we propose a vision-based system for autonomous rapid aerial grasping which does not rely on markers for object localization and does not require the appearance of the object to be previously known. Combining segmented images with spatial information from a depth camera, we generate a dense point cloud of the detected objects and perform geometry-based grasp planning to determine grasping points on the objects. In real-world experiments on a dynamically grasping aerial platform, we show that our system can replicate the performance of a motion capture system for object localization, achieving up to 94.5% of the baseline grasping success rate. With our results, we show the first use of geometry-based grasping techniques on a flying platform and aim to increase the autonomy of existing aerial manipulation platforms, bringing them closer to real-world applications in warehouses and similar environments.
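To make the described pipeline concrete, the sketch below illustrates its three stages: Mask R-CNN instance segmentation with detectron2, deprojection of masked depth pixels into a per-object point cloud, and a geometry-based selection of grasp points. This is a minimal sketch, not the paper's implementation: the COCO-pretrained model-zoo config, the pinhole intrinsics (fx, fy, cx, cy), and the helper names `mask_to_point_cloud` and `antipodal_grasp_points` are assumptions, and the PCA-based grasp heuristic is a simplified stand-in for the grasp planner, whose details the abstract does not specify.

```python
# Minimal sketch of the perception pipeline: Mask R-CNN instance
# segmentation (detectron2) + depth deprojection into per-object point
# clouds, followed by a simplified geometry-based grasp-point heuristic.
# Assumptions: a COCO-pretrained Mask R-CNN from the detectron2 model
# zoo and a pinhole depth camera with intrinsics (fx, fy, cx, cy).
import numpy as np
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

def build_segmenter(score_thresh=0.5):
    """Build a detectron2 Mask R-CNN predictor (COCO-pretrained)."""
    cfg = get_cfg()
    cfg.merge_from_file(model_zoo.get_config_file(
        "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
        "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = score_thresh
    return DefaultPredictor(cfg)

def mask_to_point_cloud(mask, depth_m, fx, fy, cx, cy):
    """Deproject masked depth pixels into a camera-frame point cloud.

    mask:    (H, W) bool instance mask from Mask R-CNN.
    depth_m: (H, W) float32 depth in meters, 0 where invalid.
    Returns an (N, 3) array of XYZ points (standard pinhole model).
    """
    v, u = np.nonzero(mask & (depth_m > 0))
    z = depth_m[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)

def antipodal_grasp_points(points):
    """Simplified geometry-based grasp heuristic (illustrative only):
    pick the two cloud points at the extremes of the object's major
    principal axis, on opposite sides of the centroid."""
    centroid = points.mean(axis=0)
    # Principal axes of the centered cloud via SVD.
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    axis = vt[0]                       # dominant direction
    proj = (points - centroid) @ axis  # signed distance along the axis
    return points[np.argmin(proj)], points[np.argmax(proj)]

# Example usage (hypothetical RGB-D frame):
# predictor = build_segmenter()
# outputs = predictor(bgr_image)      # HxWx3 uint8 BGR image
# masks = outputs["instances"].pred_masks.cpu().numpy()
# for m in masks:
#     cloud = mask_to_point_cloud(m, depth_m, fx, fy, cx, cy)
#     p1, p2 = antipodal_grasp_points(cloud)
```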