Deep learning methods have recently exhibited impressive performance in object detection. However, such methods need large amounts of training data to achieve high recognition accuracy, and preparing that data is time-consuming and requires considerable manual work such as labeling images. In this paper, we prepare training data automatically using robots. Because robot motion is slow and energy-intensive, we propose combining robotic in-hand observation with data synthesis to enlarge the limited data set collected by the robot. We first use a robot with a depth sensor to collect images of objects held in the robot's hands and segment the objects from those images. Then, we use a copy-paste method to composite the segmented objects onto rack backgrounds. The collected and synthetic images are combined to train a deep detection neural network. We conducted experiments to compare YOLOv5x detectors trained with images collected using the proposed method and several other methods. The results show that combining observation and synthetic images leads to performance comparable to that of manual data preparation. They also offer guidance on optimizing data configurations and parameter settings for training detectors. The proposed method requires only a single process and is a low-cost way to produce the combined data. Interested readers may find the data sets and trained models in the following GitHub repository: github.com/wrslab/tubedet
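The copy-paste synthesis step can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `paste_object`, its parameters, and the use of plain nested lists instead of an image library are assumptions made to keep the example dependency-free. It pastes the masked pixels of a segmented object onto a background and derives a bounding-box label in the normalized `class x_center y_center width height` layout that YOLO-style detectors commonly use.

```python
def paste_object(background, obj_pixels, obj_mask, top, left, class_id=0):
    """Paste a masked object crop onto a copy of the background.

    background : list of rows, each a list of (r, g, b) tuples
    obj_pixels : object crop in the same layout
    obj_mask   : same height/width as obj_pixels; truthy where the object is
    top, left  : position of the crop's top-left corner in the background
    Returns (new_image, label), where label is
    (class_id, x_center, y_center, width, height), normalized to [0, 1].
    All names here are hypothetical and chosen for illustration only.
    """
    bg_h, bg_w = len(background), len(background[0])
    obj_h, obj_w = len(obj_mask), len(obj_mask[0])
    if top < 0 or left < 0 or top + obj_h > bg_h or left + obj_w > bg_w:
        raise ValueError("object crop must fit inside the background")

    # Copy each row so the original background is left untouched
    # (pixels are immutable tuples, so a row-level copy is enough).
    out = [row[:] for row in background]

    ys, xs = [], []
    for r in range(obj_h):
        for c in range(obj_w):
            if obj_mask[r][c]:
                # Only masked (object) pixels overwrite the background.
                out[top + r][left + c] = obj_pixels[r][c]
                ys.append(top + r)
                xs.append(left + c)
    if not ys:
        raise ValueError("mask contains no object pixels")

    # Tight bounding box around the pasted pixels (end-exclusive).
    x0, x1 = min(xs), max(xs) + 1
    y0, y1 = min(ys), max(ys) + 1
    label = (
        class_id,
        (x0 + x1) / 2 / bg_w,   # normalized box center x
        (y0 + y1) / 2 / bg_h,   # normalized box center y
        (x1 - x0) / bg_w,       # normalized box width
        (y1 - y0) / bg_h,       # normalized box height
    )
    return out, label


# Toy usage: a 2x2 white object inside a 4x4 crop, pasted onto a
# 20x10 black "rack" background.
background = [[(0, 0, 0)] * 20 for _ in range(10)]
obj_pixels = [[(255, 255, 255)] * 4 for _ in range(4)]
obj_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
image, label = paste_object(background, obj_pixels, obj_mask, top=2, left=4)
```

Because the label is computed from the mask rather than from the crop rectangle, the box stays tight around the object even when the segmented crop includes empty margins. A practical pipeline would likely use array-based images and randomize position, scale, and the number of pasted objects; those details are not specified in the abstract.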