Within the field of robotics, computer vision remains a significant barrier to progress, with many tasks hindered by inefficient vision systems. This research proposes a generalized vision module leveraging YOLOv9, a state-of-the-art framework optimized for computationally constrained environments such as robots. The model is trained on a dataset tailored to the FIRA robotics Hurocup. The new vision module is implemented in ROS1, using a virtual environment to enable YOLO compatibility. Performance is evaluated using metrics such as frames per second (FPS) and mean Average Precision (mAP), and is then compared to the existing geometric framework in static and dynamic contexts. The YOLO model achieved comparable precision at a higher computational cost than the geometric model, while providing improved robustness.