This paper presents a comprehensive overview of the Ultralytics YOLO (You Only Look Once) family of object detectors, focusing on architectural evolution, benchmarking, deployment perspectives, and future challenges. The review begins with the most recent release, YOLO26 (or YOLOv26), which introduces key innovations including the removal of Distribution Focal Loss (DFL), native NMS-free inference, Progressive Loss Balancing (ProgLoss), Small-Target-Aware Label Assignment (STAL), and the MuSGD optimizer for stable training. The progression is then traced back through YOLO11, with its hybrid task assignment and efficiency-focused modules; YOLOv8, which introduced a decoupled detection head and anchor-free predictions; and YOLOv5, which established the modular PyTorch foundation that enabled modern YOLO development. Benchmarking on the MS COCO dataset provides a detailed quantitative comparison of YOLOv5, YOLOv8, YOLO11, and YOLO26 (YOLOv26), alongside cross-comparisons with YOLOv12, YOLOv13, RT-DETR, and DEIM (DETR with Improved Matching). Precision, recall, F1 score, mean Average Precision (mAP), and inference speed are analyzed to highlight the trade-offs between accuracy and efficiency. Deployment and application perspectives are then discussed, covering export formats, quantization strategies, and real-world use in robotics, agriculture, surveillance, and manufacturing. Finally, the paper identifies challenges and future directions, including dense-scene limitations, hybrid CNN-Transformer integration, open-vocabulary detection, and edge-aware training approaches. (Object Detection, YOLOv26, YOLO)
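As a brief illustration of the accuracy metrics named in the abstract, the following minimal sketch computes precision, recall, and F1 score from detection counts at a single IoU threshold. The function name `detection_metrics` and the example counts are hypothetical and are not taken from the paper; mAP, which additionally averages precision over recall levels, classes, and IoU thresholds, is not covered here.

```python
def detection_metrics(tp: int, fp: int, fn: int) -> dict:
    """Precision, recall, and F1 from detection counts at one IoU threshold.

    tp: predictions matched to a ground-truth box
    fp: predictions with no matching ground-truth box
    fn: ground-truth boxes that no prediction matched
    """
    # Guard against empty denominators (no predictions or no ground truth).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, chosen only to exercise the function.
metrics = detection_metrics(tp=80, fp=20, fn=20)
print(metrics)
```

Because F1 is a harmonic mean, it falls sharply when either precision or recall is low, which is why it is often reported alongside mAP when comparing detectors at a fixed confidence threshold.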