Vehicular object detection is at the heart of any intelligent traffic system and is essential for urban traffic management. R-CNN, Fast R-CNN, Faster R-CNN and YOLO were among the earlier state-of-the-art models. Region-based CNN methods suffer from high inference time, which makes them impractical for real-time use. YOLO, on the other hand, struggles to detect small objects that appear in groups. In this paper, we propose a method that locates and classifies vehicular objects in densely crowded images using YOLOv5. We address this shortcoming of YOLO by ensembling four different models. Our proposed model performs well on images taken from both top and side views of the street, in both day and night. We measured its performance on the Dhaka AI dataset, which contains densely crowded vehicular images. In our experiments, the model achieved an mAP@0.5 of 0.458 with an inference time of 0.75 s, outperforming other state-of-the-art models. Hence, the model can be deployed on the street for real-time traffic detection, supporting traffic control and data collection.
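The abstract does not say how the four models' outputs are combined. As a minimal sketch only, and not necessarily the authors' method, one common option is to pool the detections from all models and apply greedy per-class non-maximum suppression (NMS). The detection tuple layout, the function names `iou` and `ensemble_nms`, and the toy boxes below are all hypothetical; the 0.5 threshold is chosen only to echo the IoU level used in mAP@0.5.

```python
from typing import List, Tuple

# Hypothetical detection format (an assumption, not from the paper):
# (x1, y1, x2, y2, score, class_id)
Detection = Tuple[float, float, float, float, float, int]


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def ensemble_nms(per_model: List[List[Detection]],
                 iou_thr: float = 0.5) -> List[Detection]:
    """Pool detections from all models, then apply greedy per-class NMS."""
    # Flatten every model's detections and sort by confidence, highest first.
    pooled = sorted((d for dets in per_model for d in dets),
                    key=lambda d: d[4], reverse=True)
    kept: List[Detection] = []
    for det in pooled:
        # Keep a box unless an already-kept box of the same class
        # overlaps it by at least iou_thr.
        if all(k[5] != det[5] or iou(k, det) < iou_thr for k in kept):
            kept.append(det)
    return kept


# Toy outputs from two hypothetical models for the same image.
model_a = [(10, 10, 50, 50, 0.90, 0)]
model_b = [(12, 11, 51, 49, 0.80, 0), (100, 100, 120, 130, 0.60, 1)]
merged = ensemble_nms([model_a, model_b])
```

In this toy case the two overlapping class-0 boxes collapse into the higher-scoring one, while the class-1 box found by only one model is retained. That is the intuition behind ensembling for crowded scenes: a small object missed by one model can still be contributed by another.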