Automatic License Plate Recognition systems detect, localize, and recognize license plate characters on vehicles appearing in video frames. Deploying such systems in the real world, however, requires real-time performance in low-resource environments. In this paper, we propose a two-stage detection pipeline paired with Vision API that provides real-time inference speed along with consistently accurate detection and recognition performance. We use a Haar cascade classifier as a filter on top of our backbone MobileNet SSDv2 detection model. This reduces inference time because only high-confidence detections are passed on for recognition. We also impose a temporal frame separation strategy to distinguish between the license plates of multiple vehicles in the same clip. Furthermore, because no Bangla license plate datasets are publicly available, we created an image dataset and a video dataset containing license plates in the wild. We trained our models on the image dataset and achieved an AP(0.5) score of 86%. We then tested the pipeline on the video dataset and observed reasonable detection and recognition performance (82.7% detection rate and 60.8% OCR F1 score) at real-time processing speed (27.2 frames per second).
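The abstract does not spell out how temporal frame separation works, so the following is only a minimal sketch of one plausible reading: keep the high-confidence detections, then start a new vehicle group whenever the gap between consecutive detection frames exceeds a threshold. The function name `split_by_temporal_gap` and the parameters `min_conf` and `max_gap` are hypothetical and do not come from the paper.

```python
from typing import List, Tuple


def split_by_temporal_gap(
    detections: List[Tuple[int, float]],
    min_conf: float = 0.5,   # assumed confidence cut-off, not from the paper
    max_gap: int = 5,        # assumed maximum frame gap within one vehicle
) -> List[List[int]]:
    """Group frame indices of confident detections into per-vehicle runs.

    `detections` holds (frame_index, confidence) pairs. Detections below
    `min_conf` are dropped. A new group starts whenever the next kept frame
    is more than `max_gap` frames after the previous kept frame.
    """
    kept = sorted(frame for frame, conf in detections if conf >= min_conf)
    groups: List[List[int]] = []
    for frame in kept:
        if groups and frame - groups[-1][-1] <= max_gap:
            groups[-1].append(frame)   # same vehicle: extend the current run
        else:
            groups.append([frame])     # large gap: assume a new vehicle
    return groups


if __name__ == "__main__":
    sample = [(0, 0.9), (1, 0.8), (2, 0.3), (3, 0.95), (40, 0.7), (41, 0.9)]
    print(split_by_temporal_gap(sample))
```

Under this assumption, each group would be sent to the recognition stage once, rather than once per frame, which is one way the filtering step could keep inference time low.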