Automatic License Plate Recognition (ALPR) systems have shown remarkable performance on license plates (LPs) from multiple regions due to advances in deep learning and the increasing availability of datasets. Deep ALPR systems are usually evaluated within each dataset, that is, on test images drawn from the same dataset used for training; it is therefore questionable whether such results are a reliable indicator of generalization ability. In this paper, we propose a traditional-split versus leave-one-dataset-out experimental setup to empirically assess the cross-dataset generalization of 12 Optical Character Recognition (OCR) models applied to LP recognition on nine publicly available datasets that vary considerably in several aspects (e.g., acquisition settings, image resolution, and LP layouts). We also introduce a public dataset for end-to-end ALPR that is the first to contain images of vehicles with Mercosur LPs and the one with the highest number of motorcycle images. The experimental results shed light on the limitations of the traditional-split protocol for evaluating approaches in the ALPR context, as performance drops significantly on most datasets when the models are trained and tested in a leave-one-dataset-out fashion.
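The two evaluation protocols contrasted in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the `(image_path, lp_text)` sample representation, the toy dataset names, and the 50% training fraction are all assumptions made for the example, and the abstract does not specify how each dataset is divided in the traditional split.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical sample representation: (image path, ground-truth LP text).
Sample = Tuple[str, str]


def traditional_split(
    datasets: Dict[str, List[Sample]], train_fraction: float = 0.5
) -> Tuple[List[Sample], Dict[str, List[Sample]]]:
    """Split every dataset into train/test parts (assumed fraction).

    Training pools the train part of all datasets; testing is done per
    dataset, so each test set comes from a dataset also seen in training.
    """
    train: List[Sample] = []
    test: Dict[str, List[Sample]] = {}
    for name, samples in datasets.items():
        cut = int(len(samples) * train_fraction)
        train.extend(samples[:cut])
        test[name] = samples[cut:]
    return train, test


def leave_one_dataset_out(
    datasets: Dict[str, List[Sample]],
) -> Iterator[Tuple[str, List[Sample], List[Sample]]]:
    """Yield (held_out_name, train_samples, test_samples), one fold per dataset.

    The held-out dataset contributes no training samples, so the test
    score reflects cross-dataset generalization.
    """
    for held_out in datasets:
        train = [
            sample
            for name, samples in datasets.items()
            if name != held_out
            for sample in samples
        ]
        yield held_out, train, list(datasets[held_out])


# Toy data with made-up dataset names and labels, for illustration only.
toy: Dict[str, List[Sample]] = {
    "A": [("a1.jpg", "ABC1234"), ("a2.jpg", "DEF5678"),
          ("a3.jpg", "GHI9012"), ("a4.jpg", "JKL3456")],
    "B": [("b1.jpg", "MNO7890"), ("b2.jpg", "PQR1234")],
    "C": [("c1.jpg", "STU5678"), ("c2.jpg", "VWX9012")],
}

for held_out, train, test in leave_one_dataset_out(toy):
    print(held_out, len(train), len(test))
```

With nine datasets, the leave-one-dataset-out protocol yields nine folds per model, each trained from scratch without any image from the dataset it is tested on.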