Although many datasets exist for traffic sign classification, few have been collected for traffic sign recognition, and fewer still contain enough instances to train a deep learning model. Deep learning is almost the only practical way to train a model for real-world use that covers many highly similar classes, in contrast to traditional approaches based on color, shape, and similar features. Moreover, for certain sign classes, the very meaning of the sign makes it inherently hard to collect enough instances for a dataset. To solve this problem, we propose a unique data augmentation method for traffic sign recognition datasets that takes advantage of the standardized design of traffic signs. We call it TSR dataset augmentation. We verify the method on the benchmark Tsinghua-Tencent 100K (TT100K) dataset. We applied it to four main iterated versions of the dataset derived from TT100K, and the experimental results show that the method is effective. The iterated datasets based on TT100K, the source code of the data augmentation method, and the training results presented in this paper are publicly available.