In this paper, we introduce ShopSign, a newly developed natural scene text dataset of Chinese shop signs in street views. Although a few scene text datasets are already publicly available (e.g., ICDAR2015, COCO-Text), few of their images contain Chinese text. We therefore collected and annotated the ShopSign dataset to advance research in Chinese scene text detection and recognition. The new dataset has three distinctive characteristics: (1) large scale: it contains 25,362 Chinese shop sign images with a total of 196,010 text lines; (2) diversity: the images were captured in different scenes, from downtown to developing regions, using more than 50 different mobile phones; (3) difficulty: the dataset is very sparse and imbalanced, and it includes five categories of hard images (mirror, wooden, deformed, exposed and obscure). To illustrate the challenges in ShopSign, we run baseline experiments using state-of-the-art scene text detection methods (including CTPN, TextBoxes++ and EAST), and we perform cross-dataset validation to compare their performance on related datasets such as CTW, RCTW and the ICPR 2018 MTWI challenge dataset. Sample images and detailed descriptions of the ShopSign dataset are publicly available at: https://github.com/chongshengzhang/shopsign.