Sketch recognition remains a significant challenge because training data are limited and freehand sketches of the same object exhibit substantial intra-class variance. Conventional methods for this task often rely on the temporal order of sketch strokes, additional cues acquired from other modalities, or supervised augmentation of sketch datasets with real images, all of which limit their applicability and feasibility in real scenarios. In this paper, we propose a novel sketch-specific data augmentation (SSDA) method that automatically improves both the quantity and the quality of sketches. To increase quantity, we introduce a Bézier pivot based deformation (BPD) strategy that enriches the training data. To improve quality, we present a mean stroke reconstruction (MSR) approach that generates a set of novel sketch types with smaller intra-class variance. Neither solution requires multi-source data or temporal cues of sketches. Furthermore, we show that some recent deep convolutional neural network models trained on generic classes of real images can be better choices than most of the elaborate architectures designed explicitly for sketch recognition. Because SSDA can be integrated with any convolutional neural network, it has a distinct advantage over existing methods. Our extensive experimental evaluations demonstrate that the proposed method achieves state-of-the-art results (84.27%) on the TU-Berlin dataset, exceeding human performance by 11.17 percentage points. Finally, additional experiments show the practical value of our approach for sketch-based image retrieval.
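To make the idea of deforming a stroke through its Bézier control points concrete, the following is a minimal sketch and not the BPD algorithm itself, whose details the abstract does not give. It assumes a stroke has already been fitted with a single cubic Bézier curve, and it treats the two interior control points as the "pivots" to be shifted. The function names `cubic_bezier` and `deform_stroke` and the parameter `max_shift` are hypothetical and introduced only for illustration.

```python
import numpy as np


def cubic_bezier(ctrl, n=50):
    """Sample n points on a cubic Bezier curve from 4 control points, shape (4, 2)."""
    p0, p1, p2, p3 = np.asarray(ctrl, dtype=float)
    t = np.linspace(0.0, 1.0, n)[:, None]  # column vector so it broadcasts over x and y
    return ((1 - t) ** 3 * p0
            + 3 * (1 - t) ** 2 * t * p1
            + 3 * (1 - t) * t ** 2 * p2
            + t ** 3 * p3)


def deform_stroke(ctrl, max_shift=2.0, rng=None):
    """Return a copy of ctrl with the two interior control points randomly shifted.

    Assumption for illustration: endpoints stay fixed so the stroke remains
    attached to its neighbours, and each coordinate moves by at most max_shift.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = np.asarray(ctrl, dtype=float).copy()
    out[1:3] += rng.uniform(-max_shift, max_shift, size=(2, 2))
    return out


# One original stroke and one deformed variant of it.
ctrl = np.array([[0.0, 0.0], [10.0, 20.0], [30.0, 20.0], [40.0, 0.0]])
new_ctrl = deform_stroke(ctrl, max_shift=2.0, rng=np.random.default_rng(0))
curve = cubic_bezier(new_ctrl, n=50)
```

Calling `deform_stroke` repeatedly with different random generators yields several variants of the same stroke, which is the sense in which a deformation of this kind could enlarge a training set without extra modalities or stroke-order information.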