In this work, we explore massive pre-training on synthetic word images to enhance performance on four benchmark downstream handwriting analysis tasks. To this end, we build a large synthetic dataset of word images rendered in several handwriting fonts, which offers a complete supervision signal. We use it to train a simple convolutional neural network (ConvNet) with a fully supervised objective. The vector representations of the images obtained from the pre-trained ConvNet can then be regarded as encodings of the handwriting style. We exploit such representations for Writer Retrieval, Writer Identification, Writer Verification, and Writer Classification, and demonstrate that our pre-training strategy allows extracting rich representations of the writers' style, enabling the aforementioned tasks with results competitive with task-specific State-of-the-Art approaches.
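To make the described pipeline concrete, the following is a minimal sketch of the pre-training and embedding-extraction idea: a simple ConvNet is trained with a fully supervised objective on synthetic word images, and its pooled features are then used as frozen style embeddings, e.g., for Writer Retrieval via cosine similarity. This is an illustrative assumption-laden sketch, not the paper's implementation: the use of PyTorch, the hypothetical dataset yielding (image, label) pairs, the choice of label (e.g., rendering-font identity), and all layer sizes are assumptions.

```python
# Minimal sketch (assumptions: PyTorch; a hypothetical data loader yielding
# (word_image, label) pairs where the label could be, e.g., the rendering font;
# the paper's actual backbone, objective, and hyperparameters may differ).
import torch
import torch.nn as nn
import torch.nn.functional as F

class StyleConvNet(nn.Module):
    """Simple ConvNet: conv feature extractor + linear classifier head.
    After pre-training, the pooled features serve as handwriting-style embeddings."""
    def __init__(self, num_classes: int, embed_dim: int = 256):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.proj = nn.Linear(128, embed_dim)
        self.classifier = nn.Linear(embed_dim, num_classes)

    def embed(self, x):
        # Pooled conv features projected to a fixed-size style embedding.
        return self.proj(self.features(x).flatten(1))

    def forward(self, x):
        return self.classifier(self.embed(x))

def pretrain(model, loader, epochs=1, lr=1e-3, device="cpu"):
    """Fully supervised pre-training on synthetic word images (sketch)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    model.to(device).train()
    for _ in range(epochs):
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            loss = F.cross_entropy(model(images), labels)
            opt.zero_grad()
            loss.backward()
            opt.step()

@torch.no_grad()
def retrieve(model, query, gallery):
    """Downstream use: frozen embeddings + cosine similarity for writer retrieval (sketch)."""
    model.eval()
    q = F.normalize(model.embed(query), dim=1)    # (1, D) query embedding
    g = F.normalize(model.embed(gallery), dim=1)  # (N, D) gallery embeddings
    sims = q @ g.T                                # cosine similarities
    return sims.argsort(dim=1, descending=True)   # ranked gallery indices
```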