In many machine learning tasks, a large general dataset and a small specialized dataset are available. In such situations, various domain adaptation methods can be used to adapt a general model to the target dataset. We show that, for neural networks trained for handwriting recognition using CTC, simple finetuning with data augmentation works surprisingly well in such scenarios and is resistant to overfitting even for very small target-domain datasets. We evaluated the behavior of finetuning with respect to augmentation, training data size, and quality of the pre-trained network, in both writer-dependent and writer-independent settings. On a large real-world dataset, finetuning provided an average relative CER improvement of 25 % with 16 text lines for new writers and 50 % with 256 text lines.
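To make the adaptation setting concrete, the following is a minimal PyTorch sketch of finetuning a pre-trained CTC recognizer on a few text lines from a new writer, with a simple augmentation applied to each batch. The model interface, the noise-based augmentation, and all hyperparameters are illustrative assumptions, not the configuration used in the experiments.

```python
import torch
import torch.nn as nn


def augment(images):
    # One possible line-image augmentation: small additive Gaussian noise.
    # The augmentations actually used for adaptation are an assumption here;
    # geometric distortions or masking could be substituted.
    return images + 0.02 * torch.randn_like(images)


def finetune_on_writer(model, batches, epochs=20, lr=1e-4):
    """Finetune a pre-trained CTC recognizer on a few text lines of one writer.

    Assumptions: `model` maps an image batch (N, C, H, W) to per-frame logits
    shaped (T, N, num_classes), and `batches` yields (images, targets,
    target_lengths) tuples with targets encoded as torch.nn.CTCLoss expects.
    """
    ctc_loss = nn.CTCLoss(blank=0, zero_infinity=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for images, targets, target_lengths in batches:
            log_probs = model(augment(images)).log_softmax(2)  # (T, N, C)
            input_lengths = torch.full((images.size(0),), log_probs.size(0),
                                       dtype=torch.long)
            loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```

In this kind of setup, a low learning rate together with per-batch augmentation is what is typically expected to keep finetuning on 16 to 256 text lines from overfitting; the exact schedule used in the paper is not specified in this section.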