In recent years, deep learning-based methods have shown promising results in computer vision. However, a typical deep learning model requires a large amount of labeled data, which is labor-intensive to collect and annotate. Moreover, a model's performance can degrade severely under domain shift between the training data and the testing data. Text recognition is a widely studied field in computer vision and suffers from both problems because of the diversity of fonts and the complexity of backgrounds. In this paper, we focus on text recognition and make three main contributions toward addressing these problems. First, we collect a multi-source domain adaptation dataset for text recognition, comprising five different domains and over five million images; to the best of our knowledge, it is the first multi-domain text recognition dataset. Second, we propose a new method called Meta Self-Learning, which combines self-learning with the meta-learning paradigm and achieves better recognition results in the multi-domain adaptation setting. Third, we conduct extensive experiments on the dataset to provide a benchmark and to demonstrate the effectiveness of our method. The code and dataset will be made available at https://bupt-ai-cz.github.io/Meta-SelfLearning/.