OTS: A One-shot Learning Approach for Text Spotting in Historical Manuscripts

Historical manuscript processing poses challenges such as limited annotated training data and the emergence of novel classes. To address this, we propose a One-shot learning-based Text Spotting (OTS) approach that accurately and reliably spots novel characters with just one annotated support sample. Drawing inspiration from cognitive research, we introduce a spatial alignment module that finds, focuses on, and learns the most discriminative spatial regions in the query image based on one support image. In particular, because the low-resource spotting task often suffers from example imbalance, we propose a novel loss function called torus loss, which makes the embedding space of the distance metric more discriminative. Our approach is highly efficient and requires only a few training samples, while still being able to handle novel characters and symbols. To enhance dataset diversity, we create a new manuscript dataset containing ancient Dongba hieroglyphics (DBH). We conduct experiments on the publicly available VML-HD, TKH, and NC datasets, and on the newly proposed DBH dataset. The experimental results demonstrate that OTS outperforms state-of-the-art methods in one-shot text spotting. Overall, the proposed method offers promising applications for text spotting in historical manuscripts.

