While supervised deep learning for text line segmentation has advanced recently, unsupervised deep learning solutions are beginning to gain popularity. In this paper, we present an unsupervised deep learning method that embeds document image patches into a compact Euclidean space in which distances correspond to coarse text line pattern similarity. Once this space has been learned, text line segmentation can be implemented with standard techniques applied to the embedded feature vectors. To train the model, we extract random pairs of document image patches under the assumption that neighbouring patches contain a similar coarse trend of text lines, whereas if one of them is rotated, they contain different coarse trends. Doing well on this task requires the model to learn to recognize text lines and their salient parts. The benefit of our approach is that it requires no manual labelling effort. We evaluate the method qualitatively and quantitatively on several variants of text line segmentation datasets to demonstrate its effectiveness.
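The pair-extraction step described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the paper's actual implementation: the helper names (`crop`, `rotate90`, `sample_pair`), the choice of the right-hand neighbour, the fixed 90-degree rotation, the 50/50 label split, and the representation of a page as a list of pixel rows so that the example runs with the standard library only.

```python
import random


def crop(page, top, left, size):
    """Cut a size x size patch out of a page stored as a list of pixel rows."""
    return [row[left:left + size] for row in page[top:top + size]]


def rotate90(patch):
    """Rotate a square patch by 90 degrees counter-clockwise."""
    return [list(col) for col in zip(*patch)][::-1]


def sample_pair(page, size, rng):
    """Return (patch_a, patch_b, label).

    label 0: neighbouring patches, assumed to share a coarse text line trend.
    label 1: the neighbour was rotated, so the trends are assumed to differ.
    """
    height, width = len(page), len(page[0])
    if height < size or width < 2 * size:
        raise ValueError("page too small for two side-by-side patches")
    # Pick the first patch so that a right-hand neighbour still fits on the page.
    top = rng.randrange(height - size + 1)
    left = rng.randrange(width - 2 * size + 1)
    patch_a = crop(page, top, left, size)
    patch_b = crop(page, top, left + size, size)
    label = rng.randrange(2)  # hypothetical 50/50 split between the two cases
    if label == 1:
        patch_b = rotate90(patch_b)
    return patch_a, patch_b, label


# Synthetic page: a white row every 20 pixels stands in for horizontal text lines.
page = [[255 if y % 20 == 0 else 0 for _ in range(300)] for y in range(200)]
patch_a, patch_b, label = sample_pair(page, 64, random.Random(0))
```

No annotation is consulted at any point: the label comes only from whether the second patch was rotated, which is what makes the training signal free of manual labelling. A model trained on such pairs would then supply the patch embeddings on which standard segmentation techniques operate.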