Identifying the production dates of historical manuscripts is one of the main goals of paleographers studying ancient documents. Automated methods can give paleographers objective tools for estimating dates more accurately. Statistical features have previously been used to date digitized historical manuscripts, based on the hypothesis that handwriting styles change over time. However, the scarcity of such documents makes it difficult to build robust systems. This article therefore explores the influence of data augmentation on the dating of historical manuscripts. Linear Support Vector Machines were trained with k-fold cross-validation on textural and grapheme-based features extracted from historical manuscripts of different collections, including the Medieval Paleographical Scale, early Aramaic manuscripts, and the Dead Sea Scrolls. Results show that training models with augmented data improves dating performance by 1% to 3% in cumulative scores. Additionally, the results indicate possibilities for further enhancement by considering models specific to the features and the documents' scripts.
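The abstract reports improvements in cumulative scores but does not define the metric. As a minimal sketch, assuming the definition common in the manuscript-dating literature (the percentage of test documents whose absolute dating error is at most α years), the score could be computed as follows. The function name `cumulative_score` and the example years are hypothetical and are not taken from the article.

```python
from typing import Sequence


def cumulative_score(true_years: Sequence[float],
                     predicted_years: Sequence[float],
                     alpha: float) -> float:
    """Percentage of documents dated within +/- alpha years of the true date.

    Assumed definition: CS(alpha) = 100 * #{|true - predicted| <= alpha} / N.
    """
    if len(true_years) != len(predicted_years):
        raise ValueError("true_years and predicted_years must have equal length")
    if len(true_years) == 0:
        raise ValueError("at least one document is required")
    hits = sum(
        1 for t, p in zip(true_years, predicted_years) if abs(t - p) <= alpha
    )
    return 100.0 * hits / len(true_years)


# Hypothetical key years, for illustration only (not data from the article).
true_years = [1300, 1325, 1350, 1375, 1400]
predicted_years = [1300, 1350, 1350, 1425, 1375]

for alpha in (0, 25, 50):
    score = cumulative_score(true_years, predicted_years, alpha)
    print(f"CS(alpha={alpha}): {score:.1f}%")
```

Under this assumed definition, a gain of 1% to 3% means that a correspondingly larger share of test documents falls within the chosen error tolerance.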