The digitization of historical maps enables the study of ancient, fragile, unique, and hard-to-access information sources. The main map features can be retrieved and tracked over time for subsequent thematic analysis. This work focuses on the vectorization step, i.e., the extraction of vector shapes of the objects of interest from raster images of maps. We are particularly interested in detecting closed shapes such as buildings, building blocks, gardens, and rivers in order to monitor their temporal evolution. Historical map images present significant pattern recognition challenges. Extracting closed shapes with traditional Mathematical Morphology (MM) is highly challenging because multiple map features and text overlap. Moreover, state-of-the-art Convolutional Neural Networks (CNNs) are well suited to filtering image content but provide no guarantee that the detected shapes are closed. In addition, the lack of textural and color information in historical maps makes it hard for CNNs to detect shapes that are represented only by their boundaries. Our contribution is a pipeline that combines the strengths of CNNs (efficient edge detection and filtering) and MM (guaranteed extraction of closed shapes) to achieve this task. An evaluation of our approach on a public dataset shows its effectiveness for extracting the closed boundaries of objects in historical maps.
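To make the idea of guaranteed closed shapes concrete, the following minimal sketch labels the connected regions left over once boundary pixels are removed from an edge probability map. It is an illustration under stated assumptions, not the pipeline evaluated in this work: the function name `label_closed_regions`, the `threshold` parameter, and the toy array `edge_prob_demo` are hypothetical, the edge map stands in for a CNN output, and plain connected-component labeling is used as a simple substitute for the morphological step.

```python
from collections import deque


def label_closed_regions(edge_prob, threshold=0.5):
    """Label 4-connected regions of non-edge pixels.

    edge_prob: 2D list of floats in [0, 1]; a hypothetical stand-in for
               the edge probability map a CNN would produce.
    Returns (labels, count), where labels is a 2D list of ints with
    0 for boundary pixels and 1..count for the regions.
    """
    height, width = len(edge_prob), len(edge_prob[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            # Skip boundary pixels and pixels that already have a label.
            if edge_prob[y][x] >= threshold or labels[y][x] != 0:
                continue
            count += 1
            labels[y][x] = count
            queue = deque([(y, x)])
            # Breadth-first flood fill over the 4-neighbourhood.
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and labels[ny][nx] == 0
                            and edge_prob[ny][nx] < threshold):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Toy edge map (assumed data): a square outline enclosing one pixel.
edge_prob_demo = [
    [0.1, 0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.9, 0.9, 0.1],
    [0.1, 0.9, 0.1, 0.9, 0.1],
    [0.1, 0.9, 0.9, 0.9, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.1],
]

labels_demo, count_demo = label_closed_regions(edge_prob_demo)
```

Every labeled region is bounded by boundary pixels or by the image border, so its contour is closed by construction. The sketch also exposes the weakness of naive labeling: a single gap in the detected outline merges the interior with the background, which is why the quality of the edge detection and filtering stage matters.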