Large amounts of geospatial data have recently been made available on the linked open data cloud and on the portals of many national cartographic agencies (e.g., OpenStreetMap data, administrative geographies of various countries, or land cover/land use datasets). These datasets use various geospatial vocabularies and can be queried using SPARQL or its OGC-standardized extension GeoSPARQL. In this paper, we go beyond these approaches and offer a question-answering engine for natural language questions on top of linked geospatial data sources. Our system has been implemented as reusable components of the Frankenstein question-answering architecture. We give a detailed description of the system's architecture, its underlying algorithms, and its evaluation using a set of 201 natural language questions. The set of questions is offered to the research community as a gold standard dataset for the comparative evaluation of future geospatial question-answering engines.