As a core task in location-based services (LBS) such as navigation maps, matching queries to points of interest (POIs) connects users' intent with real-world geographic information. Pre-trained models (PTMs) have recently advanced many natural language processing (NLP) tasks, but generic text-based PTMs lack the geographic knowledge needed for query-POI matching. To overcome this limitation, related work applies domain-adaptive pre-training on geo-related corpora. However, a query generally mentions multiple geographic objects, such as nearby roads and regions of interest (ROIs). The geographic context (GC), i.e., these diverse geographic objects and their relationships, is therefore pivotal to retrieving the most relevant POI. Single-modal PTMs can barely exploit this GC, which limits their performance. In this work, we propose the Multi-modal Geographic language model (MGeo), a novel query-POI matching method that comprises a geographic encoder and a multi-modal interaction module. MGeo represents GC as a new modality and can fully extract multi-modal correlations for accurate query-POI matching. Moreover, no publicly available benchmark exists for this task. To facilitate further research, we build a new open-source, large-scale benchmark, Geographic TExtual Similarity (GeoTES). Its POIs come from an open-source geographic information system (GIS), and its queries are manually written by annotators to avoid privacy issues. Extensive experiments against several strong baselines and detailed ablation analyses on GeoTES demonstrate that our proposed multi-modal pre-training method significantly improves the query-POI matching capability of generic PTMs, even when the queries' GC is not provided. Our code and dataset are publicly available at https://github.com/PhantomGrapes/MGeo.