Pre-trained models (PTMs) have become a fundamental backbone for downstream tasks in natural language processing and computer vision. Although applying generic PTMs to geo-related tasks at Baidu Maps yielded initial gains, performance plateaued over time. One of the main reasons for this plateau is the lack of readily available geographic knowledge in generic PTMs. To address this problem, we present ERNIE-GeoL, a geography-and-language pre-trained model designed and developed to improve geo-related tasks at Baidu Maps. ERNIE-GeoL is designed to learn a universal geography-language representation by pre-training on large-scale data generated from a heterogeneous graph that contains abundant geographic knowledge. Extensive quantitative and qualitative experiments on large-scale real-world datasets demonstrate the superiority and effectiveness of ERNIE-GeoL. ERNIE-GeoL has been deployed in production at Baidu Maps since April 2021, where it significantly benefits the performance of various downstream tasks. This demonstrates that ERNIE-GeoL can serve as a fundamental backbone for a wide range of geo-related tasks.