The global geometry of language models is important for a range of applications, but language model probes tend to evaluate rather local relations, for which ground truths are easily obtained. In this paper we exploit the fact that in geography, ground truths are available beyond local relations. In a series of experiments, we evaluate the extent to which language model representations of city and country names are isomorphic to real-world geography. For example, if you tell a language model where Paris and Berlin are, does it know the way to Rome? We find that language models generally encode limited geographic information, though larger models perform best, suggesting that geographic knowledge can be induced from higher-order co-occurrence statistics.