Recent work has shown that Pre-trained Language Models (PLMs) can store relational knowledge from their pre-training data in their model parameters. However, it is unclear to what extent PLMs store geo-diverse commonsense knowledge, i.e., knowledge associated with a culture and shared only locally. For instance, the color of the bridal dress is white in American weddings whereas it is red in Chinese weddings. Here, we wish to probe whether PLMs can predict white and red as the color of the bridal dress when queried about American and Chinese weddings, respectively. To this end, we introduce a framework for geo-diverse commonsense probing on multilingual PLMs (mPLMs) and a corresponding benchmark, the Geo-diverse Commonsense Multilingual Language Model Analysis (GeoMLAMA) dataset. GeoMLAMA contains 3,125 prompts in English, Chinese, Hindi, Persian, and Swahili, with wide coverage of concepts shared by people from American, Chinese, Indian, Iranian, and Kenyan cultures. We benchmark 11 standard mPLMs, including variants of mBERT, XLM, mT5, and XGLM, on GeoMLAMA. Interestingly, we find that 1) larger mPLM variants do not necessarily store geo-diverse concepts better than their smaller counterparts; 2) mPLMs are not intrinsically biased towards knowledge from Western countries (the United States); 3) the native language of a country may not be the best language with which to probe its knowledge; and 4) a language may probe knowledge about a non-native country better than about its native country.
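To make the probing setup concrete, the following is a minimal sketch of how such a masked-prompt query could be issued against an mPLM, assuming the HuggingFace transformers fill-mask pipeline and the public bert-base-multilingual-cased (mBERT) checkpoint; the prompt wording and scoring details below are illustrative assumptions, not GeoMLAMA's exact templates or evaluation protocol.

```python
# A minimal sketch of geo-diverse commonsense probing with a masked LM,
# assuming the HuggingFace `transformers` fill-mask pipeline and mBERT.
# The prompts below are illustrative, not GeoMLAMA's actual templates.
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-multilingual-cased")

# One prompt per country; only the country context changes between queries.
prompts = {
    "United States": "In American weddings, the color of the bridal dress is usually [MASK].",
    "China": "In Chinese weddings, the color of the bridal dress is usually [MASK].",
}

for country, prompt in prompts.items():
    # Rank vocabulary tokens by probability at the masked position; keep the top 5.
    predictions = unmasker(prompt, top_k=5)
    top_tokens = [(p["token_str"], round(p["score"], 3)) for p in predictions]
    print(country, "->", top_tokens)
```

A geo-diverse probe of this kind then checks whether the top-ranked tokens shift appropriately (e.g., toward "white" for the American prompt and "red" for the Chinese one) as the country mentioned in the prompt changes, and repeats the query across the benchmark's five languages.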