In this paper, an unsupervised, cognitively driven weighted-entropy method for embedding semantic categories in hyperbolic geometry is proposed. The model draws on two lines of research in cognitive linguistics: the first is the statistical learning theory of language acquisition, together with the proposal that high-dimensional networks represent semantic knowledge in cognition; the second is the domain-specific informativeness approach to semantic communication. Weighted conditional entropy of word co-occurrence is proposed as the embedding metric, with two weighting parameters: collocation diversity and conditional probability ranking in the corresponding statistical distribution. A Boltzmann distribution is then applied to the weighted-entropy metric, and the result is embedded in a hyperbolic Poincaré disk model. Testing has been performed mainly in the domains of basic color and kinship words, which are among the classes most intensively investigated by domain-specificity-focused research in cognitive semantics. Results show that this new approach can successfully model and map the semantic relationships of popularity and similarity for most of the basic color and kinship words in English, and that it has the potential to be generalized to other semantic domains and other languages. More generally, this paper contributes both to computational cognitive semantics and to research on network- and geometry-driven language embedding in computational linguistics and NLP.
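The three standard ingredients named above (conditional entropy of a word's collocates, a Boltzmann distribution, and distance in the Poincaré disk) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names are hypothetical, the entropy is left unweighted because the abstract does not give the form of the collocation-diversity and probability-ranking weights, and the temperature parameter is an assumption.

```python
import math
from collections import Counter


def conditional_entropy(collocates):
    """Unweighted Shannon entropy (in nats) of the collocate distribution p(c | w).

    `collocates` is the list of context words observed with a target word w.
    The paper's weighting scheme is not reproduced here (assumption).
    """
    counts = Counter(collocates)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def boltzmann(energies, temperature=1.0):
    """Boltzmann distribution: p_i is proportional to exp(-E_i / T)."""
    shift = min(energies)  # subtract the minimum for numerical stability
    weights = [math.exp(-(e - shift) / temperature) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


def poincare_distance(u, v):
    """Geodesic distance between two 2-D points strictly inside the unit disk."""
    def sq_norm(p):
        return p[0] ** 2 + p[1] ** 2

    diff = (u[0] - v[0], u[1] - v[1])
    return math.acosh(1 + 2 * sq_norm(diff) / ((1 - sq_norm(u)) * (1 - sq_norm(v))))
```

In the Poincaré disk, distances grow rapidly toward the boundary, which is why hyperbolic embeddings are often used to place general (popular) items near the center and specific ones near the rim; how the weighted-entropy values are mapped to disk coordinates is described in the paper itself and is not assumed here.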