Static word embeddings encode word associations that are widely exploited in downstream NLP tasks. Although prior studies have examined the nature of these associations in terms of the biases and lexical regularities they capture, how the associations vary with the embedding training procedure remains poorly understood. This work addresses that gap by assessing attributive word associations across five static word embedding architectures, analyzing the impact of the model architecture, the context learning flavor, and the training corpora. Our approach applies semi-supervised clustering to annotated proper nouns and adjectives, using their embedding features, to reveal the attributive word associations formed in the embedding space without introducing confirmation bias. Our results show that the choice of context learning flavor during training (CBOW vs. skip-gram) affects how distinguishable the word associations are and how sensitive the embeddings are to deviations in the training corpora. Moreover, we show empirically that, even when trained on the same corpora, different word embedding models exhibit significant inter-model disparity and intra-model similarity in the encoded word associations, indicating that each embedding architecture shapes its embedding space in characteristic ways.
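To make the clustering step concrete, the sketch below illustrates one way a semi-supervised (seed-initialized) clustering of word vectors could be set up. It is a minimal example, not the paper's exact pipeline: the file name `embeddings.bin`, the attribute seed lists, and the target words are hypothetical, and the seeded KMeans is an illustrative stand-in for whichever semi-supervised algorithm is actually used.

```python
# Minimal sketch: semi-supervised clustering of word vectors,
# with clusters seeded by a few attribute-annotated words.
# Assumes gensim and scikit-learn are installed; all names are illustrative.
import numpy as np
from gensim.models import KeyedVectors
from sklearn.cluster import KMeans

# Hypothetical pretrained static embeddings (word2vec binary format).
vectors = KeyedVectors.load_word2vec_format("embeddings.bin", binary=True)

# Hypothetical attribute categories with a few annotated seed words each.
seed_words = {
    "career": ["engineer", "salary", "office"],
    "family": ["home", "parents", "wedding"],
}

# Initialize one centroid per attribute from the mean of its seed vectors.
init_centroids = np.stack([
    np.mean([vectors[w] for w in words], axis=0)
    for words in seed_words.values()
])

# Annotated proper nouns and adjectives to assign to attribute clusters.
target_words = ["John", "Mary", "ambitious", "gentle"]
present = [w for w in target_words if w in vectors]
X = np.stack([vectors[w] for w in present])

# Seeded KMeans: each word is assigned to the attribute whose centroid
# it lies closest to in the embedding space.
km = KMeans(n_clusters=len(seed_words), init=init_centroids, n_init=1)
labels = km.fit_predict(X)

attribute_names = list(seed_words)
for word, label in zip(present, labels):
    print(word, "->", attribute_names[label])
```

Because the cluster assignments are driven entirely by the geometry of the embedding space rather than by hand-picked attribute pairings, this style of analysis avoids imposing the analyst's expectations on which associations should appear.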