Modelling semantic similarity plays a fundamental role in lexical semantic applications. A natural way of calculating semantic similarity is to access handcrafted semantic networks, but similarity can also be predicted in a distributional vector space. Similarity calculation remains a challenging task, even with the latest breakthroughs in deep neural language models. We first examined popular methodologies for measuring taxonomic similarity, including edge-counting, which relies solely on the semantic relations in a taxonomy, as well as more complex methods that estimate concept specificity. We further extrapolated three weighting factors in modelling taxonomic similarity. To study the distinct mechanisms of taxonomic and distributional similarity measures, we ran head-to-head comparisons of each measure with human similarity judgements from the perspectives of word frequency, polysemy degree and similarity intensity. Our findings suggest three things. First, without fine-tuning the uniform distance, taxonomic similarity measures can depend on the shortest path length as a prime factor to predict semantic similarity. Second, in contrast to distributional semantics, edge-counting is free from the bias of sense distribution in usage and can measure word similarity both literally and metaphorically. Third, the synergy of retrofitting neural embeddings with concept relations in similarity prediction may indicate a new trend of leveraging knowledge bases in transfer learning. A large gap still appears to exist in computing semantic similarity across different ranges of word frequency, polysemy degree and similarity intensity.
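To make the edge-counting idea concrete, the following is a minimal sketch of a similarity score based only on the shortest path length between two concepts in a taxonomy with uniform edge distance. It is not the measure evaluated in the paper: the toy taxonomy `TOY_IS_A`, the helper names, and the conversion `1 / (1 + path length)` are all assumptions chosen for illustration.

```python
from collections import deque

# Hypothetical toy taxonomy (assumption): each entry maps a child
# concept to its parent via an is-a relation.
TOY_IS_A = {
    "dog": "canine",
    "wolf": "canine",
    "canine": "mammal",
    "cat": "feline",
    "feline": "mammal",
    "mammal": "animal",
    "bird": "animal",
}


def build_graph(is_a):
    """Turn child -> parent links into an undirected adjacency map."""
    graph = {}
    for child, parent in is_a.items():
        graph.setdefault(child, set()).add(parent)
        graph.setdefault(parent, set()).add(child)
    return graph


def shortest_path_length(graph, source, target):
    """Count edges on the shortest path using breadth-first search.

    Every edge has the same (uniform) distance of 1.
    Returns None if either concept is unknown or no path exists.
    """
    if source not in graph or target not in graph:
        return None
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def edge_counting_similarity(graph, a, b):
    """Map path length to a score in (0, 1]; shorter paths score higher.

    The 1 / (1 + d) mapping is an illustrative assumption.
    """
    dist = shortest_path_length(graph, a, b)
    return None if dist is None else 1.0 / (1.0 + dist)


GRAPH = build_graph(TOY_IS_A)
```

In this toy graph, `edge_counting_similarity(GRAPH, "dog", "wolf")` scores higher than `edge_counting_similarity(GRAPH, "dog", "cat")` because the first pair is separated by fewer is-a edges. The score uses no corpus statistics, which is the sense in which a pure edge-counting measure is independent of how word senses are distributed in usage.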