In this paper, we present a novel way to generate low-dimensional vector embeddings for the noun and verb synsets in WordNet such that the hypernym-hyponym relationship is preserved in the embeddings. We call such an embedding a Sense Spectrum (plural: Sense Spectra). To create suitable labels for training sense spectra, we design a new similarity measure for noun and verb synsets in WordNet. We call this measure the Hypernym Intersection Similarity (HIS), since it compares the common and unique hypernyms of two synsets. Our experiments show that, on the noun and verb pairs of the SimLex-999 dataset, HIS outperforms the three similarity measures in WordNet. Moreover, to the best of our knowledge, sense spectra are the first dense synset embeddings that preserve the semantic relationships in WordNet.
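To illustrate the idea of comparing the common and unique hypernyms of two synsets, here is a minimal sketch. It is not the definition of HIS, which the abstract does not spell out. It assumes a Jaccard-style ratio (shared hypernyms divided by shared plus unique hypernyms) and runs on a small hand-written hypernym graph rather than on WordNet itself. The names `TOY_HYPERNYMS`, `hypernym_closure`, and `overlap_similarity` are hypothetical and introduced only for this example.

```python
# Toy hypernym graph (hypothetical data, NOT taken from WordNet):
# each node maps to its direct hypernyms.
TOY_HYPERNYMS = {
    "dog": ["canine"],
    "cat": ["feline"],
    "canine": ["carnivore"],
    "feline": ["carnivore"],
    "carnivore": ["animal"],
    "animal": [],
    "car": ["vehicle"],
    "vehicle": [],
}


def hypernym_closure(synset, graph):
    """Return the synset together with all of its transitive hypernyms."""
    seen = set()
    stack = [synset]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, []))
    return seen


def overlap_similarity(a, b, graph):
    """Jaccard-style overlap of two hypernym closures.

    ASSUMPTION: this ratio stands in for the general idea of weighing
    common against unique hypernyms; it is not the paper's HIS formula.
    """
    closure_a = hypernym_closure(a, graph)
    closure_b = hypernym_closure(b, graph)
    common = closure_a & closure_b   # hypernyms shared by both synsets
    combined = closure_a | closure_b  # shared plus unique hypernyms
    return len(common) / len(combined)


if __name__ == "__main__":
    for pair in [("dog", "cat"), ("dog", "canine"), ("dog", "car")]:
        print(pair, round(overlap_similarity(*pair, TOY_HYPERNYMS), 3))
```

In this toy graph, "dog" and "cat" share two of six distinct hypernyms, "dog" and "canine" share three of four, and "dog" and "car" share none, so the score falls as the synsets diverge in the hierarchy.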