What is the relation between a word and its description, or between a word and its embedding? Both descriptions and embeddings are semantic representations of words. But what information from the original word remains in these representations? Or, more importantly, which information about a word do these two representations share? Definition Modeling and Reverse Dictionary are two opposite learning tasks that address these questions. The goal of the Definition Modeling task is to investigate the power of the information lying inside a word embedding to express the meaning of the word in a humanly understandable way -- as a dictionary definition. Conversely, the Reverse Dictionary task explores the ability to predict a word embedding directly from the word's definition. In this paper, by tackling these two tasks, we explore the relationship between words and their semantic representations. We present our findings based on descriptive, exploratory, and predictive data analysis conducted on the CODWOE dataset. We give a detailed overview of the systems that we designed for the Definition Modeling and Reverse Dictionary tasks, which achieved top scores in several subtasks of the SemEval-2022 CODWOE challenge. We hope that our experimental results concerning the predictive models, together with the data analyses we provide, will prove useful in future explorations of word representations and their relationships.