Knowledge is captured in the form of entities and their relationships and stored in knowledge graphs. Knowledge graphs enhance the capabilities of applications in many areas, including Web search, recommendation, and natural language understanding, mainly because entities enable machines to understand things that go beyond simple tokens. Many modern algorithms use entity embeddings learned from these structured representations. However, building a knowledge graph takes time and effort, which makes it costly and nontrivial. On the other hand, many Web sources describe entities in some structured format, so finding ways to turn them into useful entity knowledge is advantageous. We propose an approach that processes entity-centric textual knowledge sources to learn entity embeddings, which in turn avoids the need for a traditional knowledge graph. We first extract triples into a new representation format that does not rely on traditional, complex triple extraction methods defined by predetermined relationship labels. We then learn entity embeddings from this new type of triple. We show that the embeddings learned with our approach are (i) of high quality, comparable to known knowledge graph-based embeddings, and usable to improve them further; (ii) better than contextual language model-based entity embeddings; and (iii) easy to compute and versatile in domain-specific applications where a knowledge graph is not readily available.