Understanding human language requires complex world knowledge. However, existing large-scale knowledge graphs mainly focus on knowledge about entities while ignoring knowledge about activities, states, or events, which describe how entities or things act in the real world. To fill this gap, we develop ASER (activities, states, events, and their relations), a large-scale eventuality knowledge graph extracted from more than 11 billion tokens of unstructured textual data. ASER contains 15 relation types belonging to five categories, 194 million unique eventualities, and 64 million unique edges among them. Both intrinsic and extrinsic evaluations demonstrate the quality and effectiveness of ASER.