Acquiring commonsense knowledge and reasoning is an important goal in modern NLP research. Despite much progress, there is still a lack of understanding (especially at scale) of the nature of commonsense knowledge itself. A potential source of structured commonsense knowledge that could be used to derive insights is ConceptNet. In particular, ConceptNet contains several coarse-grained relations, including HasContext, FormOf and SymbolOf, which can prove invaluable in understanding broad, but critically important, commonsense notions such as 'context'. In this article, we present a methodology based on unsupervised knowledge graph representation learning and clustering to reveal and study substructures in three heavily used commonsense relations in ConceptNet. Our results show that, despite each having an 'official' definition in ConceptNet, these commonsense relations exhibit considerable substructure. In the future, therefore, such relations could be subdivided into other relations with more refined definitions. We also supplement our core study with visualizations and qualitative analyses.
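To make the idea of clustering the edges of a single relation concrete, the following is a minimal sketch and not the pipeline used in this article. It assumes that each node already has a learned embedding (replaced here by hand-written 2-D toy vectors), that an edge is represented by concatenating its head and tail vectors, and that plain k-means is the clustering step. The node names, vectors, edge list and the `kmeans` helper are all hypothetical and chosen only for illustration.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on lists of floats; returns one label per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels

# Hypothetical toy node embeddings (stand-ins for learned KG embeddings).
node_vec = {
    "astern": [0.9, 0.1], "starboard": [0.8, 0.2], "nautical": [1.0, 0.0],
    "offside": [0.1, 0.9], "dribble": [0.2, 0.8], "sports": [0.0, 1.0],
}

# Hypothetical (head, tail) pairs standing in for HasContext edges.
edges = [
    ("astern", "nautical"), ("starboard", "nautical"),
    ("offside", "sports"), ("dribble", "sports"),
]

# Assumed edge representation: concatenation of head and tail vectors.
edge_vecs = [node_vec[h] + node_vec[t] for h, t in edges]

labels = kmeans(edge_vecs, k=2)

# Group edges by cluster to inspect candidate sub-relations.
clusters = {}
for (h, t), lab in zip(edges, labels):
    clusters.setdefault(lab, []).append((h, t))
for lab, members in sorted(clusters.items()):
    print(lab, members)
```

On this toy input the two nautical edges and the two sports edges fall into separate clusters, which is the kind of substructure one would then inspect qualitatively. The number of clusters `k` is fixed by hand here; on real data it would have to be selected, for example with a cluster-quality measure.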