Knowledge Authoring with Factual English

Knowledge representation and reasoning (KRR) systems represent knowledge as collections of facts and rules. Like databases, KRR systems contain information about domains of human activity such as industrial enterprises, science, and business. KRR systems can represent complex concepts and relations, and they can query and manipulate information in sophisticated ways. Unfortunately, KRR technology has been hindered by the fact that specifying the requisite knowledge requires skills that most domain experts do not have, and professional knowledge engineers are hard to find. One solution could be to extract knowledge from English text, and a number of works have attempted to do so (OpenSesame, Google's Sling, etc.). Unfortunately, at present, extraction of logical facts from unrestricted natural language is still too inaccurate to be used for reasoning, while a restricted grammar (so-called controlled natural language, or CNL) is hard for users to learn and use. Nevertheless, some recent CNL-based approaches, such as the Knowledge Authoring Logic Machine (KALM), have been shown to have very high accuracy compared to others, and a natural question is to what extent the CNL restrictions can be lifted. In this paper, we address this issue by transplanting the KALM framework to a neural natural language parser, mStanza. Here we limit our attention to authoring facts and queries, and therefore our focus is on what we call factual English statements. Authoring other types of knowledge, such as rules, will be considered in our follow-up work. As it turns out, neural-network-based parsers have problems of their own, and the mistakes they make range from part-of-speech tagging to lemmatization to dependency errors. We present a number of techniques for combating these problems and test the new system, KALMFL (i.e., KALM for factual language), on a number of benchmarks, which show that KALMFL achieves correctness in excess of 95%.
