Rules have a number of desirable properties: they are easy to understand, support the inference of new knowledge, and can be communicated to other inference systems. One weakness of previous rule induction systems is that they only find rules within a knowledge base (KB) and therefore cannot generalize to more open and complex real-world rules. Recently, language model (LM)-based rule generation has been proposed to enhance the expressive power of rules. In this paper, we revisit the differences between KB-based rule induction and LM-based rule generation. We argue that, while KB-based methods induce rules by discovering commonalities in the data, current LM-based methods "learn rules from rules". This limits such methods to producing "canned" rules whose patterns are constrained by the annotated rules, discarding the rich expressive power of LMs over free text. Therefore, in this paper, we propose the open rule induction problem, which aims to induce open rules by utilizing the knowledge in LMs. We further propose the Orion (\underline{o}pen \underline{r}ule \underline{i}nducti\underline{on}) system to automatically mine open rules from LMs without supervision from annotated rules. We conducted extensive experiments to verify the quality and quantity of the induced open rules. Surprisingly, when applied to downstream tasks (i.e., relation extraction), these automatically induced rules even outperformed manually annotated rules.