A common thread of retrieval-augmented methods in the existing literature focuses on retrieving encyclopedic knowledge, such as Wikipedia, which offers well-defined entity and relation spaces that can be modeled. However, applying such methods to commonsense reasoning tasks faces two unique challenges: the lack of a general large-scale corpus for retrieval and the lack of a corresponding effective commonsense retriever. In this paper, we systematically investigate how to leverage commonsense knowledge retrieval to improve commonsense reasoning tasks. We propose a unified framework of retrieval-augmented commonsense reasoning (called RACo), including a newly constructed commonsense corpus with over 20 million documents and novel strategies for training a commonsense retriever. We conduct experiments on four different commonsense reasoning tasks. Extensive evaluation results show that RACo significantly outperforms other knowledge-enhanced counterparts, achieving new SoTA performance on the CommonGen and CREAK leaderboards.