Abstractive summarization is the task of generating a summary from an input document. Although significant progress has been made, factual inconsistency between the document and the generated summary still limits practical applications. Previous work found that the probabilities assigned by the generation model reflect its preferences for the generated summary, including a preference for factual consistency as well as preferences arising from language or knowledge priors. To isolate the preference for factual consistency, we propose CoP, an unsupervised framework that controls the preference of the generation model with the help of a prompt. More specifically, the framework performs an extra inference step in which a text prompt is introduced as an additional input, so that the generation probability of this extra inference step describes a second preference. The difference between the two preferences, i.e. the difference between the probabilities, can be used as a measure for detecting factual inconsistencies. Interestingly, we found that with a properly designed prompt, our framework can evaluate specific preferences and serve as a measure for fine-grained categories of inconsistency, such as entity-related and coreference-related inconsistency. Moreover, our framework can also be extended to the supervised setting to learn a better prompt from labeled data. Experiments show that our framework achieves new state-of-the-art results on three factual inconsistency detection tasks.
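To make the core idea concrete, the comparison of the two inference passes can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the per-token log-probabilities of the summary have already been obtained from the generation model once without and once with the text prompt. The function names, the sign convention (prompted minus plain), the mean aggregation, and the toy numbers are all assumptions made for this example.

```python
from typing import List, Sequence


def preference_difference(
    logp_plain: Sequence[float], logp_prompted: Sequence[float]
) -> List[float]:
    """Per-token difference between the two inference passes.

    logp_plain:    log-probability of each summary token given the document only.
    logp_prompted: log-probability of the same tokens when a text prompt is
                   added as extra input.
    The direction (prompted - plain) is an assumption of this sketch.
    """
    if len(logp_plain) != len(logp_prompted):
        raise ValueError("both passes must score the same summary tokens")
    return [q - p for p, q in zip(logp_plain, logp_prompted)]


def mean_difference(
    logp_plain: Sequence[float], logp_prompted: Sequence[float]
) -> float:
    """Summary-level score: mean of the per-token differences (assumed aggregation)."""
    diffs = preference_difference(logp_plain, logp_prompted)
    if not diffs:
        raise ValueError("summary must contain at least one token")
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    # Toy log-probabilities, made up for illustration only.
    plain = [-0.2, -1.5, -0.3, -2.0]
    prompted = [-0.2, -0.4, -0.3, -1.9]
    print(preference_difference(plain, prompted))
    print(mean_difference(plain, prompted))
```

Tokens whose probability changes strongly once the prompt is added are the ones such a score highlights. How that change maps onto an inconsistency decision, for example through a threshold, depends on the prompt design and is not specified here.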