Commonsense generation aims to produce a plausible sentence describing an everyday scene that involves a given set of concepts. The task is challenging because it requires models to perform relational reasoning and compositional generalization. Previous work assists generation by retrieving prototype sentences for the provided concepts: a sparse retriever first collects candidate sentences, which a ranker then re-ranks. However, the candidates returned by such a ranker may not be the most relevant sentences, since the ranker treats all candidates equally without considering their relevance to the reference sentences of the given concepts. Another problem is that re-ranking is computationally expensive, yet using the retriever alone seriously degrades the performance of the generation model. To solve these problems, we propose the metric distillation rule, which distills knowledge from the evaluation metric (e.g., BLEU) into the ranker. We further transfer the critical knowledge summarized by the distilled ranker to the retriever. In this way, the relevance scores that the ranker and the retriever assign to candidate sentences become more consistent with the quality measured by the metric. Experimental results on the CommonGen benchmark verify the effectiveness of our proposed method: (1) our generation model with the distilled ranker achieves a new state-of-the-art result; (2) our generation model with only the distilled retriever even surpasses the previous state of the art.
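To make the metric distillation idea concrete, below is a minimal Python sketch, not taken from the paper: BLEU scores of retrieved candidates against the reference sentences serve as soft labels, and a ranker is trained to match the metric's score distribution via KL divergence. All names here (metric_soft_labels, distillation_loss, the toy data) are illustrative assumptions, and the stand-in logits replace a real ranker model.

    # Sketch: distill metric (BLEU) knowledge into a ranker via soft labels.
    # Assumes torch and nltk are installed; names are hypothetical.
    import torch
    import torch.nn.functional as F
    from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

    def metric_soft_labels(candidates, references, temperature=0.1):
        """Score each candidate with BLEU against the references, then
        normalize the scores into a distribution (the teacher signal)."""
        smooth = SmoothingFunction().method1
        refs = [r.split() for r in references]
        scores = torch.tensor(
            [sentence_bleu(refs, c.split(), smoothing_function=smooth)
             for c in candidates]
        )
        return F.softmax(scores / temperature, dim=0)

    def distillation_loss(ranker_logits, metric_labels):
        """KL divergence between the ranker's predicted relevance
        distribution over candidates and the metric-induced one."""
        log_probs = F.log_softmax(ranker_logits, dim=0)
        return F.kl_div(log_probs, metric_labels, reduction="sum")

    # Toy usage: three candidates for the concept set {dog, frisbee, catch}.
    candidates = [
        "a dog leaps to catch a frisbee in the park",
        "a frisbee lies on the grass",
        "the dog sleeps on the couch",
    ]
    references = ["a dog jumps up and catches a frisbee"]

    labels = metric_soft_labels(candidates, references)
    ranker_logits = torch.randn(len(candidates), requires_grad=True)  # stand-in for a real ranker
    loss = distillation_loss(ranker_logits, labels)
    loss.backward()
    print(labels, loss.item())

The softmax temperature controls how sharply the metric's preferences are imposed on the ranker; in the same spirit, the distilled ranker's scores could serve as soft labels for the retriever in a second distillation step, as the abstract describes.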