QuestEval is a reference-less metric used in Text-to-Text tasks: it compares the generated summaries directly to the source text by automatically asking and answering questions. Its adaptation to Data-to-Text tasks is not straightforward, as it requires multimodal Question Generation and Answering systems for the considered tasks, which are seldom available. To this end, we propose a method for building synthetic multimodal corpora that enable the training of multimodal components for a data-QuestEval metric. The resulting metric is reference-less and multimodal; it obtains state-of-the-art correlations with human judgment on the WebNLG and WikiBio benchmarks. We make data-QuestEval's code and models available for reproducibility purposes, as part of the QuestEval project.
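As a rough illustration of the reference-less scoring idea described above, the sketch below checks whether questions asked about the generated text receive consistent answers when posed against the source table. The helpers `generate_questions`, `answer_from_text`, and `answer_from_table` are hypothetical placeholders standing in for the trained (multimodal) QG/QA components; the actual implementation lives in the QuestEval repository.

```python
# Minimal sketch of a QuestEval-style reference-less score for Data-to-Text.
# QG/QA models are passed in as callables; they are assumptions, not the
# real data-QuestEval API.

from typing import Callable, Dict, List


def questeval_style_score(
    generated_text: str,
    source_table: Dict[str, str],
    generate_questions: Callable[[str], List[str]],
    answer_from_text: Callable[[str, str], str],
    answer_from_table: Callable[[str, Dict[str, str]], str],
) -> float:
    """Fraction of questions about the generated text whose answers
    agree with the answers obtained from the source table."""
    questions = generate_questions(generated_text)
    if not questions:
        return 0.0
    matches = 0
    for question in questions:
        hyp_answer = answer_from_text(question, generated_text)
        src_answer = answer_from_table(question, source_table)
        # A real metric would use a softer answer-similarity measure
        # (e.g. token-level F1); exact match keeps the sketch minimal.
        if hyp_answer.strip().lower() == src_answer.strip().lower():
            matches += 1
    return matches / len(questions)
```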