FFCI: A Framework for Interpretable Automatic Evaluation of Summarization


July 3, 2021


Fajri Koto, Timothy Baldwin, Jey Han Lau

From arXiv. Under review for the Journal of Artificial Intelligence Research (JAIR).

In this paper, we propose FFCI, a framework for fine-grained summarization evaluation that comprises four elements: faithfulness (degree of factual consistency with the source), focus (precision of summary content relative to the reference), coverage (recall of summary content relative to the reference), and inter-sentential coherence (document fluency between adjacent sentences). We construct a novel dataset for focus, coverage, and inter-sentential coherence, and develop automatic methods for evaluating each of the four dimensions of FFCI based on cross-comparison of evaluation metrics and model-based evaluation methods, including question answering (QA) approaches, semantic textual similarity (STS), next-sentence prediction (NSP), and scores derived from 19 pre-trained language models. We then apply the developed metrics in evaluating a broad range of summarization models across two datasets, with some surprising findings.

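The abstract defines focus as precision and coverage as recall of summary content relative to the reference. The toy sketch below illustrates only that distinction. It assumes clipped unigram overlap as the similarity measure, which is a stand-in chosen for brevity and not the model-based metrics the paper develops. The names `tokenize` and `overlap_scores` are hypothetical.

```python
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase whitespace tokenization; a deliberate simplification."""
    return text.lower().split()


def overlap_scores(summary: str, reference: str) -> tuple[float, float]:
    """Return (focus, coverage) as unigram precision and recall.

    Assumption: clipped unigram overlap stands in for whatever
    similarity measure a real evaluation system would use.
    """
    summary_counts = Counter(tokenize(summary))
    reference_counts = Counter(tokenize(reference))
    # Multiset intersection keeps the minimum count of each shared token.
    shared = sum((summary_counts & reference_counts).values())
    summary_len = sum(summary_counts.values())
    reference_len = sum(reference_counts.values())
    # Focus: what fraction of the summary is supported by the reference.
    focus = shared / summary_len if summary_len else 0.0
    # Coverage: what fraction of the reference is captured by the summary.
    coverage = shared / reference_len if reference_len else 0.0
    return focus, coverage


reference = "the cat sat on the mat"
summary = "the cat sat"
focus, coverage = overlap_scores(summary, reference)
print(f"focus={focus:.2f} coverage={coverage:.2f}")  # focus=1.00 coverage=0.50
```

In this example the short summary contains nothing outside the reference, so its focus is perfect, while it recovers only half of the reference tokens, so its coverage is 0.5. Reporting the two numbers separately, instead of a single combined score, is what makes the evaluation interpretable in the sense the abstract describes.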


