Leveraging Information Bottleneck for Scientific Document Summarization - 专知 Papers


October 4, 2021

Leveraging Information Bottleneck for Scientific Document Summarization


Jiaxin Ju, Ming Liu, Huan Yee Koh, Yuan Jin, Lan Du, Shirui Pan

From arXiv; accepted at EMNLP 2021 Findings

This paper presents an unsupervised extractive approach to summarizing long scientific documents based on the Information Bottleneck principle. Inspired by previous work that uses the Information Bottleneck principle for sentence compression, we extend it to document-level summarization with two separate steps. In the first step, we use signal(s) as queries to retrieve the key content from the source document. Then, a pre-trained language model conducts further sentence search and editing to return the final extracted summaries. Importantly, our work can be flexibly extended to a multi-view framework by using different signals. Automatic evaluation on three scientific document datasets verifies the effectiveness of the proposed framework. Further human evaluation suggests that the extracted summaries cover more content aspects than previous systems.
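
To make the first step more concrete, here is a minimal sketch of using signals as queries to retrieve key sentences. It is not the authors' implementation: the abstract does not say how signals are scored against sentences, so bag-of-words cosine similarity is swapped in as a stand-in, and the names `tokenize`, `cosine`, `retrieve`, and `top_k` are hypothetical. The second step (sentence search and editing with a pre-trained language model) is not covered.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(sentences, signals, top_k=2):
    """Return the top_k sentences most similar to any signal, in document order.

    Assumption: each sentence is scored by its best match over all signals.
    """
    sent_vecs = [Counter(tokenize(s)) for s in sentences]
    sig_vecs = [Counter(tokenize(q)) for q in signals]
    scores = [
        max((cosine(sv, qv) for qv in sig_vecs), default=0.0)
        for sv in sent_vecs
    ]
    # Rank by descending score; ties are broken by position in the document.
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))[:top_k]
    return [sentences[i] for i in sorted(ranked)]


if __name__ == "__main__":
    doc = [
        "We propose a summarization framework.",
        "The weather was nice.",
        "Information bottleneck guides sentence selection.",
        "Lunch was served at noon.",
    ]
    print(retrieve(doc, ["summarization framework", "information bottleneck"]))
```

Passing several signals at once is one simple way to picture the multi-view idea mentioned in the abstract: each signal acts as a separate query, and a sentence is kept if it matches any of them well.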

