通过本地关注和内容选择进行长长的 Span 概括 (Long-Span Summarization via Local Attention and Content Selection) - 专知论文

会员服务 ·

0

state-of-the-art · MoDELS · 值域 · 注意力机制 · ROUGE ·

2021 年 5 月 29 日

Long-Span Summarization via Local Attention and Content Selection

翻译：通过本地关注和内容选择进行长长的 Span 概括

Potsawee Manakul,Mark J. F. Gales

from arxiv, ACL 2021 (camera-ready)

Transformer-based models have achieved state-of-the-art results in a wide range of natural language processing (NLP) tasks including document summarization. Typically these systems are trained by fine-tuning a large pre-trained model to the target task. One issue with these transformer-based models is that they do not scale well in terms of memory and compute requirements as the input length grows. Thus, for long document summarization, it can be challenging to train or fine-tune these models. In this work, we exploit large pre-trained transformer-based models and address long-span dependencies in abstractive summarization using two methods: local self-attention; and explicit content selection. These approaches are compared on a range of network configurations. Experiments are carried out on standard long-span summarization tasks, including Spotify Podcast, arXiv, and PubMed datasets. We demonstrate that by combining these methods, we can achieve state-of-the-art results on all three tasks in the ROUGE scores. Moreover, without a large-scale GPU card, our approach can achieve comparable or better results than existing approaches.

翻译：以变压器为基础的模型在包括文件摘要化在内的各种自然语言处理(NLP)任务中取得了最先进的结果。这些系统通常通过微调一个大型预先培训的模型对目标任务进行培训。这些变压器模型的一个问题是,这些模型在记忆和计算要求方面规模不高,而随着输入长度的提高,这些模型的记忆和计算要求也不同。因此,对于长期的文件总和来说,培训或微调这些模型可能具有挑战性。在这项工作中,我们利用了大型预先培训的变压器模型,并用两种方法来解决抽象式总和的长期依赖性:当地自省;和明确的内容选择。这些方法比较了网络配置的范围。实验是在标准长宽的拼凑任务上进行的,包括Spodificast、arXiv和PubMed数据集。我们证明,通过结合这些方法,我们能够在ROUGEE分数的所有三项任务中实现最先进的结果。此外,没有大规模GPUP卡,我们的方法可以比现有的结果更好。

0

相关内容

state-of-the-art

state-of-the-art

【ICML2020】文本摘要生成模型PEGASUS

【ICML2020】文本摘要生成模型PEGASUS

专知会员服务

35+阅读 · 2020年8月23日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

【IJCAI2020】神经摘要结构性注意力，Neural Abstractive Summarization with Structural Attention

【IJCAI2020】神经摘要结构性注意力，Neural Abstractive Summarization with Structural Attention

专知会员服务

33+阅读 · 2020年4月24日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

【文本摘要】Text Summarization文本摘要与注意力机制

【文本摘要】Text Summarization文本摘要与注意力机制

深度学习自然语言处理

9+阅读 · 2020年3月15日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

大神一年100篇论文

大神一年100篇论文

CreateAMind

15+阅读 · 2018年12月31日

【论文推荐】最新八篇视频描述生成相关论文—在线视频理解、联合定位和描述事件、生成视频、跨模态注意力机制、联合事件检测和描述

【论文推荐】最新八篇视频描述生成相关论文—在线视频理解、联合定位和描述事件、生成视频、跨模态注意力机制、联合事件检测和描述

专知

11+阅读 · 2018年6月4日

【推荐】用TensorFlow实现LSTM社交对话股市情感分析

【推荐】用TensorFlow实现LSTM社交对话股市情感分析

机器学习研究会

11+阅读 · 2018年1月14日

TensorFlow seq2seq中的Attention机制（续）

TensorFlow seq2seq中的Attention机制（续）

深度学习每日摘要

15+阅读 · 2017年11月16日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

NLP中自动生产文摘（auto text summarization）

NLP中自动生产文摘（auto text summarization）

机器学习研究会

14+阅读 · 2017年10月10日

Kleister: Key Information Extraction Datasets Involving Long Documents with Complex Layouts

Kleister: Key Information Extraction Datasets Involving Long Documents with Complex Layouts

Arxiv

3+阅读 · 2021年5月12日

PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization

Arxiv

17+阅读 · 2020年6月2日

Not All Attention Is Needed: Gated Attention Network for Sequence Data

Arxiv

3+阅读 · 2019年12月1日

Area Attention

Arxiv

5+阅读 · 2019年5月23日

Joint entity recognition and relation extraction as a multi-head selection problem

Arxiv

3+阅读 · 2018年12月17日

Jointly Multiple Events Extraction via Attention-based Graph Information Aggregation

Jointly Multiple Events Extraction via Attention-based Graph Information Aggregation

Arxiv

5+阅读 · 2018年9月24日

Deep Communicating Agents for Abstractive Summarization

Arxiv

5+阅读 · 2018年3月27日

The Web as a Knowledge-base for Answering Complex Questions

Arxiv

5+阅读 · 2018年3月18日

Attention Is All You Need

Arxiv

27+阅读 · 2017年12月6日

Content based video retrieval

Arxiv

3+阅读 · 2012年11月20日

VIP会员

文章信息

相关主题

state-of-the-art

注意力机制

相关VIP内容

【ICML2020】文本摘要生成模型PEGASUS

【ICML2020】文本摘要生成模型PEGASUS

专知会员服务

35+阅读 · 2020年8月23日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

【IJCAI2020】神经摘要结构性注意力，Neural Abstractive Summarization with Structural Attention

【IJCAI2020】神经摘要结构性注意力，Neural Abstractive Summarization with Structural Attention

专知会员服务

33+阅读 · 2020年4月24日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

热门VIP内容

开通专知VIP会员享更多权益服务

【博士论文】扩展可扩展会话推荐的边界

别想太多：高效 R1 风格大型推理模型综述

【ACMMM2025】EvoVLMA: 进化式视觉-语言模型自适应

智能体网络：用AI智能体编织下一代网络

相关资讯

【文本摘要】Text Summarization文本摘要与注意力机制

【文本摘要】Text Summarization文本摘要与注意力机制

深度学习自然语言处理

9+阅读 · 2020年3月15日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Call for Participation: Shared Tasks in NLPCC 2019

Call for Participation: Shared Tasks in NLPCC 2019

中国计算机学会

5+阅读 · 2019年3月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

大神一年100篇论文

大神一年100篇论文

CreateAMind

15+阅读 · 2018年12月31日

【论文推荐】最新八篇视频描述生成相关论文—在线视频理解、联合定位和描述事件、生成视频、跨模态注意力机制、联合事件检测和描述

【论文推荐】最新八篇视频描述生成相关论文—在线视频理解、联合定位和描述事件、生成视频、跨模态注意力机制、联合事件检测和描述

专知

11+阅读 · 2018年6月4日

【推荐】用TensorFlow实现LSTM社交对话股市情感分析

【推荐】用TensorFlow实现LSTM社交对话股市情感分析

机器学习研究会

11+阅读 · 2018年1月14日

TensorFlow seq2seq中的Attention机制（续）

TensorFlow seq2seq中的Attention机制（续）

深度学习每日摘要

15+阅读 · 2017年11月16日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

NLP中自动生产文摘（auto text summarization）

NLP中自动生产文摘（auto text summarization）

机器学习研究会

14+阅读 · 2017年10月10日

相关论文

Kleister: Key Information Extraction Datasets Involving Long Documents with Complex Layouts

Kleister: Key Information Extraction Datasets Involving Long Documents with Complex Layouts

Arxiv

3+阅读 · 2021年5月12日

PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization

Arxiv

17+阅读 · 2020年6月2日

Not All Attention Is Needed: Gated Attention Network for Sequence Data

Arxiv

3+阅读 · 2019年12月1日

Area Attention

Arxiv

5+阅读 · 2019年5月23日

Joint entity recognition and relation extraction as a multi-head selection problem

Arxiv

3+阅读 · 2018年12月17日

Jointly Multiple Events Extraction via Attention-based Graph Information Aggregation

Jointly Multiple Events Extraction via Attention-based Graph Information Aggregation

Arxiv

5+阅读 · 2018年9月24日

Deep Communicating Agents for Abstractive Summarization

Arxiv

5+阅读 · 2018年3月27日

The Web as a Knowledge-base for Answering Complex Questions

Arxiv

5+阅读 · 2018年3月18日

Attention Is All You Need

Arxiv

27+阅读 · 2017年12月6日

Content based video retrieval

Arxiv

3+阅读 · 2012年11月20日

微信扫码咨询专知VIP会员