SCROLLS: Standardized CompaRison Over Long Language Sequences


October 11, 2022


Uri Shaham,Elad Segal,Maor Ivgi,Avia Efrat,Ori Yoran,Adi Haviv,Ankit Gupta,Wenhan Xiong,Mor Geva,Jonathan Berant,Omer Levy

From arXiv; EMNLP 2022

NLP benchmarks have largely focused on short texts, such as sentences and paragraphs, even though long texts comprise a considerable amount of natural language in the wild. We introduce SCROLLS, a suite of tasks that require reasoning over long texts. We examine existing long-text datasets, and handpick ones where the text is naturally long, while prioritizing tasks that involve synthesizing information across the input. SCROLLS contains summarization, question answering, and natural language inference tasks, covering multiple domains, including literature, science, business, and entertainment. Initial baselines, including Longformer Encoder-Decoder, indicate that there is ample room for improvement on SCROLLS. We make all datasets available in a unified text-to-text format and host a live leaderboard to facilitate research on model architecture and pretraining methods.
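
As a rough illustration of what a "unified text-to-text format" could look like, the sketch below flattens summarization, question answering, and natural language inference instances into the same pair of strings. The field names (`input`, `output`), the helper `to_text_to_text`, the blank-line separator, and the sample texts are all assumptions made for this example; the abstract does not specify the actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TextToTextExample:
    """One record in a hypothetical unified text-to-text layout."""
    input: str   # everything the model reads, as a single string
    output: str  # the target the model should generate


def to_text_to_text(document: str, target: str,
                    query: Optional[str] = None) -> TextToTextExample:
    """Flatten a task instance into a single (input, output) string pair.

    Summarization has no query. For QA and NLI, the question or hypothesis
    is prepended to the long document. The blank-line separator is an
    assumption of this sketch, not something stated in the abstract.
    """
    source = document if query is None else f"{query}\n\n{document}"
    return TextToTextExample(input=source, output=target)


# Placeholder texts stand in for the long documents.
summarization = to_text_to_text("<long report text>", "A short summary.")
qa = to_text_to_text("<long story text>", "In the attic.",
                     query="Where was the letter found?")
nli = to_text_to_text("<long contract text>", "Entailment",
                      query="The agreement can be terminated early.")

for example in (summarization, qa, nli):
    print(repr(example.input), "->", example.output)
```

With every task reduced to the same two fields, a single sequence-to-sequence model and training loop can in principle be reused across tasks, which is the appeal of this kind of format.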

