RAFT: A Real-World Few-Shot Text Classification Benchmark - 专知 Papers


Raft algorithm · Few-shot learning · Text classification · Benchmark · MoDELS

January 18, 2022


Neel Alex,Eli Lifland,Lewis Tunstall,Abhishek Thakur,Pegah Maham,C. Jess Riedel,Emmie Hine,Carolyn Ashurst,Paul Sedille,Alexis Carlier,Michael Noetel,Andreas Stuhlmüller

From arXiv. Dataset, submission instructions, code, and leaderboard are available at https://raft.elicit.org

Large pre-trained language models have shown promise for few-shot learning, completing text-based tasks given only a few task-specific examples. Will models soon solve classification tasks that have so far been reserved for human research assistants? Existing benchmarks are not designed to measure progress in applied settings, and so don't directly answer this question. The RAFT benchmark (Real-world Annotated Few-shot Tasks) focuses on naturally occurring tasks and uses an evaluation setup that mirrors deployment. Baseline evaluations on RAFT reveal areas current techniques struggle with: reasoning over long texts and tasks with many classes. Human baselines show that some classification tasks are difficult for non-expert humans, reflecting that real-world value sometimes depends on domain expertise. Yet even non-expert human baseline F1 scores exceed GPT-3 by an average of 0.11. The RAFT datasets and leaderboard will track which model improvements translate into real-world benefits at https://raft.elicit.org .
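The comparison in the abstract (human versus GPT-3 F1, averaged across tasks) can be illustrated with a minimal sketch. It assumes macro-averaged F1 per task; the abstract itself says only "F1". The function names `macro_f1` and `mean_f1_gap` and the toy labels are hypothetical, and none of this is taken from the RAFT codebase.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes that appear in y_true."""
    labels = sorted(set(y_true))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def mean_f1_gap(tasks):
    """Average over tasks of (human macro-F1 minus model macro-F1)."""
    gaps = [
        macro_f1(task["gold"], task["human"]) - macro_f1(task["gold"], task["model"])
        for task in tasks
    ]
    return sum(gaps) / len(gaps)


# Hypothetical toy data, not RAFT results.
toy_tasks = [
    {"gold": ["a", "b"], "human": ["a", "b"], "model": ["a", "a"]},
]
print(mean_f1_gap(toy_tasks))
```

Averaging per-task gaps, rather than pooling all predictions, keeps a task with many test items from dominating the summary.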



