时空Wiki:培训和评价不断演变的语言模式的终身基准 (TemporalWiki: A Lifelong Benchmark for Training and Evaluating Ever-Evolving Language Models) - 专知论文

会员服务 ·

0

知识 (knowledge) · 语言模型化 · Continuity · Perplexity · MoDELS ·

2022 年 4 月 29 日

TemporalWiki: A Lifelong Benchmark for Training and Evaluating Ever-Evolving Language Models

翻译：时空Wiki:培训和评价不断演变的语言模式的终身基准

Joel Jang,Seonghyeon Ye,Changho Lee,Sohee Yang,Joongbo Shin,Janghoon Han,Gyeonghun Kim,Minjoon Seo

Language Models (LMs) become outdated as the world changes; they often fail to perform tasks requiring recent factual information which was absent or different during training, a phenomenon called temporal misalignment. This is especially a challenging problem because the research community still lacks a coherent dataset for assessing the adaptability of LMs to frequently-updated knowledge corpus such as Wikipedia. To this end, we introduce TemporalWiki, a lifelong benchmark for ever-evolving LMs that utilizes the difference between consecutive snapshots of English Wikipedia and English Wikidata for training and evaluation, respectively. The benchmark hence allows researchers to periodically track an LM's ability to retain previous knowledge and acquire updated/new knowledge at each point in time. We also find that training an LM on the diff data through continual learning methods achieves similar or better perplexity than on the entire snapshot in our benchmark with 12 times less computational cost, which verifies that factual knowledge in LMs can be safely updated with minimal training data via continual learning. The dataset and the code are available at https://github.com/joeljang/temporalwiki .

翻译：随着世界的变化,语言模型(LMS)已经过时;它们往往不能执行需要最新事实信息的任务,而这种信息在培训期间不存在或不同,这是一种称为时间错配的现象。这是一个特别具有挑战性的问题,因为研究界仍然缺乏一套连贯的数据集来评估LMS适应经常更新的知识,例如维基百科。为此,我们引入了TemoralWiki,这是不断演变的LMS的终身基准,它分别利用英文维基百科连续短片和英文维基数据相异,用于培训和评估。因此,基准使研究人员能够定期跟踪LM是否有能力保留以前的知识并在每一时刻获得更新/新知识。我们还发现,通过持续学习方法对LM数据进行LM培训,与我们基准中的全部截图相比,其计算成本低12倍,与整个截图相比,具有相似性或更好混淆性,这证明LMS的事实知识可以通过持续学习的最低限度的培训数据来安全更新。数据集和代码可在https://github.com/joelgeang/temporalwiki查阅。

0

相关内容

知识 (knowledge)

知识 (knowledge)

通过学习、实践或探索所获得的认识、判断或技能。

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

TensorFlow 2.0 学习资源汇总

TensorFlow 2.0 学习资源汇总

专知会员服务

67+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

肥胖相关Hepatokine LECT2在肝脏中的调控及机制

国家自然科学基金

1+阅读 · 2015年12月31日

高稳定片层Ni-Mg-Al催化剂的可控制备及其在干气重整中的性能研究

国家自然科学基金

0+阅读 · 2015年12月31日

氧化铁晶面调控IB族金属粒子落位及反应条件下的动态行为研究

国家自然科学基金

0+阅读 · 2013年12月31日

氧化应激介导的卵泡膜-间质细胞衰老在卵巢衰老中的作用及其机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

TAM/Gas6在石英粉尘致炎性反应及纤维化中的作用及机制

国家自然科学基金

0+阅读 · 2012年12月31日

Intraflagellar Transport运输纤毛蛋白的分子机理

国家自然科学基金

0+阅读 · 2012年12月31日

三峡库区水上突发事件应急资源配置鲁棒优化研究

国家自然科学基金

0+阅读 · 2012年12月31日

ROS在调控心肌衰老过程中Beclin 1-Vps34复合体功能和自噬流的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

小电导Ca2+激活K+通道与ryanodine受体功能性偶联的研究

国家自然科学基金

0+阅读 · 2008年12月31日

非线性不连续系统的稳定与镇定

国家自然科学基金

0+阅读 · 2008年12月31日

A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks

Arxiv

0+阅读 · 2022年6月17日

Beyond Supervised vs. Unsupervised: Representative Benchmarking and Analysis of Image Representation Learning

Arxiv

0+阅读 · 2022年6月16日

FixEval: Execution-based Evaluation of Program Fixes for Competitive Programming Problems

Arxiv

0+阅读 · 2022年6月15日

Neural Architecture Search without Training

Neural Architecture Search without Training

Arxiv

10+阅读 · 2021年6月11日

Faster Meta Update Strategy for Noise-Robust Deep Learning

Arxiv

11+阅读 · 2021年4月30日

Adaptive Synthetic Characters for Military Training

Adaptive Synthetic Characters for Military Training

Arxiv

49+阅读 · 2021年1月6日

Heterogeneous Network Representation Learning: A Unified Framework with Survey and Benchmark

Heterogeneous Network Representation Learning: A Unified Framework with Survey and Benchmark

Arxiv

19+阅读 · 2020年12月17日

Extreme Language Model Compression with Optimal Subwords and Shared Projections

Extreme Language Model Compression with Optimal Subwords and Shared Projections

Arxiv

18+阅读 · 2019年9月25日

Continual Lifelong Learning with Neural Networks: A Review

Arxiv

14+阅读 · 2019年2月11日

Ripple Network: Propagating User Preferences on the Knowledge Graph for Recommender Systems

Arxiv

14+阅读 · 2018年5月19日

VIP会员

文章信息

相关主题

知识 (knowledge)

语言模型化

相关VIP内容

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

TensorFlow 2.0 学习资源汇总

TensorFlow 2.0 学习资源汇总

专知会员服务

67+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【博士论文】低维与高维空间中潜在表征的分析、建模与变换

《生态建模密码破译：建模与编程实践》美陆军最新报告

大模型解决方案白皮书：社交陪伴场景全流程落地指南

面向具身操作的视觉-语言-动作模型综述

相关资讯

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks

Arxiv

0+阅读 · 2022年6月17日

Beyond Supervised vs. Unsupervised: Representative Benchmarking and Analysis of Image Representation Learning

Arxiv

0+阅读 · 2022年6月16日

FixEval: Execution-based Evaluation of Program Fixes for Competitive Programming Problems

Arxiv

0+阅读 · 2022年6月15日

Neural Architecture Search without Training

Neural Architecture Search without Training

Arxiv

10+阅读 · 2021年6月11日

Faster Meta Update Strategy for Noise-Robust Deep Learning

Arxiv

11+阅读 · 2021年4月30日

Adaptive Synthetic Characters for Military Training

Adaptive Synthetic Characters for Military Training

Arxiv

49+阅读 · 2021年1月6日

Heterogeneous Network Representation Learning: A Unified Framework with Survey and Benchmark

Heterogeneous Network Representation Learning: A Unified Framework with Survey and Benchmark

Arxiv

19+阅读 · 2020年12月17日

Extreme Language Model Compression with Optimal Subwords and Shared Projections

Extreme Language Model Compression with Optimal Subwords and Shared Projections

Arxiv

18+阅读 · 2019年9月25日

Continual Lifelong Learning with Neural Networks: A Review

Arxiv

14+阅读 · 2019年2月11日

Ripple Network: Propagating User Preferences on the Knowledge Graph for Recommender Systems

Arxiv

14+阅读 · 2018年5月19日

相关基金

肥胖相关Hepatokine LECT2在肝脏中的调控及机制

国家自然科学基金

1+阅读 · 2015年12月31日

高稳定片层Ni-Mg-Al催化剂的可控制备及其在干气重整中的性能研究

国家自然科学基金

0+阅读 · 2015年12月31日

氧化铁晶面调控IB族金属粒子落位及反应条件下的动态行为研究

国家自然科学基金

0+阅读 · 2013年12月31日

氧化应激介导的卵泡膜-间质细胞衰老在卵巢衰老中的作用及其机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

TAM/Gas6在石英粉尘致炎性反应及纤维化中的作用及机制

国家自然科学基金

0+阅读 · 2012年12月31日

Intraflagellar Transport运输纤毛蛋白的分子机理

国家自然科学基金

0+阅读 · 2012年12月31日

三峡库区水上突发事件应急资源配置鲁棒优化研究

国家自然科学基金

0+阅读 · 2012年12月31日

ROS在调控心肌衰老过程中Beclin 1-Vps34复合体功能和自噬流的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

小电导Ca2+激活K+通道与ryanodine受体功能性偶联的研究

国家自然科学基金

0+阅读 · 2008年12月31日

非线性不连续系统的稳定与镇定

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员