Curriculum Learning (CL) is a technique for training models by ranking examples, typically in order of increasing difficulty, with the aim of accelerating convergence and improving generalisability. Current approaches to Natural Language Understanding (NLU) tasks use CL to improve performance on in-distribution data, often relying on heuristic-oriented or task-agnostic difficulty metrics. In this work, we instead employ CL for NLU by leveraging training dynamics as difficulty metrics, i.e., statistics that measure the behaviour of the model at hand on specific task-data instances during training, and we propose modifications of existing CL schedulers based on these statistics. Unlike existing work, we focus on evaluating models on in-distribution (ID), out-of-distribution (OOD), and zero-shot (ZS) cross-lingual transfer datasets. We show across several NLU tasks that CL with training dynamics can yield better performance, mostly in zero-shot cross-lingual transfer and OOD settings, with improvements of up to 8.5% in certain cases. Overall, our experiments indicate that training dynamics can lead to better-performing models with smoother training than other difficulty metrics, while being 20% faster on average. In addition, our analysis sheds light on the correlations between task-specific and task-agnostic metrics.
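To make the idea concrete, the sketch below illustrates one plausible way to turn training dynamics into a curriculum: per-example gold-label confidence is collected across epochs, examples are ordered from easiest to hardest, and a competence-style scheduler gradually widens the pool of visible examples. The function names, the use of mean confidence, and the particular competence formula are illustrative assumptions, not the paper's exact method.

```python
# Illustrative sketch (assumptions, not the paper's exact method): using
# training dynamics -- per-example gold-label confidence collected across
# epochs -- as a difficulty metric for a simple curriculum ordering.
import numpy as np

def collect_confidence(probs_per_epoch: np.ndarray) -> np.ndarray:
    """probs_per_epoch: (num_epochs, num_examples) probabilities assigned
    to the gold label at each epoch. Returns mean confidence per example."""
    return probs_per_epoch.mean(axis=0)

def curriculum_order(probs_per_epoch: np.ndarray) -> np.ndarray:
    """Order example indices from easiest (high mean confidence) to
    hardest (low mean confidence)."""
    confidence = collect_confidence(probs_per_epoch)
    return np.argsort(-confidence)

def competence_schedule(ordered_idx: np.ndarray, step: int, total_steps: int,
                        c0: float = 0.1) -> np.ndarray:
    """Competence-based scheduler (assumed form): at training step `step`,
    expose only the easiest c(t) fraction of the data, where c(t) grows
    from c0 to 1 over the course of training."""
    c = min(1.0, float(np.sqrt(step / total_steps * (1 - c0 ** 2) + c0 ** 2)))
    cutoff = max(1, int(c * len(ordered_idx)))
    return ordered_idx[:cutoff]

# Example usage with synthetic dynamics: 3 epochs x 6 examples.
probs = np.array([[0.90, 0.40, 0.70, 0.20, 0.80, 0.50],
                  [0.95, 0.50, 0.75, 0.30, 0.85, 0.55],
                  [0.97, 0.55, 0.80, 0.35, 0.90, 0.60]])
order = curriculum_order(probs)
print(competence_schedule(order, step=100, total_steps=1000))
```

A task-agnostic difficulty metric (e.g., sentence length) could be dropped into the same scheduler by replacing `curriculum_order`, which is what makes the comparison between task-specific training dynamics and task-agnostic heuristics straightforward to set up.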