Neural machine learning models can successfully model language that is similar to their training distribution, but they degrade sharply under distribution shift, which occurs in many practical applications when processing out-of-domain (OOD) text. This has been attributed to "shortcut learning": relying on weak correlations over arbitrarily large contexts. We propose a method based on OOD detection with Random Network Distillation that allows an autoregressive language model to automatically disregard OOD context during inference, smoothly transitioning towards a less expressive but more robust model as the data becomes more OOD, while retaining its full-context capability when operating in-distribution. We apply our method to a GRU architecture and demonstrate improvements on multiple language modeling (LM) datasets.
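To make the core mechanism concrete, the following is a minimal NumPy sketch of Random Network Distillation as an OOD score, and of the kind of smooth gate the abstract describes. It is not the paper's architecture: the dimensions, the linear predictor (chosen so the fit is closed-form and deterministic), and the exponential gate are all illustrative assumptions. In RND, a predictor is trained to mimic a frozen, randomly initialised target network on in-distribution data; its prediction error grows on inputs far from that data and can serve as a novelty score.

```python
import numpy as np

rng = np.random.default_rng(0)
D, H = 8, 16  # hypothetical input and embedding dimensions

# Frozen, randomly initialised target network (never trained).
W_t = rng.normal(size=(D, H))
def target(x):
    return np.tanh(x @ W_t)

# Lower-capacity linear predictor, fitted to mimic the target
# on in-distribution data only (closed-form least squares).
X_in = rng.normal(scale=0.3, size=(4096, D))
W_p, *_ = np.linalg.lstsq(X_in, target(X_in), rcond=None)

def ood_score(x):
    # RND novelty score: the predictor's error against the frozen target.
    return np.mean((x @ W_p - target(x)) ** 2, axis=-1)

def gate(x, tau=1.0):
    # Smooth weight in [0, 1]: near 1 in-distribution, decaying as the
    # score grows. Such a weight could blend a full-context model with
    # a more robust short-context fallback (illustrative, not the
    # paper's exact gating rule).
    return np.exp(-ood_score(x) / tau)

# Scale-shifted inputs stand in for "OOD" text features.
X_ood = rng.normal(scale=3.0, size=(4096, D))
print(ood_score(X_in).mean() < ood_score(X_ood).mean())  # True
print(gate(X_in).mean() > gate(X_ood).mean())            # True
```

The predictor fits the target well only where it was trained, so the error (and hence the score) rises off-distribution, and the gate smoothly shifts weight away from the full-context model.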