通用战略 (Linear Connectivity Reveals Generalization Strategies) - 专知论文

会员服务 ·

0

簇 · 泛化理论 · 线性的 · MoDELS · Microsoft Surface ·

2022 年 7 月 9 日

Linear Connectivity Reveals Generalization Strategies

翻译：通用战略

Jeevesh Juneja,Rachit Bansal,Kyunghyun Cho,João Sedoc,Naomi Saphra

It is widely accepted in the mode connectivity literature that when two neural networks are trained similarly on the same data, they are connected by a path through parameter space over which test set accuracy is maintained. Under some circumstances, including transfer learning from pretrained models, these paths are presumed to be linear. In contrast to existing results, we find that among text classifiers (trained on MNLI, QQP, and CoLA), some pairs of finetuned models have large barriers of increasing loss on the linear paths between them. On each task, we find distinct clusters of models which are linearly connected on the test loss surface, but are disconnected from models outside the cluster -- models that occupy separate basins on the surface. By measuring performance on specially-crafted diagnostic datasets, we find that these clusters correspond to different generalization strategies: one cluster behaves like a bag of words model under domain shift, while another cluster uses syntactic heuristics. Our work demonstrates how the geometry of the loss surface can guide models towards different heuristic functions.

翻译：模式连接文献普遍认为,当两个神经网络在相同的数据上接受类似的培训时,它们通过一个通过参数空间的路径连接起来,而该参数空间的测试集的准确性得到维持。在某些情况下,包括从预先培训的模型中转移学习,这些路径被推定为线性。与现有的结果相反,我们发现,在文本分类者(在MNLI、QP和COLA上受过培训)中,有些微调模型对彼此的线性路径造成更大的损失。在每项任务中,我们发现不同的模型群群群,这些模型在测试损失表面上线性连接,但与组群以外的模型脱节 -- -- 占地表不同盆地的模型。通过测量专门设计的诊断数据集的性能,我们发现这些组群与不同的一般化战略相对应:一个组群群在域转移下表现像一袋文字模型,而另一个组群群则使用合成性外观。我们的工作表明,损失表面的几何测量方法可以引导模型走向不同的超度功能。

0

相关内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新5篇图像分割（Image Segmentation）相关论文—多重假设、超像素分割、自监督、图、生成对抗网络

【论文推荐】最新5篇图像分割（Image Segmentation）相关论文—多重假设、超像素分割、自监督、图、生成对抗网络

专知

27+阅读 · 2018年2月7日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

回声干扰抑制中的自适应信号处理算法研究

国家自然科学基金

1+阅读 · 2015年12月31日

Ago2磷酸化在DNA损伤修复中的功能分析

国家自然科学基金

0+阅读 · 2014年12月31日

非ABA依赖型SnRK2激酶调控马铃薯响应干旱胁迫的机制解析

国家自然科学基金

0+阅读 · 2014年12月31日

面向移动互联网的机会性数据传输研究

国家自然科学基金

0+阅读 · 2013年12月31日

Egr3调控造血干细胞功能的机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

拟南芥STRF1蛋白参与植物盐胁迫信号转导途径的机理研究

国家自然科学基金

0+阅读 · 2013年12月31日

Doublesex基因在对虾性别决定和分化中的功能研究

国家自然科学基金

0+阅读 · 2013年12月31日

自组装太阳能电池（二）

国家自然科学基金

0+阅读 · 2012年12月31日

热休克转录因子 1 作为癌靶蛋白在肝细胞癌中的分子调控机理

国家自然科学基金

0+阅读 · 2009年12月31日

TR3相互作用新蛋白机理研究

国家自然科学基金

1+阅读 · 2008年12月31日

Revisiting Outer Optimization in Adversarial Training

Arxiv

0+阅读 · 2022年9月2日

A Framework for Supervised Heterogeneous Transfer Learning using Dynamic Distribution Adaptation and Manifold Regularization

Arxiv

0+阅读 · 2022年9月2日

Model Transparency and Interpretability : Survey and Application to the Insurance Industry

Model Transparency and Interpretability : Survey and Application to the Insurance Industry

Arxiv

0+阅读 · 2022年9月1日

A Predictor-Corrector Strategy for Adaptivity in Dynamical Low-Rank Approximations

A Predictor-Corrector Strategy for Adaptivity in Dynamical Low-Rank Approximations

Arxiv

0+阅读 · 2022年9月1日

Coded Caching with Shared Caches and Private Caches

Arxiv

1+阅读 · 2022年9月1日

A Genetic Algorithm-based Framework for Learning Statistical Power Manifold

Arxiv

0+阅读 · 2022年9月1日

Active Learning for Domain Adaptation: An Energy-based Approach

Arxiv

13+阅读 · 2021年12月2日

Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks

Arxiv

14+阅读 · 2021年1月31日

Attention U-Net: Learning Where to Look for the Pancreas

Arxiv

17+阅读 · 2018年5月20日

Conditional Random Field and Deep Feature Learning for Hyperspectral Image Segmentation

Arxiv

11+阅读 · 2017年12月27日

VIP会员

文章信息

相关主题

Microsoft Surface

相关VIP内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

小规模训练指南：打造世界级大语言模型的关键方法

无人机编队飞行：复杂环境中作战的策略、挑战与应用

大模型APP，AI时代第一个爆款

从数据中心视角出发的高效大语言模型训练综述

相关资讯

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新5篇图像分割（Image Segmentation）相关论文—多重假设、超像素分割、自监督、图、生成对抗网络

【论文推荐】最新5篇图像分割（Image Segmentation）相关论文—多重假设、超像素分割、自监督、图、生成对抗网络

专知

27+阅读 · 2018年2月7日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

相关论文

Revisiting Outer Optimization in Adversarial Training

Arxiv

0+阅读 · 2022年9月2日

A Framework for Supervised Heterogeneous Transfer Learning using Dynamic Distribution Adaptation and Manifold Regularization

Arxiv

0+阅读 · 2022年9月2日

Model Transparency and Interpretability : Survey and Application to the Insurance Industry

Model Transparency and Interpretability : Survey and Application to the Insurance Industry

Arxiv

0+阅读 · 2022年9月1日

A Predictor-Corrector Strategy for Adaptivity in Dynamical Low-Rank Approximations

A Predictor-Corrector Strategy for Adaptivity in Dynamical Low-Rank Approximations

Arxiv

0+阅读 · 2022年9月1日

Coded Caching with Shared Caches and Private Caches

Arxiv

1+阅读 · 2022年9月1日

A Genetic Algorithm-based Framework for Learning Statistical Power Manifold

Arxiv

0+阅读 · 2022年9月1日

Active Learning for Domain Adaptation: An Energy-based Approach

Arxiv

13+阅读 · 2021年12月2日

Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks

Arxiv

14+阅读 · 2021年1月31日

Attention U-Net: Learning Where to Look for the Pancreas

Arxiv

17+阅读 · 2018年5月20日

Conditional Random Field and Deep Feature Learning for Hyperspectral Image Segmentation

Arxiv

11+阅读 · 2017年12月27日

相关基金

回声干扰抑制中的自适应信号处理算法研究

国家自然科学基金

1+阅读 · 2015年12月31日

Ago2磷酸化在DNA损伤修复中的功能分析

国家自然科学基金

0+阅读 · 2014年12月31日

非ABA依赖型SnRK2激酶调控马铃薯响应干旱胁迫的机制解析

国家自然科学基金

0+阅读 · 2014年12月31日

面向移动互联网的机会性数据传输研究

国家自然科学基金

0+阅读 · 2013年12月31日

Egr3调控造血干细胞功能的机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

拟南芥STRF1蛋白参与植物盐胁迫信号转导途径的机理研究

国家自然科学基金

0+阅读 · 2013年12月31日

Doublesex基因在对虾性别决定和分化中的功能研究

国家自然科学基金

0+阅读 · 2013年12月31日

自组装太阳能电池（二）

国家自然科学基金

0+阅读 · 2012年12月31日

热休克转录因子 1 作为癌靶蛋白在肝细胞癌中的分子调控机理

国家自然科学基金

0+阅读 · 2009年12月31日

TR3相互作用新蛋白机理研究

国家自然科学基金

1+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员