Curvature in the form of the Hessian or its generalized Gauss-Newton (GGN) approximation is valuable for algorithms that rely on a local model of the loss to train, compress, or explain deep networks. Existing methods based on implicit multiplication via automatic differentiation or Kronecker-factored block-diagonal approximations do not consider noise in the mini-batch. We present ViViT, a curvature model that leverages the GGN's low-rank structure without further approximations. It allows for efficient computation of eigenvalues and eigenvectors, as well as per-sample first- and second-order directional derivatives. The representation is computed in parallel with gradients in one backward pass and offers a fine-grained cost-accuracy trade-off, which allows it to scale. As examples of ViViT's usefulness, we investigate the directional gradients and curvatures during training, and how noise information can be used to improve the stability of second-order methods.
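The key computational idea behind the low-rank structure can be illustrated with a toy sketch (this is not the ViViT implementation; it assumes a square loss, so the GGN factor reduces to the scaled per-sample Jacobian, and the matrices here are dense only for checking): the GGN factors as G = V Vᵀ with a tall-and-skinny V, so its nonzero eigenvalues can be obtained from the small Gram matrix Vᵀ V without ever forming G.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D = 8, 20  # N mini-batch samples, D parameters; N << D, so rank(G) <= N

# Hypothetical per-sample Jacobians of the model output w.r.t. the parameters.
# For square loss the GGN is G = (1/N) * J^T J, i.e. G = V V^T with V = J^T / sqrt(N).
J = rng.standard_normal((N, D))
V = J.T / np.sqrt(N)  # D x N low-rank factor

# Full D x D GGN -- formed here only to verify; the low-rank approach never builds it.
G = V @ V.T

# N x N Gram matrix: its eigenvalues are exactly the nonzero eigenvalues of G.
gram = V.T @ V
evals_gram = np.sort(np.linalg.eigvalsh(gram))[::-1]
evals_full = np.sort(np.linalg.eigvalsh(G))[::-1][:N]

print(np.allclose(evals_gram, evals_full))  # True: top-N spectra agree
```

The cost of the eigendecomposition thus scales with the Gram matrix size (mini-batch size times number of outputs) rather than with the number of parameters, which is what makes the exact low-rank representation tractable for large networks.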