This work proposes a time-efficient Natural Gradient Descent method, called TENGraD, with linear convergence guarantees. Computing the inverse of the neural network's Fisher information matrix is expensive in NGD because the Fisher matrix is large. Approximate NGD methods such as KFAC attempt to improve NGD's running time and practical applicability by reducing this inversion cost through approximation. However, the approximations do not significantly reduce the overall time and lead to less accurate parameter updates and a loss of curvature information. TENGraD improves the time efficiency of NGD by computing Fisher block inverses with a computationally efficient covariance factorization and reuse method. It computes the inverse of each block exactly using the Woodbury matrix identity, preserving curvature information while admitting a fast (linear) convergence rate. Our experiments on image classification tasks with state-of-the-art deep neural architectures on CIFAR-10, CIFAR-100, and Fashion-MNIST show that TENGraD significantly outperforms state-of-the-art NGD methods, and often stochastic gradient descent, in wall-clock time.
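As an illustration of the block inversion the abstract refers to (a sketch under assumed notation; the exact factorization and damping used by TENGraD are defined in the paper body), the Woodbury matrix identity inverts a damped low-rank Fisher block F = \lambda I_p + U U^\top, with U \in \mathbb{R}^{p \times n} and n \ll p, at the cost of an n \times n solve rather than a p \times p one:

\[ (\lambda I_p + U U^\top)^{-1} \;=\; \tfrac{1}{\lambda}\Bigl( I_p \;-\; U\,(\lambda I_n + U^\top U)^{-1} U^\top \Bigr). \]

Only the small Gram matrix U^\top U must be formed and factored, which is consistent with the covariance factorization and reuse strategy described above.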