Using only a single floating-point number in the line search technique for Newton's method might be inadequate. A column vector of the same size as the gradient might serve better than a single float, accelerating each gradient element at its own rate. Moreover, a square matrix of the same order as the Hessian matrix might help to correct the Hessian itself. Chiang applied something between a column vector and a square matrix, namely a diagonal matrix, to accelerate the gradient, and further proposed a faster gradient variant called the quadratic gradient. In this paper, we present a new way to build the quadratic gradient, yielding a new version of it. This new quadratic gradient does not satisfy the convergence conditions of the fixed-Hessian Newton's method; nevertheless, experimental results show that it sometimes converges faster than the original version. Chiang also speculated that there might be a relation between the Hessian matrix and the learning rate of first-order gradient descent. We prove that the floating-point number $\frac{1}{\epsilon + \max \{ |\lambda_i| \}}$ can serve as a good learning rate for gradient methods, where $\epsilon$ is a small constant to avoid division by zero and the $\lambda_i$ are the eigenvalues of the Hessian matrix.
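To make the proposed learning rate concrete, the following is a minimal sketch in Python/NumPy, under the assumption of a symmetric Hessian; the helper names `suggested_learning_rate` and `gradient_descent` are hypothetical and only illustrate a plain first-order step with step size $\frac{1}{\epsilon + \max \{ |\lambda_i| \}}$, not the full method of the paper.

```python
import numpy as np

def suggested_learning_rate(hessian, eps=1e-8):
    """Learning rate 1 / (eps + max |lambda_i|), where the lambda_i are the
    eigenvalues of the (symmetric) Hessian and eps avoids division by zero."""
    eigvals = np.linalg.eigvalsh(hessian)
    return 1.0 / (eps + np.max(np.abs(eigvals)))

def gradient_descent(grad_fn, hess_fn, x0, iters=100, eps=1e-8):
    """Plain first-order gradient descent whose step size is set from the
    Hessian spectrum at the current iterate (hypothetical helper)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        lr = suggested_learning_rate(hess_fn(x), eps)
        x = x - lr * grad_fn(x)
    return x

# Toy usage: minimize f(x) = 0.5 x^T A x - b^T x, whose Hessian is the fixed matrix A.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
grad = lambda x: A @ x - b
hess = lambda x: A
x_min = gradient_descent(grad, hess, np.zeros(2))   # approaches the solution of A x = b
```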