We study a fixed mini-batch gradient descent (FMGD) algorithm for solving optimization problems with massive datasets. In FMGD, the whole sample is split into multiple non-overlapping partitions. Once formed, the partitions are fixed throughout the rest of the algorithm; for convenience, we refer to them as fixed mini-batches. In each iteration, the gradients are then computed sequentially on the fixed mini-batches. Because the size of each fixed mini-batch is typically much smaller than the whole sample size, the corresponding gradient can be computed easily. This substantially reduces the computation cost per iteration and makes FMGD computationally efficient and practically feasible. To study the theoretical properties of FMGD, we start with a linear regression model and a constant learning rate, and examine the numerical convergence and statistical efficiency of the resulting estimator. We find that a sufficiently small learning rate is required for both numerical convergence and statistical efficiency. However, an extremely small learning rate can lead to painfully slow numerical convergence. To address this problem, a diminishing learning rate scheduling strategy can be used, which yields an FMGD estimator with faster numerical convergence and better statistical efficiency. Finally, we also study FMGD with random shuffling and with a general loss function.
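To make the procedure concrete, the following is a minimal sketch of FMGD for least-squares linear regression, written under our own assumptions: the function name fmgd_linear_regression and its parameters (M mini-batches, initial learning rate alpha, shrinkage factor gamma) are illustrative and not taken from the paper. Setting gamma = 1 corresponds to a constant learning rate; gamma < 1 gives a simple diminishing schedule.

```python
import numpy as np

def fmgd_linear_regression(X, Y, M=10, alpha=0.1, gamma=1.0, epochs=50):
    """Illustrative FMGD sketch for least-squares linear regression.

    The sample is split once into M non-overlapping mini-batches, which are
    then fixed for all subsequent epochs. In each epoch the gradient is
    computed sequentially on each fixed mini-batch. The learning rate is
    multiplied by gamma after every epoch (gamma = 1: constant learning rate;
    gamma < 1: diminishing learning rate schedule).
    """
    N, p = X.shape
    # Form the fixed, non-overlapping partitions once; they are not reshuffled later.
    idx = np.random.permutation(N)
    batches = np.array_split(idx, M)

    theta = np.zeros(p)
    lr = alpha
    for _ in range(epochs):
        for b in batches:  # sequential pass over the fixed mini-batches
            Xb, Yb = X[b], Y[b]
            grad = Xb.T @ (Xb @ theta - Yb) / len(b)  # mini-batch least-squares gradient
            theta -= lr * grad
        lr *= gamma  # optional diminishing learning rate schedule
    return theta


# Toy usage: recover coefficients of a simulated linear model.
rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 5))
theta_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
Y = X @ theta_true + rng.normal(size=10_000)
theta_hat = fmgd_linear_regression(X, Y, M=20, alpha=0.1, gamma=0.9, epochs=100)
print(theta_hat)
```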