Understanding Gradient Clipping in Private SGD: A Geometric Perspective - 专知 Papers

Gradient clipping · SGD · Biased · Interpretability · Learning

March 18, 2021

Understanding Gradient Clipping in Private SGD: A Geometric Perspective

Xiangyi Chen,Zhiwei Steven Wu,Mingyi Hong

Deep learning models are increasingly popular in many machine learning applications where the training data may contain sensitive information. To provide formal and rigorous privacy guarantees, many learning systems now incorporate differential privacy by training their models with (differentially) private SGD. A key step in each private SGD update is gradient clipping, which shrinks the gradient of an individual example whenever its L2 norm exceeds some threshold. We first demonstrate how gradient clipping can prevent SGD from converging to a stationary point. We then provide a theoretical analysis that fully quantifies the clipping bias on convergence with a disparity measure between the gradient distribution and a geometrically symmetric distribution. Our empirical evaluation further suggests that the gradient distributions along the trajectory of private SGD indeed exhibit a symmetric structure that favors convergence. Together, our results provide an explanation of why private SGD with gradient clipping remains effective in practice despite its potential clipping bias. Finally, we develop a new perturbation-based technique that can provably correct the clipping bias even for instances with highly asymmetric gradient distributions.
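
The clipping step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names, the `noise_multiplier` parameter, and the Gaussian noise scaling are assumptions borrowed from the common DP-SGD recipe, and the toy gradients are made up for illustration. The sketch shows per-example L2 clipping, and a one-dimensional case where the raw gradients average to zero (a stationary point) while the clipped gradients do not.

```python
import numpy as np


def clip_per_example(grads, clip_norm):
    """Shrink each row of `grads` so that its L2 norm is at most `clip_norm`.

    Rows whose norm is already below the threshold are left unchanged.
    """
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    # Guard against division by zero for all-zero gradients.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    return grads * scale


def private_sgd_step(w, grads, lr, clip_norm, noise_multiplier, rng):
    """One private SGD update (assumed form): clip, sum, add noise, average."""
    clipped = clip_per_example(grads, clip_norm)
    # Assumption: Gaussian noise with std = noise_multiplier * clip_norm.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=w.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / len(grads)
    return w - lr * noisy_mean


# Asymmetric toy gradients: one large positive, three small negative.
asymmetric = np.array([[3.0], [-1.0], [-1.0], [-1.0]])
print(asymmetric.mean(axis=0))                         # raw mean is 0
print(clip_per_example(asymmetric, 1.0).mean(axis=0))  # clipped mean is -0.5

# Symmetric toy gradients: clipping keeps the mean at 0.
symmetric = np.array([[3.0], [-3.0]])
print(clip_per_example(symmetric, 1.0).mean(axis=0))
```

In the asymmetric case the clipped average is nonzero even though the true gradient vanishes, so the update keeps moving away from the stationary point. In the symmetric case the two clipped gradients still cancel, which matches the abstract's point that symmetric gradient distributions favor convergence.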
