We propose an evolution strategies-based algorithm for estimating gradients in unrolled computation graphs, called ES-Single. Like the recently proposed Persistent Evolution Strategies (PES), ES-Single is unbiased, and overcomes chaos arising from recursive function applications by smoothing the meta-loss landscape. ES-Single samples a single perturbation per particle, which is kept fixed over the course of an inner problem (i.e., perturbations are not re-sampled for each partial unroll). Compared to PES, ES-Single is simpler to implement and has lower variance: the variance of ES-Single is constant with respect to the number of truncated unrolls, removing a key barrier to applying ES to long inner problems using short truncations. We show that ES-Single is unbiased for quadratic inner problems, and demonstrate empirically that its variance can be substantially lower than that of PES. ES-Single consistently outperforms PES on a variety of tasks, including a synthetic benchmark task, hyperparameter optimization, training recurrent neural networks, and training learned optimizers.
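The core mechanism described above can be illustrated with a minimal sketch. This is a hedged, illustrative implementation, not the authors' code: the toy quadratic inner problem, the particle count, the smoothing scale `sigma`, and the antithetic-sampling variant are all assumptions made for the example. The key point it demonstrates is that each particle's perturbation `eps` is drawn once per inner problem and held fixed across all partial unrolls, rather than being re-sampled at each truncation as in PES.

```python
import numpy as np

def inner_unroll(theta, state, n_steps):
    """Toy quadratic inner problem (an assumption for illustration):
    the state contracts and is pushed by theta; per-step loss is quadratic."""
    total_loss = 0.0
    for _ in range(n_steps):
        state = 0.9 * state + theta      # one step of the inner dynamics
        total_loss += np.sum(state ** 2)  # quadratic per-step loss
    return state, total_loss

def es_single_grad(theta, n_particles=512, sigma=0.1,
                   total_steps=20, truncation=5, seed=0):
    """Sketch of an ES-Single-style antithetic estimator: one perturbation
    per particle, held fixed over every truncated unroll of the inner problem."""
    rng = np.random.default_rng(seed)
    d = theta.shape[0]
    grad = np.zeros(d)
    for _ in range(n_particles):
        eps = sigma * rng.standard_normal(d)  # sampled ONCE per particle
        s_pos = np.zeros(d)
        s_neg = np.zeros(d)
        loss_pos = loss_neg = 0.0
        # Partial unrolls: state is carried forward, eps is NOT re-sampled.
        for _ in range(total_steps // truncation):
            s_pos, lp = inner_unroll(theta + eps, s_pos, truncation)
            s_neg, ln = inner_unroll(theta - eps, s_neg, truncation)
            loss_pos += lp
            loss_neg += ln
        # Antithetic ES gradient contribution from this particle.
        grad += (loss_pos - loss_neg) * eps / (2 * sigma ** 2)
    return grad / n_particles
```

Because the inner loss here is quadratic in `theta`, this estimator's mean matches the true gradient of the full unrolled loss, consistent with the unbiasedness claim for quadratic inner problems; holding `eps` fixed is what keeps the variance independent of the number of truncated unrolls.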