使用最佳基准值减少计分函数差异 (Variance Reduction for Score Functions Using Optimal Baselines) - 专知论文

会员服务 ·

0

估计/估计量 · 泛函 · 基准 · 方差减小 · 方差 ·

2022 年 12 月 27 日

Variance Reduction for Score Functions Using Optimal Baselines

翻译：使用最佳基准值减少计分函数差异

Ronan Keane,H. Oliver Gao

Many problems involve the use of models which learn probability distributions or incorporate randomness in some way. In such problems, because computing the true expected gradient may be intractable, a gradient estimator is used to update the model parameters. When the model parameters directly affect a probability distribution, the gradient estimator will involve score function terms. This paper studies baselines, a variance reduction technique for score functions. Motivated primarily by reinforcement learning, we derive for the first time an expression for the optimal state-dependent baseline, the baseline which results in a gradient estimator with minimum variance. Although we show that there exist examples where the optimal baseline may be arbitrarily better than a value function baseline, we find that the value function baseline usually performs similarly to an optimal baseline in terms of variance reduction. Moreover, the value function can also be used for bootstrapping estimators of the return, leading to additional variance reduction. Our results give new insight and justification for why value function baselines and the generalized advantage estimator (GAE) work well in practice.

翻译：许多问题涉及使用模型来学习概率分布或以某种方式纳入随机性。在这些问题中,由于计算真实的预期梯度可能难以解决,因此使用梯度估计器来更新模型参数。当模型参数直接影响概率分布时,梯度估计器将涉及得分函数。本文研究基准,即分数的差分减少技术。主要通过强化学习,我们第一次得出了最佳状态依赖基线的表达方式,该基准导致梯度估计器产生最小差异。虽然我们表明有实例表明最佳基线可能任意优于价值函数基线,但我们发现,价值函数基线通常在减少差异方面与最佳基线相似。此外,值函数功能也可以用于计算回报的推车估计器,从而导致更多的差异减少。我们的结果提供了新的洞察力和理由,说明为何价值函数基线和普遍优势估计器(GAE)在实践中运作良好。

0

相关内容

估计/估计量

估计/估计量

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

75+阅读 · 2022年6月28日

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

机器学习损失函数概述，Loss Functions in Machine Learning

机器学习损失函数概述，Loss Functions in Machine Learning

专知会员服务

83+阅读 · 2022年3月19日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

公共就业规模和结构优化的机理与模型

国家自然科学基金

0+阅读 · 2014年12月31日

接力创新中大数据价值的评估与分配研究

国家自然科学基金

1+阅读 · 2014年12月31日

采用pinball loss的MEE算法研究

国家自然科学基金

1+阅读 · 2013年12月31日

混凝土Weibull统计尺寸效应理论模型改进研究

国家自然科学基金

0+阅读 · 2013年12月31日

政府中期支出框架体系研究

国家自然科学基金

0+阅读 · 2013年12月31日

大规模机器学习问题的结构优化方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

多元代数插值的计算机数学方法

国家自然科学基金

1+阅读 · 2011年12月31日

BEC的保几何结构数值模拟与研究

国家自然科学基金

0+阅读 · 2011年12月31日

改进的Unscented卡尔曼滤波与电池组SOC快速精确估计

国家自然科学基金

0+阅读 · 2008年12月31日

A Numerical Approach to Optimal Sequential Multi-Hypothesis Testing

Arxiv

0+阅读 · 2023年2月26日

On Deep Generative Models for Approximation and Estimation of Distributions on Manifolds

Arxiv

0+阅读 · 2023年2月25日

Truly Bayesian Entropy Estimation

Arxiv

0+阅读 · 2023年2月25日

Scalable Unbalanced Sobolev Transport for Measures on a Graph

Arxiv

0+阅读 · 2023年2月24日

Using Automated Algorithm Configuration for Parameter Control

Arxiv

0+阅读 · 2023年2月23日

Evaluation of Combinatorial Optimisation Algorithms for c-Optimal Experimental Designs with Correlated Observations

Arxiv

0+阅读 · 2023年2月23日

High-dimensional limit theorems for SGD: Effective dynamics and critical scaling

Arxiv

0+阅读 · 2023年2月23日

When is Momentum Extragradient Optimal? A Polynomial-Based Analysis

Arxiv

0+阅读 · 2023年2月23日

The Causal Learning of Retail Delinquency

Arxiv

14+阅读 · 2020年12月17日

IEOPF: An Active Contour Model for Image Segmentation with Inhomogeneities Estimated by Orthogonal Primary Functions

Arxiv

10+阅读 · 2018年1月20日

VIP会员

文章信息

相关主题

估计/估计量

相关VIP内容

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

75+阅读 · 2022年6月28日

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

机器学习损失函数概述，Loss Functions in Machine Learning

机器学习损失函数概述，Loss Functions in Machine Learning

专知会员服务

83+阅读 · 2022年3月19日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【CMU博士论文】以人为中心的强化学习

任务规划与地形分析：现代复杂环境作战导航体系

认知优势：人工智能在国家安全决策中的核心作用

大模型赋能的具身智能：决策与具身学习综述

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

相关论文

A Numerical Approach to Optimal Sequential Multi-Hypothesis Testing

Arxiv

0+阅读 · 2023年2月26日

On Deep Generative Models for Approximation and Estimation of Distributions on Manifolds

Arxiv

0+阅读 · 2023年2月25日

Truly Bayesian Entropy Estimation

Arxiv

0+阅读 · 2023年2月25日

Scalable Unbalanced Sobolev Transport for Measures on a Graph

Arxiv

0+阅读 · 2023年2月24日

Using Automated Algorithm Configuration for Parameter Control

Arxiv

0+阅读 · 2023年2月23日

Evaluation of Combinatorial Optimisation Algorithms for c-Optimal Experimental Designs with Correlated Observations

Arxiv

0+阅读 · 2023年2月23日

High-dimensional limit theorems for SGD: Effective dynamics and critical scaling

Arxiv

0+阅读 · 2023年2月23日

When is Momentum Extragradient Optimal? A Polynomial-Based Analysis

Arxiv

0+阅读 · 2023年2月23日

The Causal Learning of Retail Delinquency

Arxiv

14+阅读 · 2020年12月17日

IEOPF: An Active Contour Model for Image Segmentation with Inhomogeneities Estimated by Orthogonal Primary Functions

Arxiv

10+阅读 · 2018年1月20日

相关基金

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

公共就业规模和结构优化的机理与模型

国家自然科学基金

0+阅读 · 2014年12月31日

接力创新中大数据价值的评估与分配研究

国家自然科学基金

1+阅读 · 2014年12月31日

采用pinball loss的MEE算法研究

国家自然科学基金

1+阅读 · 2013年12月31日

混凝土Weibull统计尺寸效应理论模型改进研究

国家自然科学基金

0+阅读 · 2013年12月31日

政府中期支出框架体系研究

国家自然科学基金

0+阅读 · 2013年12月31日

大规模机器学习问题的结构优化方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

多元代数插值的计算机数学方法

国家自然科学基金

1+阅读 · 2011年12月31日

BEC的保几何结构数值模拟与研究

国家自然科学基金

0+阅读 · 2011年12月31日

改进的Unscented卡尔曼滤波与电池组SOC快速精确估计

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员