We are concerned with the problem of hyperparameter selection for fitted Q-evaluation (FQE). FQE is one of the state-of-the-art methods for offline policy evaluation (OPE), which is essential to reinforcement learning without environment simulators. However, like other OPE methods, FQE is not itself hyperparameter-free, which undermines its utility in real-life applications. We address this issue by proposing a framework of approximate hyperparameter selection (AHS) for FQE, which defines a notion of optimality (called the selection criteria) in a quantitative and interpretable manner without hyperparameters. We then derive four AHS methods, each of which has different characteristics such as distribution-mismatch tolerance and time complexity. We also confirm in experiments that the error bound given by the theory matches empirical observations.
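For concreteness, below is a minimal sketch of the FQE procedure (iteratively regressing a Q-function onto bootstrapped targets under the target policy) for a discrete action space. It is not the authors' implementation; the regressor class, dataset format, discount factor, and iteration count are illustrative assumptions.

```python
# Minimal illustrative sketch of fitted Q-evaluation (FQE), assuming a
# discrete action space and an sklearn-style regressor. Not the authors' code.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor


def fqe(transitions, target_policy, n_actions, gamma=0.99, n_iters=50):
    """Estimate Q^pi of `target_policy` from logged transitions.

    transitions: list of (state, action, reward, next_state) tuples, where
                 states are 1-D feature vectors and actions are ints.
    target_policy: callable state -> probability vector over actions.
    """
    s = np.array([t[0] for t in transitions])
    a = np.array([t[1] for t in transitions])
    r = np.array([t[2] for t in transitions])
    s_next = np.array([t[3] for t in transitions])

    # Regression features: state concatenated with a one-hot action encoding.
    def featurize(states, actions):
        one_hot = np.eye(n_actions)[actions]
        return np.hstack([states, one_hot])

    x = featurize(s, a)
    q = None  # current Q-function estimate (a fitted regressor)

    for _ in range(n_iters):
        if q is None:
            targets = r  # first iteration: regress on immediate rewards
        else:
            # Bootstrapped target: r + gamma * E_{a'~pi}[Q_k(s', a')]
            pi_next = np.array([target_policy(sn) for sn in s_next])
            q_next = np.column_stack([
                q.predict(featurize(s_next, np.full(len(s_next), act)))
                for act in range(n_actions)
            ])
            targets = r + gamma * np.sum(pi_next * q_next, axis=1)
        q = GradientBoostingRegressor().fit(x, targets)

    # Evaluating q at initial states with actions drawn from the target
    # policy yields the OPE estimate of the policy value.
    return q
```

The regressor class, the number of iterations, and the discount treatment in this sketch are precisely the kind of hyperparameters whose selection the proposed AHS framework is meant to address.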