Asymptotically Optimal Sampling Policy for Selecting Top-m Alternatives

November 30, 2021

Gongbo Zhang,Yijie Peng,Jianghua Zhang,Enlu Zhou

We consider selecting the top-$m$ alternatives from a finite number of alternatives via Monte Carlo simulation. Under a Bayesian framework, we formulate the sampling decision as a stochastic dynamic programming problem and develop a sequential sampling policy that maximizes a value function approximation one step ahead. To show the asymptotic optimality of the proposed procedure, we rigorously define the asymptotically optimal sampling ratios, which optimize the large deviations rate of the probability of false selection for selecting the top-$m$ alternatives. The proposed sampling policy is proved to be consistent and to achieve the asymptotically optimal sampling ratios. Numerical experiments demonstrate the superiority of the proposed allocation procedure over existing ones.
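To make the setting concrete, the following is a minimal sketch of a generic sequential sampling loop for top-$m$ selection. It is not the policy proposed in the paper: the abstract does not give the value function approximation, so a simple boundary-distance heuristic is swapped in as a stand-in. The function name `select_top_m`, the parameters `n0`, `sigma`, and `budget`, the known-variance normal model, and the noninformative prior are all assumptions made for illustration.

```python
import math
import random


def select_top_m(true_means, m, budget, n0=5, sigma=1.0, seed=0):
    """Sequentially sample alternatives and return the estimated top-m set.

    Assumptions (not from the paper): normal observations with known
    standard deviation `sigma`, a noninformative prior (so the posterior
    mean equals the sample mean), and a boundary-distance heuristic used
    in place of the paper's value function approximation.
    """
    rng = random.Random(seed)
    k = len(true_means)
    if not 0 < m < k:
        raise ValueError("m must satisfy 0 < m < number of alternatives")
    counts = [0] * k
    sums = [0.0] * k

    def sample(i):
        # One Monte Carlo replication of alternative i.
        sums[i] += rng.gauss(true_means[i], sigma)
        counts[i] += 1

    # Initial stage: n0 replications per alternative.
    for i in range(k):
        for _ in range(n0):
            sample(i)

    # Sequential stage: spend the remaining budget one replication at a time.
    for _ in range(budget - n0 * k):
        means = [sums[i] / counts[i] for i in range(k)]
        ranked = sorted(means, reverse=True)
        # Midpoint between the m-th and (m+1)-th largest posterior means.
        boundary = 0.5 * (ranked[m - 1] + ranked[m])
        # Standardized distance of each posterior mean from that boundary.
        z = [abs(means[i] - boundary) * math.sqrt(counts[i]) / sigma
             for i in range(k)]
        # Sample the alternative whose membership in the top-m set is least certain.
        sample(min(range(k), key=z.__getitem__))

    means = [sums[i] / counts[i] for i in range(k)]
    top = sorted(range(k), key=means.__getitem__, reverse=True)[:m]
    return sorted(top), counts
```

Calling `select_top_m([0.0, 10.0, 20.0, 30.0, 40.0], m=2, budget=100)` returns the indices of the estimated top two alternatives together with the number of replications given to each one. The heuristic concentrates the budget on alternatives near the boundary between the selected and unselected sets, which is the intuition behind allocation rules of this kind; whether it approaches the asymptotically optimal sampling ratios described in the abstract is not claimed here.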
