We provide a decision-theoretic analysis of bandit experiments. Working within the framework of diffusion asymptotics, we define suitable notions of asymptotic Bayes and minimax risk for these experiments. For normally distributed rewards, the minimal Bayes risk can be characterized as the solution to a second-order partial differential equation (PDE). Using a limit-of-experiments approach, we show that this PDE characterization also holds asymptotically under both parametric and non-parametric reward distributions. The approach further identifies the state variables to which it is asymptotically sufficient to restrict attention, thereby suggesting a practical strategy for dimension reduction. The PDEs characterizing minimal Bayes risk can be solved efficiently using sparse matrix routines, and we derive the optimal Bayes and minimax policies from their numerical solutions. These optimal policies substantially dominate existing methods such as Thompson sampling and UCB, often by a factor of two. The framework also covers time discounting and pure exploration.
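As an illustration of the kind of computation the abstract alludes to, the sketch below solves a generic second-order PDE by an implicit finite-difference scheme with sparse matrix routines. The equation used here (a one-dimensional diffusion equation with a terminal payoff) is a stand-in chosen for simplicity, not the paper's actual Bayes-risk PDE; the grid sizes, payoff, and boundary treatment are all illustrative assumptions.

```python
# Hedged sketch: implicit finite-difference solve of a second-order PDE
# using sparse matrices. We integrate V_tau = 0.5 * V_xx (tau = time to go)
# from the terminal condition V(x) = max(x, 0) -- a placeholder payoff,
# not the Bayes-risk PDE from the paper.
import numpy as np
from scipy.sparse import diags, identity
from scipy.sparse.linalg import splu

nx, nt = 201, 200                      # illustrative grid sizes
x = np.linspace(-3.0, 3.0, nx)
dx = x[1] - x[0]
dt = 1.0 / nt

V = np.maximum(x, 0.0)                 # terminal condition (placeholder)

# Sparse second-difference operator; zero Dirichlet values at the
# truncated boundaries (acceptable for this illustration).
lap = diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(nx, nx)) / dx**2
A = identity(nx) - 0.5 * dt * lap      # one implicit Euler step

lu = splu(A.tocsc())                   # factorize once, reuse every step
for _ in range(nt):
    V = lu.solve(V)                    # V now holds the value at tau = 1

print(V[nx // 2])                      # value at x = 0
```

Factorizing the sparse operator once with `splu` and reusing it across time steps is what makes schemes like this cheap: each step is a pair of triangular solves rather than a fresh linear solve.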