A Convex Relaxation Approach to Bayesian Regret Minimization in Offline Bandits - 专知论文


Multi-armed bandits · Confidence · Optimizers · Learning · Bandits

June 2, 2023

A Convex Relaxation Approach to Bayesian Regret Minimization in Offline Bandits


Mohammad Ghavamzadeh, Marek Petrik, Guy Tennenholtz

Algorithms for offline bandits must optimize decisions in uncertain environments using only offline data. A compelling and increasingly popular objective in offline bandits is to learn a policy which achieves low Bayesian regret with high confidence. An appealing approach to this problem, inspired by recent offline reinforcement learning results, is to maximize a form of lower confidence bound (LCB). This paper proposes a new approach that directly minimizes upper bounds on Bayesian regret using efficient conic optimization solvers. Our bounds build on connections among Bayesian regret, Value-at-Risk (VaR), and chance-constrained optimization. Compared to prior work, our algorithm attains superior theoretical offline regret bounds and better results in numerical simulations. Finally, we provide some evidence that popular LCB-style algorithms may be unsuitable for minimizing Bayesian regret in offline bandits.
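To make the quantities named in the abstract concrete, here is a minimal Python sketch of a sample-based estimate of high-confidence Bayesian regret (the Value-at-Risk of the regret under the posterior) next to a simple LCB-style arm choice. It is not the paper's algorithm, which relies on conic optimization. The function names, the use of posterior samples of mean rewards, and the empirical-quantile estimates are all assumptions made for illustration.

```python
import numpy as np


def regret_samples(policy, theta_samples):
    """Regret of a randomized policy under each posterior sample.

    policy:        shape (n_arms,), nonnegative weights summing to 1.
    theta_samples: shape (n_samples, n_arms), sampled mean rewards
                   (hypothetical posterior draws).
    """
    best = theta_samples.max(axis=1)      # reward of the best arm per sample
    achieved = theta_samples @ policy     # expected reward of the policy
    return best - achieved


def regret_var(policy, theta_samples, delta):
    """Empirical (1 - delta)-quantile of the regret (a VaR estimate)."""
    return float(np.quantile(regret_samples(policy, theta_samples), 1.0 - delta))


def lcb_action(theta_samples, delta):
    """LCB-style choice: arm with the largest empirical delta-quantile."""
    lcb = np.quantile(theta_samples, delta, axis=0)
    return int(np.argmax(lcb))


if __name__ == "__main__":
    # Toy posterior: either arm may be the good one, with equal weight.
    theta = np.array([[1.0, 0.0], [0.0, 1.0]])
    delta = 0.1
    uniform = np.array([0.5, 0.5])
    greedy = np.eye(2)[lcb_action(theta, delta)]  # deterministic LCB policy
    print("VaR of regret, uniform policy:", regret_var(uniform, theta, delta))
    print("VaR of regret, LCB policy:    ", regret_var(greedy, theta, delta))
```

In this toy posterior the randomized policy has a lower high-confidence regret than the deterministic LCB choice, which is the kind of gap the abstract alludes to. The example is only suggestive and proves nothing about the general case.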

