Bandits with Knapsacks (BwK), the generalization of Multi-Armed Bandits to settings with budget constraints, has received much attention in recent years. It has numerous applications, including dynamic pricing and repeated auctions. Previous work has focused on one of two extremes: Stochastic BwK, where the rewards and resource consumptions in each round are sampled i.i.d. from a fixed distribution, and Adversarial BwK, where these values are picked by an adversary. Achievable guarantees in the two cases exhibit a massive gap: no-regret learning is achievable in Stochastic BwK, but in Adversarial BwK only competitive-ratio-style guarantees are achievable, where the competitive ratio depends on the budget. What makes this gap so vast is that in Adversarial BwK the guarantees get worse in the typical case when the budget is more binding. While ``best-of-both-worlds'' algorithms are known (algorithms that provide the best achievable guarantee in each of the two extreme cases), their guarantees degrade to the adversarial ones as soon as the environment is not fully stochastic. Our work aims to bridge this gap, offering guarantees for workloads that are not exactly stochastic yet far from worst-case. We define a condition, Approximately Stationary BwK, that parameterizes how close an instance is to the stochastic or the adversarial extreme. Based on these parameters, we explore the best competitive ratio attainable in BwK. We present two algorithms that are oblivious to the values of the parameters but guarantee competitive ratios that smoothly interpolate between the best possible guarantees in the two extreme cases, depending on the values of the parameters. Our guarantees substantially improve over the adversarial guarantee, especially when the available budget is small. We also prove bounds on the best achievable guarantee, showing that our results are approximately tight when the budget is small.