In this paper, we study the problem of bandits with knapsacks (BwK) in a non-stationary environment. The BwK problem generalizes the multi-armed bandit (MAB) problem to model the resource consumption associated with playing each arm. At each time, the decision maker/player chooses to play an arm, receives a reward, and consumes a certain amount of each of multiple resource types. The objective is to maximize the cumulative reward over a finite horizon subject to knapsack constraints on the resources. Existing works study the BwK problem under either a stochastic or an adversarial environment. Our paper considers a non-stationary environment which continuously interpolates between these two extremes. We first show that, due to the presence of the constraints, the traditional notion of variation budget is insufficient to characterize the non-stationarity of the BwK problem in a way that permits sublinear regret, and we then propose a new global non-stationarity measure. We employ both non-stationarity measures to derive upper and lower bounds for the problem. Our results are based on a primal-dual analysis of the underlying linear programs and highlight the interplay between the constraints and the non-stationarity. Finally, we also extend the non-stationarity measure to the problem of online convex optimization with constraints and obtain new regret bounds accordingly.
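For the reader's reference, the display below sketches the standard per-round fluid linear program that is commonly used as a benchmark in the BwK literature; the notation ($K$ arms, $d$ resource types, horizon $T$, budgets $B_j$, expected reward $r_i$, expected consumption $c_{i,j}$, and per-round play rates $x_i$) is introduced here purely for illustration and is not taken from the paper itself.

% A minimal sketch of the per-round fluid LP benchmark for BwK.
% All symbols below are illustrative notation, not the paper's own.
\begin{align*}
  \max_{x \in \mathbb{R}^{K}_{\ge 0}} \quad & \sum_{i=1}^{K} r_i \, x_i \\
  \text{s.t.} \quad & \sum_{i=1}^{K} c_{i,j} \, x_i \le \frac{B_j}{T}, \qquad j = 1, \dots, d, \\
  & \sum_{i=1}^{K} x_i \le 1.
\end{align*}

In the stochastic setting, the optimal value of this LP (scaled by $T$) upper-bounds the expected reward of any policy, and its dual variables price the resources; this is the type of primal-dual structure the abstract refers to when analyzing how non-stationarity interacts with the constraints.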