最坏情况之外带背包的强盗 (Bandits with Knapsacks beyond the Worst-Case) - 专知论文

会员服务 ·

0

赌博机/老虎机 · Bandits · Performer · SimPLe · 生成模型 ·

2021 年 10 月 26 日

Bandits with Knapsacks beyond the Worst-Case

翻译：最坏情况之外带背包的强盗

Karthik Abinav Sankararaman,Aleksandrs Slivkins

from arxiv, The initial version, titled "Advances in Bandits with Knapsacks", was published on arxiv.org in Jan'20. The present version improves both upper and lower bounds, deriving Theorem 3.2(ii) and Theorem 4.2. Moreover, it simplifies the algorithm and analysis in the main result, and fixes several issues in the lower bounds

Bandits with Knapsacks (BwK) is a general model for multi-armed bandits under supply/budget constraints. While worst-case regret bounds for BwK are well-understood, we present three results that go beyond the worst-case perspective. First, we provide upper and lower bounds which amount to a full characterization for logarithmic, instance-dependent regret rates. Second, we consider "simple regret" in BwK, which tracks algorithm's performance in a given round, and prove that it is small in all but a few rounds. Third, we provide a general "reduction" from BwK to bandits which takes advantage of some known helpful structure, and apply this reduction to combinatorial semi-bandits, linear contextual bandits, and multinomial-logit bandits. Our results build on the BwK algorithm from \citet{AgrawalDevanur-ec14}, providing new analyses thereof.

翻译：使用 Knapsacks (BwK) 的强盗是多武装强盗在供应/预算限制下的一般模式。虽然 BwK 最坏的遗憾界限已经非常清楚, 但我们展示了三种结果, 超越了最坏的视角。首先, 我们提供上下界限, 相当于对数率的完整描述。第二, 我们考虑 BwK 中的“ 简单遗憾 ”, 它跟踪了算法在特定回合中的表现, 并证明算法在几轮之外都很小。第三, 我们提供从 BwK 到强盗的一般“ 削减 ”, 利用一些已知的有用结构, 并将这一削减应用于组合式半带、线性背景强盗和多种族- logit 强盗。我们的结果建立在\ citet{ AgrawalDevanur- ec14} 的 BwK 算法上, 提供了新的分析。

0

相关内容

赌博机/老虎机

赌博机/老虎机

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

专知会员服务

112+阅读 · 2020年5月15日

【基于模型的强化学习的博弈论框架】A Game Theoretic Framework for Model Based Reinforcement Learning

【基于模型的强化学习的博弈论框架】A Game Theoretic Framework for Model Based Reinforcement Learning

专知会员服务

131+阅读 · 2020年4月19日

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

专知会员服务

77+阅读 · 2020年2月8日

【论文推荐WWW2020-UIUC】修正排序系统中的选择偏差：Correcting for Selection Bias in Learning-to-rank Systems

【论文推荐WWW2020-UIUC】修正排序系统中的选择偏差：Correcting for Selection Bias in Learning-to-rank Systems

专知会员服务

32+阅读 · 2020年2月1日

在线变分推断，76页ppt，A Regret Bound for Online Variational Inference

在线变分推断，76页ppt，A Regret Bound for Online Variational Inference

专知会员服务

21+阅读 · 2019年12月2日

【电子书推荐】强化学习（Reinforcement Learning）法兰克福大学 | Cornelius Weber

【电子书推荐】强化学习（Reinforcement Learning）法兰克福大学 | Cornelius Weber

专知会员服务

45+阅读 · 2019年11月19日

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

10+阅读 · 2019年10月24日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

LibRec 精选：近期15篇推荐系统论文

LibRec 精选：近期15篇推荐系统论文

LibRec智能推荐

5+阅读 · 2019年3月5日

【泡泡一分钟】基于运动估计的激光雷达和相机标定方法

【泡泡一分钟】基于运动估计的激光雷达和相机标定方法

泡泡机器人SLAM

25+阅读 · 2019年1月17日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

【论文推荐】最新八篇推荐系统相关论文—亿级商品嵌入、主动学习、树深度模型、知识图谱、注意力感知、矩阵分解、神经个性化嵌入

【论文推荐】最新八篇推荐系统相关论文—亿级商品嵌入、主动学习、树深度模型、知识图谱、注意力感知、矩阵分解、神经个性化嵌入

专知

15+阅读 · 2018年6月15日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

Hermite--Padé approximations with Pfaffian structures: Novikov peakon equation and integrable lattices

Arxiv

0+阅读 · 2021年12月23日

A GenEO Domain Decomposition method for Saddle Point problems

Arxiv

0+阅读 · 2021年12月23日

Active Learning for Domain Adaptation: An Energy-based Approach

Arxiv

13+阅读 · 2021年12月2日

Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret

Arxiv

8+阅读 · 2021年4月22日

Data Poisoning Attacks and Defenses to Crowdsourcing Systems

Arxiv

8+阅读 · 2021年2月18日

Unifying Online and Counterfactual Learning to Rank

Arxiv

6+阅读 · 2020年12月8日

Model-based Adversarial Meta-Reinforcement Learning

Arxiv

5+阅读 · 2020年6月16日

Knowledge Graph Completion to Predict Polypharmacy Side Effects

Arxiv

3+阅读 · 2018年10月22日

(FPT-)Approximation Algorithms for the Virtual Network Embedding Problem

Arxiv

4+阅读 · 2018年3月12日

Variance-based regularization with convex objectives

Arxiv

5+阅读 · 2017年12月14日

VIP会员

文章信息

相关主题

赌博机/老虎机

相关VIP内容

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

专知会员服务

112+阅读 · 2020年5月15日

【基于模型的强化学习的博弈论框架】A Game Theoretic Framework for Model Based Reinforcement Learning

【基于模型的强化学习的博弈论框架】A Game Theoretic Framework for Model Based Reinforcement Learning

专知会员服务

131+阅读 · 2020年4月19日

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

专知会员服务

77+阅读 · 2020年2月8日

【论文推荐WWW2020-UIUC】修正排序系统中的选择偏差：Correcting for Selection Bias in Learning-to-rank Systems

【论文推荐WWW2020-UIUC】修正排序系统中的选择偏差：Correcting for Selection Bias in Learning-to-rank Systems

专知会员服务

32+阅读 · 2020年2月1日

在线变分推断，76页ppt，A Regret Bound for Online Variational Inference

在线变分推断，76页ppt，A Regret Bound for Online Variational Inference

专知会员服务

21+阅读 · 2019年12月2日

【电子书推荐】强化学习（Reinforcement Learning）法兰克福大学 | Cornelius Weber

【电子书推荐】强化学习（Reinforcement Learning）法兰克福大学 | Cornelius Weber

专知会员服务

45+阅读 · 2019年11月19日

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

10+阅读 · 2019年10月24日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【CMU博士论文】基础模型训练中网络规模数据的负责任与高效使用

《俄乌战争背景下俄罗斯的战略性海军分析（2022-2025年）》最新100页报告

人工智能时代背景下的未来海战

相关资讯

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

LibRec 精选：近期15篇推荐系统论文

LibRec 精选：近期15篇推荐系统论文

LibRec智能推荐

5+阅读 · 2019年3月5日

【泡泡一分钟】基于运动估计的激光雷达和相机标定方法

【泡泡一分钟】基于运动估计的激光雷达和相机标定方法

泡泡机器人SLAM

25+阅读 · 2019年1月17日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

【论文推荐】最新八篇推荐系统相关论文—亿级商品嵌入、主动学习、树深度模型、知识图谱、注意力感知、矩阵分解、神经个性化嵌入

【论文推荐】最新八篇推荐系统相关论文—亿级商品嵌入、主动学习、树深度模型、知识图谱、注意力感知、矩阵分解、神经个性化嵌入

专知

15+阅读 · 2018年6月15日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

相关论文

Hermite--Padé approximations with Pfaffian structures: Novikov peakon equation and integrable lattices

Arxiv

0+阅读 · 2021年12月23日

A GenEO Domain Decomposition method for Saddle Point problems

Arxiv

0+阅读 · 2021年12月23日

Active Learning for Domain Adaptation: An Energy-based Approach

Arxiv

13+阅读 · 2021年12月2日

Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret

Arxiv

8+阅读 · 2021年4月22日

Data Poisoning Attacks and Defenses to Crowdsourcing Systems

Arxiv

8+阅读 · 2021年2月18日

Unifying Online and Counterfactual Learning to Rank

Arxiv

6+阅读 · 2020年12月8日

Model-based Adversarial Meta-Reinforcement Learning

Arxiv

5+阅读 · 2020年6月16日

Knowledge Graph Completion to Predict Polypharmacy Side Effects

Arxiv

3+阅读 · 2018年10月22日

(FPT-)Approximation Algorithms for the Virtual Network Embedding Problem

Arxiv

4+阅读 · 2018年3月12日

Variance-based regularization with convex objectives

Arxiv

5+阅读 · 2017年12月14日

微信扫码咨询专知VIP会员