We consider the linear contextual multi-class multi-period packing problem~(LMMP), where the goal is to pack items such that the total vector of consumption is below a given budget vector and the total value is as large as possible. We consider the setting where the reward and the consumption vector associated with each action are class-dependent linear functions of the context, and the decision-maker receives bandit feedback. LMMP includes linear contextual bandits with knapsacks and online revenue management as special cases. We establish a new, more efficient estimator that guarantees a faster convergence rate and, consequently, a lower regret in such problems. We propose a bandit policy that is a closed-form function of these estimated parameters. When the contexts are non-degenerate, the regret of the proposed policy is sublinear in the context dimension, the number of classes, and the time horizon~$T$ when the budget grows at least as fast as $\sqrt{T}$. We also resolve an open problem posed by Agrawal & Devanur (2016) and extend the result to a multi-class setting. Our numerical experiments demonstrate that the performance of our policy is superior to other benchmarks in the literature.
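For concreteness, the LMMP setting described above can be summarized by the following optimization problem. This is a minimal sketch under assumed notation not fixed by the abstract: $x_t$ denotes the context at period~$t$, $a_t$ the chosen action, $\theta_{a_t}$ and $W_{a_t}$ the class-dependent reward and consumption parameters of that action, and $B$ the budget vector.
\begin{equation*}
% A minimal sketch of the LMMP objective under assumed notation:
% choose actions a_t to maximize expected total reward while keeping
% total expected consumption within the budget vector B.
\max_{a_1,\dots,a_T} \; \mathbb{E}\!\left[\sum_{t=1}^{T} x_t^\top \theta_{a_t}\right]
\quad \text{subject to} \quad
\sum_{t=1}^{T} W_{a_t} x_t \le B,
\end{equation*}
where $x_t^\top \theta_{a_t}$ is the expected reward and $W_{a_t} x_t$ the expected consumption vector, both linear in the context as stated above; in the bandit-feedback model the realized reward and consumption are noisy observations of these quantities, and the paper's constraint may bind on realized rather than expected consumption.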