A Framework for Adapting Offline Algorithms to Solve Combinatorial Multi-Armed Bandit Problems with Bandit Feedback - 专知 (Zhuanzhi) Papers


Multi-armed bandits · Approximation · Functionals · Discretization · Black box

January 30, 2023

A Framework for Adapting Offline Algorithms to Solve Combinatorial Multi-Armed Bandit Problems with Bandit Feedback


Guanyu Nie, Yididiya Y Nadew, Yanhui Zhu, Vaneet Aggarwal, Christopher John Quinn

We investigate the problem of stochastic, combinatorial multi-armed bandits where the learner only has access to bandit feedback and the reward function can be non-linear. We provide a general framework for adapting discrete offline approximation algorithms into sublinear $\alpha$-regret methods that only require bandit feedback, achieving $\mathcal{O}\left(T^{\frac{2}{3}}\log(T)^{\frac{1}{3}}\right)$ expected cumulative $\alpha$-regret dependence on the horizon $T$. The framework only requires the offline algorithms to be robust to small errors in function evaluation. The adaptation procedure does not even require explicit knowledge of the offline approximation algorithm: the offline algorithm can be used as a black-box subroutine. To demonstrate its utility, we apply the proposed framework to multiple problems in submodular maximization, adapting approximation algorithms for cardinality and for knapsack constraints. The new CMAB algorithms for knapsack constraints outperform a full-bandit method developed for the adversarial setting in experiments with real-world data.
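The abstract does not spell out the adaptation procedure, so the following is only a minimal sketch of one way a black-box offline algorithm could be driven by bandit feedback. It assumes an explore-then-commit scheme: whenever the offline algorithm asks for the value of an action, that action is played `m` times and the average reward is returned, and the algorithm's output is then played for the remaining rounds. The toy coverage environment (`COVERAGE`, `make_environment`), the plain greedy routine (`offline_greedy`), the wrapper (`explore_then_commit`), and the hand-picked value of `m` are all hypothetical names and choices made for illustration, not the authors' implementation.

```python
import random
from typing import Callable, Dict, FrozenSet, Tuple

Action = FrozenSet[int]

# Hypothetical toy problem: each arm covers a few items; the expected reward of
# a set of arms is the fraction of items covered (a monotone submodular function).
COVERAGE = {0: {0, 1, 2, 3}, 1: {4, 5, 6}, 2: {0, 1}, 3: {4}, 4: {2}, 5: {6}}
NUM_ITEMS = 7


def make_environment(seed: int, noise_std: float = 0.1) -> Callable[[Action], float]:
    """Return a bandit-feedback oracle: one noisy scalar reward per played action."""
    rng = random.Random(seed)

    def play(action: Action) -> float:
        covered = set().union(*(COVERAGE[a] for a in action))
        return len(covered) / NUM_ITEMS + rng.gauss(0.0, noise_std)

    return play


def offline_greedy(value: Callable[[Action], float], n: int, k: int) -> Action:
    """Plain greedy under a cardinality constraint; `value` is only queried as a black box."""
    chosen: set = set()
    for _ in range(k):
        candidates = [a for a in range(n) if a not in chosen]
        # Maximizing the value of the extended set equals maximizing the marginal gain.
        best = max(candidates, key=lambda a: value(frozenset(chosen | {a})))
        chosen.add(best)
    return frozenset(chosen)


def explore_then_commit(
    play: Callable[[Action], float],
    offline_alg: Callable[[Callable[[Action], float]], Action],
    horizon: int,
    m: int,
) -> Tuple[Action, int, float]:
    """Answer each value query with the mean of m plays, then commit to the output."""
    cache: Dict[Action, float] = {}
    rounds_used = 0
    total_reward = 0.0

    def noisy_oracle(action: Action) -> float:
        nonlocal rounds_used, total_reward
        if action not in cache:
            samples = [play(action) for _ in range(m)]
            rounds_used += m
            total_reward += sum(samples)
            cache[action] = sum(samples) / m
        return cache[action]

    committed = offline_alg(noisy_oracle)
    # Exploitation phase: play the committed action for whatever rounds remain.
    for _ in range(max(0, horizon - rounds_used)):
        total_reward += play(committed)
    return committed, rounds_used, total_reward


if __name__ == "__main__":
    env = make_environment(seed=0)
    action, explored, reward = explore_then_commit(
        env, lambda oracle: offline_greedy(oracle, n=6, k=2), horizon=10_000, m=200
    )
    print(sorted(action), explored, round(reward, 1))
```

In this sketch the wrapper never inspects `offline_greedy`; it only answers its value queries, which mirrors the black-box property the abstract describes. A larger `m` gives more accurate value estimates but leaves fewer rounds for the committed action, and how to balance the two is exactly the kind of trade-off a regret analysis has to resolve.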

