The multi-armed bandit (MAB) problem requires making sequential decisions while balancing exploration and exploitation, and has been successfully applied to a wide range of practical scenarios. Various algorithms have been designed to achieve high reward in the long term. However, their short-term performance can be rather poor, which is harmful in risk-sensitive applications. Building on previous work on conservative bandits, we propose a framework of contextual combinatorial conservative bandits. We present an algorithm and prove a regret bound of $\tilde O(d^2+d\sqrt{T})$, where $d$ is the dimension of the feature vectors and $T$ is the total number of time steps. We further provide an algorithm, together with a regret analysis, for the case where the conservative reward is unknown. Experiments are conducted, and the results validate the effectiveness of our algorithms.