Continuous Mean-Covariance Bandits

May 11, 2023

Yihan Du, Siwei Wang, Zhixuan Fang, Longbo Huang

Existing risk-aware multi-armed bandit models typically focus on risk measures of individual options, such as variance. As a result, they cannot be directly applied to important real-world online decision-making problems with correlated options. In this paper, we propose a novel Continuous Mean-Covariance Bandit (CMCB) model that explicitly takes option correlation into account. Specifically, in CMCB, a learner sequentially chooses weight vectors on given options and observes random feedback according to its decisions. The learner's objective is to achieve the best trade-off between reward and risk, where risk is measured by the option covariance. To capture different reward observation scenarios in practice, we consider three feedback settings: full-information, semi-bandit, and full-bandit feedback. We propose novel algorithms with optimal regrets (within logarithmic factors) and provide matching lower bounds to validate their optimality. The experimental results also demonstrate the superiority of our algorithms. To the best of our knowledge, this is the first work that considers option correlation in risk-aware bandits and explicitly quantifies how arbitrary covariance structures impact learning performance. The novel analytical techniques we develop, namely exploiting the estimated covariance to build concentration and bounding the risk of selected actions based on properties of the sampling strategy, can likely find applications in other bandit analyses and be of independent interest.
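
The abstract gives no formulas, so the following is only a minimal sketch of the setting it describes, under stated assumptions. It assumes the usual mean-variance form of the objective, f(w) = wᵀθ − ρ·wᵀΣw, with the weight vector w on the probability simplex, Gaussian rewards, and one plausible reading of what each feedback setting reveals. The names `mean_covariance_objective`, `observe`, `theta`, `sigma`, and `rho`, as well as the numbers, are hypothetical and not taken from the paper.

```python
import numpy as np


def mean_covariance_objective(w, theta, sigma, rho):
    """Assumed objective: expected reward minus rho times the risk w^T Sigma w."""
    w = np.asarray(w, dtype=float)
    return float(w @ theta - rho * (w @ sigma @ w))


def observe(w, reward, setting):
    """What the learner sees in one round (assumed reading of the three settings)."""
    w = np.asarray(w, dtype=float)
    if setting == "full-information":
        return reward.copy()                      # rewards of all options
    if setting == "semi-bandit":
        return np.where(w > 0, reward, np.nan)    # only options with positive weight
    if setting == "full-bandit":
        return float(w @ reward)                  # only the weighted sum
    raise ValueError(f"unknown feedback setting: {setting}")


# Hypothetical instance: options 0 and 1 are strongly correlated, option 2 is independent.
theta = np.array([0.30, 0.25, 0.10])              # expected rewards
sigma = np.array([[0.20, 0.15, 0.00],
                  [0.15, 0.20, 0.00],
                  [0.00, 0.00, 0.05]])            # reward covariance
rho = 1.0                                         # risk-aversion weight

rng = np.random.default_rng(0)
w = np.array([0.5, 0.0, 0.5])                     # a weight vector on the simplex
reward = rng.multivariate_normal(theta, sigma)    # one round of random rewards

concentrated = mean_covariance_objective([1.0, 0.0, 0.0], theta, sigma, rho)
diversified = mean_covariance_objective(w, theta, sigma, rho)

for setting in ("full-information", "semi-bandit", "full-bandit"):
    print(setting, observe(w, reward, setting))
print("concentrated:", concentrated, "diversified:", diversified)
```

In this toy instance, splitting weight between the best option and the uncorrelated one scores higher than putting all weight on the best option, because the off-diagonal entries of `sigma` enter the risk term. That dependence on the full covariance, rather than on per-option variances alone, is the effect the CMCB model is meant to capture.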
