Robust Contextual Linear Bandits - 专知 Papers

Bandits · Robustness · Statistics · Linear · Contextual

October 26, 2022

Robust Contextual Linear Bandits

Rong Zhu, Branislav Kveton

Model misspecification is a major consideration in applications of statistical methods and machine learning. However, it is often neglected in contextual bandits. This paper studies a common form of misspecification, an inter-arm heterogeneity that is not captured by context. To address this issue, we assume that the heterogeneity arises due to arm-specific random variables, which can be learned. We call this setting a robust contextual bandit. The arm-specific variables explain the unknown inter-arm heterogeneity, and we incorporate them in the robust contextual estimator of the mean reward and its uncertainty. We develop two efficient bandit algorithms for our setting: a UCB algorithm called RoLinUCB and a posterior-sampling algorithm called RoLinTS. We analyze both algorithms and bound their $n$-round Bayes regret. Our experiments show that RoLinTS is comparably statistically efficient to the classic methods when the misspecification is low, more robust when the misspecification is high, and significantly more computationally efficient than its naive implementation.
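To make the setting concrete, here is a minimal sketch of posterior sampling when each arm carries its own random offset. It is not the paper's RoLinTS. It assumes a Gaussian model in which the reward of arm `a` with context `x` is `x @ theta + v[a]` plus noise, and it samples from the joint posterior over `theta` and all offsets `v`, which is closer in spirit to the naive implementation the abstract mentions. The class name `JointGaussianTS`, the function `simulate`, and all prior and noise parameters are hypothetical choices made for illustration.

```python
import numpy as np


class JointGaussianTS:
    """Thompson sampling over (theta, v_1..v_K) using augmented features [x; e_a].

    Assumed model (not taken from the paper): reward = x @ theta + v[arm] + noise,
    with independent zero-mean Gaussian priors on theta and on each offset v[arm].
    """

    def __init__(self, d, num_arms, sigma=1.0, sigma_theta=1.0, sigma_arm=1.0, seed=0):
        self.d, self.num_arms, self.sigma = d, num_arms, sigma
        prior_var = np.concatenate(
            [np.full(d, sigma_theta**2), np.full(num_arms, sigma_arm**2)]
        )
        self.precision = np.diag(1.0 / prior_var)  # posterior precision matrix
        self.b = np.zeros(d + num_arms)            # precision-weighted mean
        self.rng = np.random.default_rng(seed)

    def _augment(self, x, arm):
        # Append a one-hot arm indicator so the offset v[arm] acts as a coefficient.
        z = np.zeros(self.d + self.num_arms)
        z[: self.d] = x
        z[self.d + arm] = 1.0
        return z

    def posterior(self):
        cov = np.linalg.inv(self.precision)
        cov = (cov + cov.T) / 2.0  # symmetrize against round-off
        return cov @ self.b, cov

    def select(self, contexts):
        # Draw one joint sample and act greedily with respect to it.
        mean, cov = self.posterior()
        sample = self.rng.multivariate_normal(mean, cov)
        scores = [self._augment(x, a) @ sample for a, x in enumerate(contexts)]
        return int(np.argmax(scores))

    def update(self, x, arm, reward):
        # Standard Bayesian linear regression update on the augmented feature.
        z = self._augment(x, arm)
        self.precision += np.outer(z, z) / self.sigma**2
        self.b += z * reward / self.sigma**2


def simulate(n_rounds=500, d=3, num_arms=5, noise=0.5, seed=0):
    """Run the sketch on synthetic data drawn from the assumed model."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=d)
    arm_effects = rng.normal(size=num_arms)  # heterogeneity not captured by context
    agent = JointGaussianTS(d, num_arms, sigma=noise, seed=seed + 1)
    regret = 0.0
    for _ in range(n_rounds):
        contexts = rng.normal(size=(num_arms, d))
        means = contexts @ theta + arm_effects
        arm = agent.select(contexts)
        agent.update(contexts[arm], arm, means[arm] + rng.normal(scale=noise))
        regret += means.max() - means[arm]
    return regret, agent
```

The joint posterior has dimension `d + num_arms`, so every round inverts a matrix of that size. This is the cost that grows with the number of arms, and the abstract reports that RoLinTS is significantly more computationally efficient than its naive implementation.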
