Pessimism for Offline Linear Contextual Bandits using $\ell_p$ Confidence Sets - 专知 Papers


Contextual bandits · Confidence · Linear · Bandits · Learning

May 21, 2022

Pessimism for Offline Linear Contextual Bandits using $\ell_p$ Confidence Sets


Gene Li, Cong Ma, Nathan Srebro

We present a family $\{\hat{\pi}_p\}_{p\ge 1}$ of pessimistic learning rules for offline learning of linear contextual bandits, relying on confidence sets with respect to different $\ell_p$ norms, where $\hat{\pi}_2$ corresponds to Bellman-consistent pessimism (BCP), while $\hat{\pi}_\infty$ is a novel generalization of lower confidence bound (LCB) to the linear setting. We show that the novel $\hat{\pi}_\infty$ learning rule is, in a sense, adaptively optimal, as it achieves the minimax performance (up to log factors) against all $\ell_q$-constrained problems, and as such it strictly dominates all other predictors in the family, including $\hat{\pi}_2$.
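As a rough illustration of how pessimism over an $\ell_p$ confidence set can be computed, the Python sketch below scores each action by its worst-case reward over an assumed set $\{\theta : \|\Sigma^{1/2}(\theta - \hat{\theta})\|_p \le \beta\}$. The abstract does not spell out the confidence set, so this form, the radius `beta`, and all function names are assumptions made for illustration, not the authors' definitions. By Hölder duality, the worst case over such a set has the closed form $\phi^\top \hat{\theta} - \beta \|\Sigma^{-1/2}\phi\|_q$ with $1/p + 1/q = 1$.

```python
import numpy as np


def dual_exponent(p):
    """Return q with 1/p + 1/q = 1, for p in [1, inf]."""
    if p == 1:
        return np.inf
    if np.isinf(p):
        return 1.0
    return p / (p - 1.0)


def pessimistic_values(features, theta_hat, cov, beta, p):
    """Worst-case reward of each action over an assumed l_p confidence set.

    features : (A, d) array, one feature vector per action (hypothetical input).
    theta_hat: (d,) point estimate of the reward parameter.
    cov      : (d, d) symmetric positive definite data covariance (assumed).
    beta     : radius of the confidence set (assumed, not from the abstract).
    p        : norm index of the confidence set, p >= 1 or np.inf.
    """
    q = dual_exponent(p)
    # Symmetric inverse square root of cov via its eigendecomposition.
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = (evecs / np.sqrt(evals)) @ evecs.T
    # Row i holds cov^{-1/2} @ features[i], because inv_sqrt is symmetric.
    whitened = features @ inv_sqrt
    penalties = np.linalg.norm(whitened, ord=q, axis=1)
    return features @ theta_hat - beta * penalties


def pessimistic_action(features, theta_hat, cov, beta, p):
    """Pick the action with the largest worst-case reward."""
    return int(np.argmax(pessimistic_values(features, theta_hat, cov, beta, p)))


if __name__ == "__main__":
    cov = np.diag([4.0, 1.0])
    theta_hat = np.array([1.0, 1.0])
    features = np.array([[2.0, 0.0], [0.0, 1.0], [2.0, 1.0]])
    for p in (1, 2, np.inf):
        print(p, pessimistic_values(features, theta_hat, cov, 0.5, p))
```

Under this assumed form, `p = 2` gives the familiar elliptical penalty, while `p = np.inf` penalizes with an $\ell_1$ norm, which is never smaller than the $\ell_2$ penalty for the same `beta`.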


