Tracking Most Severe Arm Changes in Bandits - 专知论文 (Zhuanzhi Papers)


Bandits / Slot machines · Bandits · ARM · Optimizer · CASE

January 5, 2022

Tracking Most Severe Arm Changes in Bandits

Joe Suk, Samory Kpotufe

In bandits with distribution shifts, one aims to automatically detect an unknown number $L$ of changes in reward distribution and restart exploration when necessary. While this problem remained open for many years, a recent breakthrough of Auer et al. (2018, 2019) provides the first adaptive procedure to guarantee an optimal (dynamic) regret $\sqrt{LT}$ over $T$ rounds, with no knowledge of $L$. However, not all distributional shifts are equally severe: for example, if no best-arm switches occur, then we cannot rule out that a regret of $O(\sqrt{T})$ remains possible. In other words, is it possible to achieve dynamic regret that optimally scales only with an unknown number of severe shifts? This has unfortunately remained elusive, despite various attempts (Auer et al., 2019; Foster et al., 2020). We resolve this problem in the case of two-armed bandits: we derive an adaptive procedure that guarantees a dynamic regret of order $\tilde{O}(\sqrt{\tilde{L} T})$, where $\tilde{L} \ll L$ captures an unknown number of severe best-arm changes, i.e., changes with significant switches in rewards that last sufficiently long to actually require a restart. As a consequence, for any number $L$ of distributional shifts outside of these severe shifts, our procedure achieves regret of just $\tilde{O}(\sqrt{T}) \ll \tilde{O}(\sqrt{LT})$. Finally, we note that our notion of severe shift applies in both classical settings of stochastic switching bandits and of adversarial bandits.
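To make the gap between the two rates concrete, here is a minimal Python sketch, not the authors' procedure. It builds a hypothetical two-armed instance in which the reward distribution changes several times but the best arm never switches, then compares $\sqrt{LT}$-style and $\sqrt{\tilde{L}T}$-style rates. Two assumptions are made purely for illustration: the count of best-arm switches stands in as a crude proxy for severe shifts (the abstract's notion is more refined, since it also requires the switch to be significant and long-lasting), and the rate is computed from the number of phases (changes plus one) so that it stays nonzero when nothing changes. The names `count_changes` and `rate` are hypothetical helpers.

```python
import math


def count_changes(means):
    """Count changes in a two-armed mean-reward sequence.

    `means` is a list of (mean_arm0, mean_arm1) tuples, one per round.
    Returns (total_changes, best_arm_switches).
    """
    total_changes = 0
    best_arm_switches = 0
    for prev, cur in zip(means, means[1:]):
        if cur != prev:
            total_changes += 1
            prev_best = 0 if prev[0] >= prev[1] else 1
            cur_best = 0 if cur[0] >= cur[1] else 1
            if cur_best != prev_best:
                best_arm_switches += 1
    return total_changes, best_arm_switches


def rate(num_changes, horizon):
    """sqrt(phases * T), with phases = changes + 1 (a convention for this sketch)."""
    return math.sqrt((num_changes + 1) * horizon)


# Hypothetical instance: arm 0 always has mean 0.9; arm 1 alternates
# between 0.1 and 0.5 every 1000 rounds, so arm 0 is always the best arm.
T = 10_000
means = [(0.9, 0.1 if (t // 1000) % 2 == 0 else 0.5) for t in range(T)]

L, best_switches = count_changes(means)
print("distribution changes:", L)
print("best-arm switches:", best_switches)
print("rate using all changes:", rate(L, T))
print("rate using best-arm switches only:", rate(best_switches, T))
```

In this instance there are 9 distribution changes and no best-arm switch, so the rate driven by all changes is about 316 while the rate driven by best-arm switches alone is 100, ignoring constants and logarithmic factors.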


