Tracking Most Significant Arm Switches in Bandits - 专知论文


Bandits · ARM · Optimizer · Knowledge · CASE

June 16, 2022

Tracking Most Significant Arm Switches in Bandits


Joe Suk, Samory Kpotufe

In bandits with distribution shifts, one aims to automatically adapt to unknown changes in reward distribution, and restart exploration when necessary. While this problem has been studied for many years, a recent breakthrough of Auer et al. (2018, 2019) provides the first adaptive procedure to guarantee an optimal (dynamic) regret $\sqrt{LT}$, for $T$ rounds, and an unknown number $L$ of changes. However, while this rate is tight in the worst case, it remained open whether faster rates are possible, without prior knowledge, if few changes in distribution are actually severe. To resolve this question, we propose a new notion of significant shift, which only counts very severe changes that clearly necessitate a restart: roughly, these are changes involving not only best arm switches, but also involving large aggregate differences in reward over time. Thus, our resulting procedure adaptively achieves rates always faster (sometimes significantly) than $O(\sqrt{ST})$, where $S\ll L$ only counts best arm switches, while at the same time, always faster than the optimal $O(V^{\frac{1}{3}}T^{\frac{2}{3}})$ when expressed in terms of total variation $V$ (which aggregates differences over time). Our results are expressed in enough generality to also capture non-stochastic adversarial settings.
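To get a feel for how the benchmark rates in the abstract compare, the following minimal sketch evaluates $\sqrt{LT}$, $\sqrt{ST}$, and $V^{1/3}T^{2/3}$ with constants and logarithmic factors dropped. It does not implement the authors' procedure; the function names and the values chosen for $T$, $L$, $S$, and $V$ are hypothetical and serve only as illustration.

```python
import math


def rate_changes(L: float, T: float) -> float:
    """Benchmark sqrt(L*T): L counts all changes in reward distribution."""
    return math.sqrt(L * T)


def rate_switches(S: float, T: float) -> float:
    """Benchmark sqrt(S*T): S counts only best-arm switches."""
    return math.sqrt(S * T)


def rate_total_variation(V: float, T: float) -> float:
    """Benchmark V^(1/3) * T^(2/3): V is the total variation over time."""
    return V ** (1 / 3) * T ** (2 / 3)


if __name__ == "__main__":
    # Hypothetical instance: many small changes, few best-arm switches.
    T, L, S, V = 10_000, 100, 4, 8
    print("sqrt(LT)        =", rate_changes(L, T))
    print("sqrt(ST)        =", rate_switches(S, T))
    print("V^(1/3) T^(2/3) =", rate_total_variation(V, T))
```

When $S \ll L$, the $\sqrt{ST}$ benchmark is much smaller than $\sqrt{LT}$; which of $\sqrt{ST}$ and $V^{1/3}T^{2/3}$ is smaller depends on the instance.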

