Implicitly normalized forecaster with clipping for linear and non-linear heavy-tailed multi-armed bandits


May 19, 2023


Yuriy Dorn, Nikita Kornilov, Nikolay Kutuzov, Alexander Nazin, Eduard Gorbunov, Alexander Gasnikov

The Implicitly Normalized Forecaster (INF) algorithm is considered to be an optimal solution for adversarial multi-armed bandit (MAB) problems. However, most of the existing complexity results for INF rely on restrictive assumptions, such as bounded rewards. Recently, a related algorithm was proposed that works in both adversarial and stochastic heavy-tailed MAB settings, but it fails to fully exploit the available data. In this paper, we propose a new version of INF, the Implicitly Normalized Forecaster with clipping (INF-clip), for MAB problems with heavy-tailed reward distributions. We establish convergence results under mild assumptions on the reward distribution and demonstrate that INF-clip is optimal for linear heavy-tailed stochastic MAB problems and works well for non-linear ones. Furthermore, we show that INF-clip outperforms the best-of-both-worlds algorithm in cases where it is difficult to distinguish between different arms.
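The abstract does not spell out the update rule, so the following is only a rough sketch of the general idea (clip each observed loss before it enters an importance-weighted INF update), not the authors' algorithm. It assumes the Tsallis-entropy form of INF with exponent 1/2, a fixed step size `eta`, and a fixed clipping level `lam`. The function names `inf_probabilities` and `run_inf_clip_sketch` and the callback `sample_loss` are hypothetical.

```python
import numpy as np

def inf_probabilities(cum_losses, eta, iters=60):
    """Arm probabilities p_i = 4 / (eta * (L_i - x))^2, with the implicit
    normalizer x < min(L) found by bisection so that the p_i sum to 1.
    (Assumed Tsallis-1/2 form of INF; not taken from the abstract.)"""
    L = np.asarray(cum_losses, dtype=float)
    k = L.size
    # The sum of p_i is <= 1 at `lo` and >= 1 at `hi`, so the root lies between.
    lo = L.min() - 2.0 * np.sqrt(k) / eta
    hi = L.min() - 2.0 / eta
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        total = np.sum(4.0 / (eta * (L - mid)) ** 2)
        if total > 1.0:
            hi = mid  # x too close to min(L): probabilities too large
        else:
            lo = mid
    p = 4.0 / (eta * (L - 0.5 * (lo + hi))) ** 2
    return p / p.sum()  # remove residual bisection error

def run_inf_clip_sketch(sample_loss, n_arms, horizon, eta, lam, seed=0):
    """Play `horizon` rounds; return how often each arm was pulled."""
    rng = np.random.default_rng(seed)
    cum = np.zeros(n_arms)             # importance-weighted cumulative losses
    pulls = np.zeros(n_arms, dtype=int)
    for _ in range(horizon):
        p = inf_probabilities(cum, eta)
        arm = rng.choice(n_arms, p=p)
        # Clipping bounds the effect of a single heavy-tailed observation.
        loss = float(np.clip(sample_loss(arm, rng), -lam, lam))
        cum[arm] += loss / p[arm]      # importance-weighted estimate
        pulls[arm] += 1
    return pulls

if __name__ == "__main__":
    means = np.array([0.0, 0.5, 1.0])
    # Student-t noise with 2 degrees of freedom has infinite variance.
    heavy = lambda arm, rng: means[arm] + rng.standard_t(2)
    print(run_inf_clip_sketch(heavy, n_arms=3, horizon=2000, eta=0.05, lam=10.0))
```

In this sketch `lam` trades bias for variance: a smaller value suppresses outliers more aggressively but distorts the estimated mean loss of each arm, which is why the choice of clipping level would need to follow the paper's analysis rather than the fixed value used here.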
