A standard assumption adopted in the multi-armed bandit (MAB) framework is that the mean rewards are constant over time. This assumption can be restrictive in the business world, as decision-makers often face an evolving environment in which the mean rewards are time-varying. In this paper, we consider a non-stationary MAB model with $K$ arms whose mean rewards vary over time in a periodic manner. The unknown periods can differ across arms and can scale polynomially with the horizon length $T$. We propose a two-stage policy that combines Fourier analysis with a confidence-bound-based learning procedure to learn the periods and minimize the regret. In stage one, the policy correctly estimates the periods of all arms with high probability. In stage two, the policy explores the periodic mean rewards of the arms using the periods estimated in stage one and exploits the optimal arm in the long run. We show that our learning policy incurs a regret upper bound of $\tilde{O}(\sqrt{T\sum_{k=1}^K T_k})$, where $T_k$ is the period of arm $k$. Moreover, we establish a general lower bound of $\Omega(\sqrt{T\max_{k}\{ T_k\}})$ for any policy. Therefore, our policy is near-optimal up to a factor of $\sqrt{K}$.
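To make the Fourier-based period-estimation step concrete, the following is a minimal illustrative sketch, not the paper's exact stage-one procedure: it estimates a single arm's period by locating the dominant peak of the periodogram of its observed rewards. The function name `estimate_period`, the period-range parameters, and the sinusoidal toy arm are all hypothetical choices for illustration.

```python
import numpy as np

def estimate_period(rewards, min_period=2, max_period=None):
    """Illustrative sketch: estimate an arm's period from noisy reward
    observations by picking the dominant frequency of the periodogram."""
    rewards = np.asarray(rewards, dtype=float)
    n = len(rewards)
    if max_period is None:
        max_period = n // 2
    centered = rewards - rewards.mean()            # remove the mean (DC) component
    power = np.abs(np.fft.rfft(centered)) ** 2     # periodogram of the observations
    freqs = np.fft.rfftfreq(n)                     # frequencies in cycles per time step
    periods = np.full_like(freqs, np.inf)
    periods[1:] = 1.0 / freqs[1:]                  # skip the zero frequency
    mask = (periods >= min_period) & (periods <= max_period)
    best = np.argmax(np.where(mask, power, -np.inf))
    return int(round(periods[best]))

# Toy usage: one arm with period 7 plus Gaussian reward noise (hypothetical data).
rng = np.random.default_rng(0)
t = np.arange(5000)
obs = 0.5 + 0.3 * np.sin(2 * np.pi * t / 7) + 0.1 * rng.standard_normal(t.size)
print(estimate_period(obs))   # typically prints 7
```

In stage two, one would then treat each estimated period $T_k$ as known and run a confidence-bound exploration over the $T_k$ per-phase mean rewards of each arm; the sketch above only covers the period-recovery step.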