January 11, 2021

Relaxing the I.I.D. Assumption: Adaptively Minimax Optimal Regret via Root-Entropic Regularization


Blair Bilodeau, Jeffrey Negrea, Daniel M. Roy

From arXiv; 71 pages, 2 figures. Blair Bilodeau and Jeffrey Negrea contributed equally; their order was determined randomly.

We consider sequential prediction with expert advice when data are generated from distributions varying arbitrarily within an unknown constraint set. We quantify relaxations of the classical i.i.d. assumption in terms of these constraint sets, with i.i.d. sequences at one extreme and adversarial mechanisms at the other. The Hedge algorithm, long known to be minimax optimal in the adversarial regime, was recently shown to be minimax optimal for i.i.d. data. We show that Hedge with deterministic learning rates is suboptimal between these extremes, and present a new algorithm that adaptively achieves the minimax optimal rate of regret with respect to our relaxations of the i.i.d. assumption, and does so without knowledge of the underlying constraint set. We analyze our algorithm using the follow-the-regularized-leader framework, and prove it corresponds to Hedge with an adaptive learning rate that implicitly scales as the square root of the entropy of the current predictive distribution, rather than the entropy of the initial predictive distribution.
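To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' algorithm, of Hedge (exponential weights) under two learning-rate rules. The deterministic rule sqrt(ln N / t) is one common textbook schedule. The entropic rule is a hypothetical illustration of a rate that scales as the square root of the entropy of the current predictive distribution. The function names, the constant c, the additive offset, and the bisection solver are all assumptions introduced here and are not taken from the paper.

```python
import math


def hedge_weights(cum_losses, eta):
    """Exponential weights: w_i is proportional to exp(-eta * L_i)."""
    m = min(cum_losses)  # shift losses so the exponent is never positive
    unnorm = [math.exp(-eta * (loss - m)) for loss in cum_losses]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def entropy(weights):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in weights if p > 0.0)


def deterministic_eta(n_experts, t):
    """A common deterministic schedule; depends only on N and t."""
    return math.sqrt(math.log(n_experts) / t)


def entropic_eta(cum_losses, t, c=1.0, offset=1.0, iters=60):
    """Hypothetical rate solving eta = c * sqrt((H(w(eta)) + offset) / t).

    H(w(eta)) is non-increasing in eta, so the right-hand side is
    non-increasing and the equation has a unique root, found by bisection.
    The constants c and offset are illustrative assumptions.
    """
    n = len(cum_losses)

    def target(eta):
        h = entropy(hedge_weights(cum_losses, eta))
        return c * math.sqrt((h + offset) / t)

    # The root lies between 0 and the value attained at maximal entropy ln N.
    lo, hi = 0.0, c * math.sqrt((math.log(n) + offset) / t)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid < target(mid):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    t = 4
    for losses in ([0.0, 0.0, 0.0, 0.0], [0.0, 10.0, 10.0, 10.0]):
        eta = entropic_eta(losses, t)
        print(losses, deterministic_eta(len(losses), t), eta,
              hedge_weights(losses, eta))
```

In this sketch, when all experts have equal cumulative loss the predictive distribution is uniform and the entropic rate sits at its largest value; once one expert pulls ahead, the entropy of the weights drops and the rate shrinks with it, whereas the deterministic schedule ignores the observed losses entirely.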

