Impact of Representation Learning in Linear Bandits - Zhuanzhi (专知) Papers


Multi-armed bandits · Representation learning · Linear · Knowledge neurons · Learning

May 5, 2021

Impact of Representation Learning in Linear Bandits


Jiaqi Yang, Wei Hu, Jason D. Lee, Simon S. Du

From arXiv, 25 pages, 6 figures

We study how representation learning can improve the efficiency of bandit problems. We consider the setting where we play $T$ linear bandits of dimension $d$ concurrently, and these $T$ bandit tasks share a common $k$-dimensional linear representation with $k \ll d$. For the finite-action setting, we present a new algorithm that achieves $\widetilde{O}(T\sqrt{kN} + \sqrt{dkNT})$ regret, where $N$ is the number of rounds we play for each bandit. When $T$ is sufficiently large, our algorithm significantly outperforms the naive algorithm (playing $T$ bandits independently), which achieves $\widetilde{O}(T\sqrt{dN})$ regret. We also provide an $\Omega(T\sqrt{kN} + \sqrt{dkNT})$ regret lower bound, showing that our algorithm is minimax-optimal up to poly-logarithmic factors. Furthermore, we extend our algorithm to the infinite-action setting and obtain a corresponding regret bound that demonstrates the benefit of representation learning in certain regimes. We also present experiments on synthetic and real-world data to illustrate our theoretical findings and demonstrate the effectiveness of our proposed algorithms.
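To make the comparison between the two regret rates concrete, the following is a minimal Python sketch that evaluates both expressions from the abstract with constants and poly-logarithmic factors dropped. It is not the paper's algorithm; the function names and the numeric values of `d`, `k`, `N`, and `T` are hypothetical and chosen only for illustration. Dividing the shared-representation rate by the naive rate gives $\sqrt{k/d} + \sqrt{k/T}$, so the ratio is small only when both $k \ll d$ and $k \ll T$.

```python
import math


def shared_rep_bound(T, N, d, k):
    """T*sqrt(k*N) + sqrt(d*k*N*T), ignoring constants and log factors."""
    return T * math.sqrt(k * N) + math.sqrt(d * k * N * T)


def naive_bound(T, N, d):
    """T*sqrt(d*N): the rate for playing the T bandits independently."""
    return T * math.sqrt(d * N)


def bound_ratio(T, N, d, k):
    """Shared-representation rate divided by the naive rate.

    Algebraically this equals sqrt(k/d) + sqrt(k/T); N cancels out.
    """
    return shared_rep_bound(T, N, d, k) / naive_bound(T, N, d)


if __name__ == "__main__":
    # Hypothetical problem sizes, chosen only for illustration.
    d, k, N = 100, 4, 10_000
    for T in (1, 10, 100, 1000):
        print(f"T={T:5d}  ratio={bound_ratio(T, N, d, k):.3f}")
```

Under these assumed values the ratio exceeds 1 for a single task, where sharing a representation cannot help, and falls toward $\sqrt{k/d}$ as $T$ grows, which matches the abstract's statement that the gain appears when $T$ is sufficiently large.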


