高能多势低维多维线性线性强盗 (Multi-Agent Low-Dimensional Linear Bandits) - 专知论文

会员服务 ·

0

子空间 · 赌博机/老虎机 · 线性的 · 可约的 · 向量化 ·

2021 年 5 月 16 日

Multi-Agent Low-Dimensional Linear Bandits

翻译：高能多势低维多维线性线性强盗

Ronshee Chawla,Abishek Sankararaman,Sanjay Shakkottai

We study a multi-agent stochastic linear bandit with side information, parameterized by an unknown vector $\theta^* \in \mathbb{R}^d$. The side information consists of a finite collection of low-dimensional subspaces, one of which contains $\theta^*$. In our setting, agents can collaborate to reduce regret by sending recommendations across a communication graph connecting them. We present a novel decentralized algorithm, where agents communicate subspace indices with each other and each agent plays a projected variant of LinUCB on the corresponding (low-dimensional) subspace. By distributing the search for the optimal subspace across users and learning of the unknown vector by each agent in the corresponding low-dimensional subspace, we show that the per-agent finite-time regret is much smaller than the case when agents do not communicate. We finally complement these results through simulations.

翻译：我们用一个未知矢量 $\theta\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\可以\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\

0

相关内容

子空间

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

【Google】梯度下降，48页ppt

【Google】梯度下降，48页ppt

专知会员服务

81+阅读 · 2020年12月5日

【DeepMind】强化学习教程，83页ppt

【DeepMind】强化学习教程，83页ppt

专知会员服务

158+阅读 · 2020年8月7日

迁移学习简明教程，11页ppt

迁移学习简明教程，11页ppt

专知会员服务

108+阅读 · 2020年8月4日

【ACL 2020】低维双曲知识图谱嵌入，Low-Dimensional Hyperbolic Knowledge Graph Embeddings

【ACL 2020】低维双曲知识图谱嵌入，Low-Dimensional Hyperbolic Knowledge Graph Embeddings

专知会员服务

77+阅读 · 2020年6月14日

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

专知会员服务

112+阅读 · 2020年5月15日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

253+阅读 · 2020年4月19日

【AAAI2020教程】强化学习中的Exploration-Exploitation in Reinforcement Learning

专知会员服务

101+阅读 · 2020年2月8日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

顶会论文 || 65篇"IJCAI"深度强化学习论文汇总

顶会论文 || 65篇"IJCAI"深度强化学习论文汇总

深度强化学习实验室

3+阅读 · 2020年3月15日

LibRec 精选：AutoML for Contextual Bandits

LibRec 精选：AutoML for Contextual Bandits

LibRec智能推荐

7+阅读 · 2019年9月19日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

动物脑的好奇心和强化学习的好奇心

动物脑的好奇心和强化学习的好奇心

CreateAMind

10+阅读 · 2019年1月26日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

已删除

雪球

6+阅读 · 2018年8月19日

LibRec 精选：连通知识图谱与推荐系统

LibRec 精选：连通知识图谱与推荐系统

LibRec智能推荐

3+阅读 · 2018年8月9日

笔记 | Sentiment Analysis

笔记 | Sentiment Analysis

黑龙江大学自然语言处理实验室

10+阅读 · 2018年5月6日

机器学习线性代数速查

机器学习线性代数速查

机器学习研究会

19+阅读 · 2018年2月25日

Approximate Multi-Agent Fitted Q Iteration

Arxiv

0+阅读 · 2021年7月7日

Episodic Bandits with Stochastic Experts

Episodic Bandits with Stochastic Experts

Arxiv

0+阅读 · 2021年7月7日

Neural Contextual Bandits without Regret

Arxiv

1+阅读 · 2021年7月7日

Dueling Bandits with Team Comparisons

Arxiv

0+阅读 · 2021年7月6日

A Short Note on the Relationship of Information Gain and Eluder Dimension

Arxiv

0+阅读 · 2021年7月6日

Near-optimal inference in adaptive linear regression

Arxiv

0+阅读 · 2021年7月5日

Multidimensional Byzantine Agreement in a Synchronous Setting

Arxiv

0+阅读 · 2021年7月5日

Low-Dimensional State and Action Representation Learning with MDP Homomorphism Metrics

Arxiv

0+阅读 · 2021年7月4日

On Tightness of the Tsaknakis-Spirakis Algorithm for Approximate Nash Equilibrium

Arxiv

0+阅读 · 2021年7月3日

Exploration-Exploitation in Multi-Agent Learning: Catastrophe Theory Meets Game Theory

Exploration-Exploitation in Multi-Agent Learning: Catastrophe Theory Meets Game Theory

Arxiv

15+阅读 · 2020年12月15日

VIP会员

文章信息

相关主题

赌博机/老虎机

相关VIP内容

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

【Google】梯度下降，48页ppt

【Google】梯度下降，48页ppt

专知会员服务

81+阅读 · 2020年12月5日

【DeepMind】强化学习教程，83页ppt

【DeepMind】强化学习教程，83页ppt

专知会员服务

158+阅读 · 2020年8月7日

迁移学习简明教程，11页ppt

迁移学习简明教程，11页ppt

专知会员服务

108+阅读 · 2020年8月4日

【ACL 2020】低维双曲知识图谱嵌入，Low-Dimensional Hyperbolic Knowledge Graph Embeddings

【ACL 2020】低维双曲知识图谱嵌入，Low-Dimensional Hyperbolic Knowledge Graph Embeddings

专知会员服务

77+阅读 · 2020年6月14日

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

专知会员服务

112+阅读 · 2020年5月15日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

253+阅读 · 2020年4月19日

【AAAI2020教程】强化学习中的Exploration-Exploitation in Reinforcement Learning

专知会员服务

101+阅读 · 2020年2月8日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

热门VIP内容

开通专知VIP会员享更多权益服务

【博士论文】在低维和高维空间中分析、建模和转换潜在表征

从无人机到数据：揭示边缘计算作为新作战域

可解释人工智能的基础

大规模视觉模型中的基于提示的适应：综述

相关资讯

顶会论文 || 65篇"IJCAI"深度强化学习论文汇总

顶会论文 || 65篇"IJCAI"深度强化学习论文汇总

深度强化学习实验室

3+阅读 · 2020年3月15日

LibRec 精选：AutoML for Contextual Bandits

LibRec 精选：AutoML for Contextual Bandits

LibRec智能推荐

7+阅读 · 2019年9月19日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

动物脑的好奇心和强化学习的好奇心

动物脑的好奇心和强化学习的好奇心

CreateAMind

10+阅读 · 2019年1月26日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

已删除

雪球

6+阅读 · 2018年8月19日

LibRec 精选：连通知识图谱与推荐系统

LibRec 精选：连通知识图谱与推荐系统

LibRec智能推荐

3+阅读 · 2018年8月9日

笔记 | Sentiment Analysis

笔记 | Sentiment Analysis

黑龙江大学自然语言处理实验室

10+阅读 · 2018年5月6日

机器学习线性代数速查

机器学习线性代数速查

机器学习研究会

19+阅读 · 2018年2月25日

相关论文

Approximate Multi-Agent Fitted Q Iteration

Arxiv

0+阅读 · 2021年7月7日

Episodic Bandits with Stochastic Experts

Episodic Bandits with Stochastic Experts

Arxiv

0+阅读 · 2021年7月7日

Neural Contextual Bandits without Regret

Arxiv

1+阅读 · 2021年7月7日

Dueling Bandits with Team Comparisons

Arxiv

0+阅读 · 2021年7月6日

A Short Note on the Relationship of Information Gain and Eluder Dimension

Arxiv

0+阅读 · 2021年7月6日

Near-optimal inference in adaptive linear regression

Arxiv

0+阅读 · 2021年7月5日

Multidimensional Byzantine Agreement in a Synchronous Setting

Arxiv

0+阅读 · 2021年7月5日

Low-Dimensional State and Action Representation Learning with MDP Homomorphism Metrics

Arxiv

0+阅读 · 2021年7月4日

On Tightness of the Tsaknakis-Spirakis Algorithm for Approximate Nash Equilibrium

Arxiv

0+阅读 · 2021年7月3日

Exploration-Exploitation in Multi-Agent Learning: Catastrophe Theory Meets Game Theory

Exploration-Exploitation in Multi-Agent Learning: Catastrophe Theory Meets Game Theory

Arxiv

15+阅读 · 2020年12月15日

微信扫码咨询专知VIP会员