Best Arm Identification in Graphical Bilinear Bandits - 专知 (Zhuanzhi) Papers


Bandits · Learner · Arm · Graph · Entity

February 12, 2021

Best Arm Identification in Graphical Bilinear Bandits


Geovani Rizk, Albert Thomas, Igor Colin, Rida Laraki, Yann Chevaleyre

We introduce a new graphical bilinear bandit problem where a learner (or a central entity) allocates arms to the nodes of a graph and observes, for each edge, a noisy bilinear reward representing the interaction between the two end nodes. We study the best arm identification problem, in which the learner wants to find the graph allocation that maximizes the sum of the bilinear rewards. By efficiently exploiting the geometry of this bandit problem, we propose a decentralized allocation strategy based on random sampling with theoretical guarantees. In particular, we characterize the influence of the graph structure (e.g., star, complete, or circle) on the convergence rate and propose empirical experiments that confirm this dependency.
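To make the setting in the abstract concrete, here is a minimal sketch of the problem itself, not of the authors' decentralized strategy. It assumes that the expected reward on an edge (i, j) has the form x_i^T M x_j for an unknown matrix M, that edges are undirected and counted once, and that noise is Gaussian; the helper names (`make_graph`, `bilinear`, `expected_total`, `observe_edge`, `best_allocation`) and the toy arm set are all hypothetical. The brute-force search only shows what "best allocation" means on a tiny instance where M is known.

```python
import itertools
import random

def make_graph(kind, n):
    """Edge list of an undirected graph on nodes 0..n-1 (intended for n >= 3)."""
    if kind == "star":        # node 0 is the hub
        return [(0, i) for i in range(1, n)]
    if kind == "complete":
        return list(itertools.combinations(range(n), 2))
    if kind == "circle":
        return [(i, (i + 1) % n) for i in range(n)]
    raise ValueError(f"unknown graph kind: {kind}")

def bilinear(x, M, y):
    """Return x^T M y for plain Python lists (assumed reward form)."""
    return sum(x[a] * M[a][b] * y[b]
               for a in range(len(x)) for b in range(len(y)))

def expected_total(alloc, edges, arms, M):
    """Sum of expected edge rewards when node i plays arms[alloc[i]]."""
    return sum(bilinear(arms[alloc[i]], M, arms[alloc[j]]) for i, j in edges)

def observe_edge(alloc, edge, arms, M, sigma=0.1, rng=random):
    """One noisy reward for a single edge (Gaussian noise is an assumption)."""
    i, j = edge
    return bilinear(arms[alloc[i]], M, arms[alloc[j]]) + rng.gauss(0.0, sigma)

def best_allocation(edges, n, arms, M):
    """Brute-force search over all len(arms)**n allocations (tiny instances only)."""
    candidates = itertools.product(range(len(arms)), repeat=n)
    best = max(candidates, key=lambda a: expected_total(a, edges, arms, M))
    return best, expected_total(best, edges, arms, M)

if __name__ == "__main__":
    # Toy instance: two arms (standard basis vectors) and a diagonal M.
    arms = [[1.0, 0.0], [0.0, 1.0]]
    M = [[1.0, 0.0], [0.0, 2.0]]
    for kind in ("star", "complete", "circle"):
        edges = make_graph(kind, 4)
        print(kind, len(edges), best_allocation(edges, 4, arms, M))
```

In an actual identification problem M is unknown, so the learner would have to estimate it from calls like `observe_edge` before choosing an allocation. The number of edges differs across the three graph types, which gives a first intuition for why the graph structure can affect how quickly the best allocation is found.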

