Best Arm Identification in Graphical Bilinear Bandits - Zhuanzhi Papers


Bandits / slot machines · Learner · Arm · Graph · Entity

December 14, 2020

Best Arm Identification in Graphical Bilinear Bandits


Geovani Rizk, Albert Thomas, Igor Colin, Rida Laraki, Yann Chevaleyre

We introduce a new graphical bilinear bandit problem where a learner (or a central entity) allocates arms to the nodes of a graph and observes, for each edge, a noisy bilinear reward representing the interaction between its two end nodes. We study the best arm identification problem, in which the learner wants to find the graph allocation that maximizes the sum of the bilinear rewards. By efficiently exploiting the geometry of this bandit problem, we propose a somewhat decentralized allocation strategy based on random sampling, with theoretical guarantees. In particular, we characterize the influence of the graph structure (e.g., star, complete, or circle) on the convergence rate and present experiments that confirm this dependency.
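To make the setting in the abstract concrete, here is a minimal Python sketch of the reward model only. It is not the authors' allocation strategy, which relies on random sampling and is not reproduced here. The names `bilinear`, `total_reward`, `noisy_edge_reward`, and `best_allocation`, the matrix `M`, the arm vectors `ARMS`, the Gaussian noise level `sigma`, and the choice to count each undirected edge once are all illustrative assumptions that the abstract does not specify. The best allocation is found by brute-force enumeration, which is only feasible for tiny graphs.

```python
import itertools
import random

# Hypothetical problem instance: two arms in R^2 and a symmetric matrix M
# that pays 1 when the two end nodes of an edge pick different arms.
ARMS = [[1.0, 0.0], [0.0, 1.0]]
M = [[0.0, 1.0], [1.0, 0.0]]


def bilinear(x, matrix, y):
    """Return x^T * matrix * y for plain Python lists."""
    return sum(
        x[a] * matrix[a][b] * y[b]
        for a in range(len(x))
        for b in range(len(y))
    )


def noisy_edge_reward(x, y, matrix, rng, sigma=0.1):
    """Noisy bilinear reward for one edge (Gaussian noise is an assumption)."""
    return bilinear(x, matrix, y) + rng.gauss(0.0, sigma)


def total_reward(allocation, edges, arms, matrix):
    """Noiseless sum of bilinear rewards over all edges of the graph."""
    return sum(
        bilinear(arms[allocation[i]], matrix, arms[allocation[j]])
        for i, j in edges
    )


def best_allocation(n_nodes, edges, arms, matrix):
    """Enumerate all len(arms)**n_nodes allocations and keep the best one."""
    candidates = itertools.product(range(len(arms)), repeat=n_nodes)
    return max(candidates, key=lambda a: total_reward(a, edges, arms, matrix))


def star_graph(n):
    """Node 0 is the center, nodes 1..n-1 are leaves."""
    return [(0, i) for i in range(1, n)]


def complete_graph(n):
    """Every pair of distinct nodes is connected."""
    return [(i, j) for i in range(n) for j in range(i + 1, n)]


def circle_graph(n):
    """Each node is connected to the next one, wrapping around."""
    return [(i, (i + 1) % n) for i in range(n)]


if __name__ == "__main__":
    graphs = [
        ("star", 4, star_graph(4)),
        ("complete", 3, complete_graph(3)),
        ("circle", 4, circle_graph(4)),
    ]
    for name, n, edges in graphs:
        alloc = best_allocation(n, edges, ARMS, M)
        print(name, alloc, total_reward(alloc, edges, ARMS, M))
```

With this toy `M`, the best total reward is 3 on the 4-node star (center and leaves pick different arms), 2 on the 3-node complete graph (two arms cannot make all three pairs differ), and 4 on the 4-node circle (alternating arms). This hints at why the graph structure matters, although it says nothing about the convergence rates studied in the paper.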



Related content

Bandits / slot machines

[ETH] Latest 2020 course "Geometric Data Analysis", with slides (PPT) download

Zhuanzhi Member Service

45+ reads · December 18, 2020

Latest survey paper on "Graph Embedding for Combinatorial Optimization", 40-page PDF

Zhuanzhi Member Service

35+ reads · September 7, 2020

Zero-Shot Learning for Text Classification

Zhuanzhi Member Service

97+ reads · May 31, 2020

Concise and to the point! Python tutorial handbook, 206-page PDF

Zhuanzhi Member Service

48+ reads · March 24, 2020

[New Uber AI paper] Continual meta-learning: Learning to Continually Learn

Zhuanzhi Member Service

37+ reads · February 27, 2020

Ten must-read deep learning survey papers for 2019->2020

Zhuanzhi Member Service

275+ reads · January 1, 2020

[WSDM 2020 paper] Initialization for Network Embedding: A Graph Partition Approach

Zhuanzhi Member Service

44+ reads · November 20, 2019

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Zhuanzhi Member Service

36+ reads · October 17, 2019

[New book] Python Programming Fundamentals, 669-page PDF

Zhuanzhi Member Service

197+ reads · October 10, 2019

[SIGGRAPH 2019] TensorFlow 2.0 deep learning applications in computer graphics

Zhuanzhi Member Service

41+ reads · October 9, 2019

LibRec Picks: AutoML for Contextual Bandits

LibRec Intelligent Recommendation

7+ reads · September 19, 2019

Hierarchically Structured Meta-learning

CreateAMind

27+ reads · May 22, 2019

Transferring Knowledge across Learning Processes

CreateAMind

29+ reads · May 18, 2019

Full list of best papers at top CS conferences, 2012-2018

Deep Learning and NLP

6+ reads · February 1, 2019

Unsupervised Learning via Meta-Learning

CreateAMind

43+ reads · January 3, 2019

Meta-learning in 2017: MAML, SNAIL

CreateAMind

11+ reads · January 2, 2019

Hierarchical Disentangled Representations

CreateAMind

4+ reads · April 15, 2018

[Paper] A summary of variational inference

Machine Learning Research Society

39+ reads · November 16, 2017

[Paper] A survey of representation learning on graphs

Machine Learning Research Society

15+ reads · September 24, 2017

LibRec Algorithm of the Week: parameter-free contextual bandits (SIGIR'15)

LibRec Intelligent Recommendation

5+ reads · June 12, 2017

Fast Graphical Population Protocols

arXiv

0+ reads · February 17, 2021

The Reflectron: Exploiting geometry for learning generalized linear models

arXiv

0+ reads · February 16, 2021

Variational Deep Learning for the Identification and Reconstruction of Chaotic and Stochastic Dynamical Systems from Noisy and Partial Observations

arXiv

0+ reads · February 16, 2021

Inverse Reinforcement Learning in the Continuous Setting with Formal Guarantees

arXiv

0+ reads · February 16, 2021

Top-$k$ eXtreme Contextual Bandits with Arm Hierarchy

arXiv

0+ reads · February 15, 2021

Rebounding Bandits for Modeling Satiation Effects

arXiv

0+ reads · February 15, 2021

Full state approximation by Galerkin projection reduced order models for stochastic and bilinear systems

arXiv

0+ reads · February 15, 2021

Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization

arXiv

0+ reads · February 15, 2021

Pareto Optimal Model Selection in Linear Bandits

arXiv

0+ reads · February 12, 2021

Multi-Agent Multi-Armed Bandits with Limited Communication

arXiv

0+ reads · February 10, 2021
