Dominate or Delete: Decentralized Competing Bandits in Serial Dictatorship - 专知 (Zhuanzhi) Papers


March 12, 2021


Abishek Sankararaman, Soumya Basu, Karthik Abinav Sankararaman

From arXiv; AISTATS 2021

Online learning in a two-sided matching market, where demand-side agents continuously compete to be matched with the supply side (arms), abstracts the complex interactions under partial information on matching platforms (e.g., UpWork, TaskRabbit). We study the decentralized serial dictatorship setting, a two-sided matching market in which the demand-side agents have unknown and heterogeneous valuations over the supply side (arms), while the arms have a known, uniform preference over the demand side (agents). We design the first decentralized algorithm for the agents, UCB with Decentralized Dominant-arm Deletion (UCB-D3), which requires no knowledge of the reward gaps or the time horizon. UCB-D3 works in phases: in each phase, agents delete the dominated arms, that is, the arms preferred by higher-ranked agents, and play only from the non-dominated arms according to UCB. At the end of each phase, agents broadcast their estimated preferred arms, in a decentralized fashion, through pure exploitation. We prove both a new regret lower bound for the decentralized serial dictatorship model and that UCB-D3 is order optimal.
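The phase structure described in the abstract can be illustrated with a small simulation. This is only a rough sketch under stated assumptions, not the authors' implementation: the function names (`ucb_index`, `run_ucb_d3_sketch`), the Bernoulli rewards, the standard UCB1 index, the doubling phase lengths, and the choice of the most-played arm as the "estimated preferred arm" are all assumptions made here. The end-of-phase broadcast is also simplified to a direct announcement, whereas the abstract says it is carried out through pure exploitation.

```python
import math
import random


def ucb_index(mean, pulls, t):
    """Standard UCB1 index; unpulled arms get +inf so they are tried first."""
    if pulls == 0:
        return math.inf
    return mean + math.sqrt(2.0 * math.log(t) / pulls)


def run_ucb_d3_sketch(mu, num_phases=10, seed=0):
    """Simplified phased play with dominated-arm deletion.

    mu[j][k] is the (hypothetical) mean Bernoulli reward of arm k for agent j.
    Agent 0 is ranked highest by every arm. Assumes at least as many arms as
    agents. Returns the arm each agent announces after the final phase.
    """
    rng = random.Random(seed)
    n_agents, n_arms = len(mu), len(mu[0])
    pulls = [[0] * n_arms for _ in range(n_agents)]
    means = [[0.0] * n_arms for _ in range(n_agents)]
    announced = [None] * n_agents  # arms announced at the end of the last phase
    t = 0
    for phase in range(1, num_phases + 1):
        # Deletion step: agent j drops arms announced by higher-ranked agents.
        active = []
        for j in range(n_agents):
            dominated = {a for a in announced[:j] if a is not None}
            active.append([k for k in range(n_arms) if k not in dominated])
        for _ in range(2 ** phase):  # assumed doubling phase length
            t += 1
            taken = set()
            # Visiting agents in rank order resolves collisions in favour
            # of the higher-ranked agent.
            for j in range(n_agents):
                k = max(active[j],
                        key=lambda a: ucb_index(means[j][a], pulls[j][a], t))
                if k in taken:
                    continue  # blocked by a higher-ranked agent: no reward, no update
                taken.add(k)
                reward = 1.0 if rng.random() < mu[j][k] else 0.0
                pulls[j][k] += 1
                means[j][k] += (reward - means[j][k]) / pulls[j][k]
        # Simplified broadcast: each agent announces its most-played active arm.
        for j in range(n_agents):
            announced[j] = max(active[j], key=lambda a: pulls[j][a])
    return announced


if __name__ == "__main__":
    # Both agents like arm 0 best; agent 1 should settle on its second choice.
    example_mu = [[0.9, 0.1, 0.1],
                  [0.9, 0.8, 0.1]]
    print(run_ucb_d3_sketch(example_mu, num_phases=12, seed=0))
```

In this toy instance both agents prefer arm 0, so with well-separated means one would expect the higher-ranked agent to announce arm 0 and the lower-ranked agent to delete it and settle on its next-best arm. The deletion sets are recomputed from the latest announcements at the start of every phase, so an early wrong announcement does not permanently remove an arm.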


