Consider a bandit algorithm that recommends actions to self-interested users in a recommendation system. The users are free to choose other actions and need to be incentivized to follow the algorithm's recommendations. While the users prefer to exploit, the algorithm can incentivize them to explore by leveraging the information collected from previous users. All published work on this problem, known as incentivized exploration, focuses on small, unstructured action sets and mainly targets the case in which users' beliefs are independent across actions. However, realistic exploration problems often feature large, structured action sets and highly correlated beliefs. We focus on a paradigmatic exploration problem with structure: combinatorial semi-bandits. We prove that Thompson Sampling, when applied to combinatorial semi-bandits, is incentive-compatible when initialized with a sufficient number of samples of each arm (a number determined in advance by the Bayesian prior). Moreover, we design incentive-compatible algorithms for collecting the initial samples.
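To make the setting concrete, here is a minimal sketch (not the paper's construction) of Thompson Sampling on a toy combinatorial semi-bandit: the action set is "any K of n arms", arm rewards are Bernoulli with independent Beta priors, and the algorithm is warm-started with a fixed number of samples per arm. All names and parameters below (n_arms, K, n_init, horizon, true_means) are illustrative assumptions; in the paper, the required number of initial samples is derived from the Bayesian prior rather than fixed by hand.

```python
# Illustrative sketch only: Beta-Bernoulli Thompson Sampling for a toy
# combinatorial semi-bandit whose actions are all size-K subsets of n arms.
import numpy as np

rng = np.random.default_rng(0)

n_arms, K, n_init, horizon = 10, 3, 5, 1000          # hypothetical parameters
true_means = rng.uniform(0.2, 0.8, size=n_arms)       # unknown to the learner

# Beta(1, 1) priors on each arm; alpha counts successes, beta counts failures.
alpha = np.ones(n_arms)
beta = np.ones(n_arms)

# Warm start: collect n_init samples of each arm before making recommendations.
# (The paper determines this number from the prior; here it is a fixed constant.)
for arm in range(n_arms):
    for _ in range(n_init):
        r = rng.random() < true_means[arm]
        alpha[arm] += r
        beta[arm] += 1 - r

for t in range(horizon):
    # Sample a mean for every arm from its posterior...
    theta = rng.beta(alpha, beta)
    # ...and solve the combinatorial subproblem: here, pick the top-K arms.
    action = np.argpartition(theta, -K)[-K:]
    # Semi-bandit feedback: observe the reward of every arm in the chosen set.
    rewards = rng.random(K) < true_means[action]
    alpha[action] += rewards
    beta[action] += 1 - rewards
```

Semi-bandit feedback means the reward of every arm in the chosen set is observed, so each round updates K posteriors at once; this per-arm feedback is what the correlated-beliefs analysis in the abstract relies on.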