Multi-armed bandit (MAB) algorithms are efficient approaches to reducing the opportunity cost of online experimentation and are used by companies to find the best product from periodically refreshed product catalogs. However, these algorithms face the so-called cold-start problem at the onset of an experiment: lacking knowledge of customer preferences for new products, they require an initial data collection phase known as the burning period. During this period, MAB algorithms operate like randomized experiments, incurring large burning costs that scale with the number of products. We attempt to reduce the burning cost by observing that many products can be cast as two-sided products, so that their rewards are naturally modeled by a matrix whose rows and columns represent the two sides, respectively. We then design two-phase bandit algorithms that first use subsampling and low-rank matrix estimation to obtain a substantially smaller targeted set of products, and then apply a UCB procedure on the targeted set to find the best product. We theoretically show that the proposed algorithms lower costs and expedite the experiment when experimentation time is limited and the product set is large. Our analysis also reveals three regimes of long, short, and ultra-short horizon experiments, depending on the dimensions of the reward matrix. Empirical evidence from both synthetic data and a real-world dataset on music streaming services validates this superior performance.
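For illustration, the two-phase idea could be sketched as follows. This is a minimal sketch under assumed details, not the authors' implementation: all names (pull, rank, n_samples, target_size) are hypothetical, Phase 1 here denoises the subsampled reward matrix with a truncated SVD as a stand-in for the paper's low-rank estimator, and Phase 2 runs standard UCB1 on the targeted set.

# Minimal sketch of a two-phase bandit over a two-sided product matrix.
# Phase 1: subsample entries, estimate the matrix at rank r, keep top entries.
# Phase 2: run UCB1 on the targeted entries. Names are hypothetical.
import numpy as np

def phase1_target_set(pull, n_rows, n_cols, rank, n_samples, target_size, rng):
    """Subsample matrix entries and return a small targeted set of products."""
    counts = np.zeros((n_rows, n_cols))
    sums = np.zeros((n_rows, n_cols))
    for _ in range(n_samples):
        i, j = rng.integers(n_rows), rng.integers(n_cols)
        sums[i, j] += pull(i, j)          # noisy reward for product (i, j)
        counts[i, j] += 1
    means = np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)
    # Low-rank denoising: keep only the top `rank` singular directions.
    U, s, Vt = np.linalg.svd(means, full_matrices=False)
    est = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    # Targeted set: indices of the largest estimated rewards.
    flat = np.argsort(est, axis=None)[-target_size:]
    return [np.unravel_index(k, est.shape) for k in flat]

def phase2_ucb(pull, arms, horizon):
    """Standard UCB1 over the targeted arms; returns the most-pulled arm."""
    n = np.ones(len(arms))
    mu = np.array([pull(*a) for a in arms], dtype=float)  # one pull per arm
    for t in range(len(arms), horizon):
        k = int(np.argmax(mu + np.sqrt(2 * np.log(t + 1) / n)))
        mu[k] += (pull(*arms[k]) - mu[k]) / (n[k] + 1)    # incremental mean
        n[k] += 1
    return arms[int(np.argmax(n))]

# Usage on a synthetic rank-1 reward matrix with Gaussian noise:
rng = np.random.default_rng(0)
true = np.outer(rng.random(20), rng.random(30))          # 20 x 30 mean rewards
pull = lambda i, j: true[i, j] + rng.normal(0, 0.1)      # noisy bandit feedback
targets = phase1_target_set(pull, 20, 30, rank=1, n_samples=3000,
                            target_size=10, rng=rng)
best = phase2_ucb(pull, targets, horizon=2000)

The split of the budget between the two phases is the key design choice; the three horizon regimes in the analysis correspond to how much of the total experimentation time such a Phase 1 can afford relative to the matrix dimensions.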