斯托卡斯内地线性土匪实验设计设计 (Design of Experiments for Stochastic Contextual Linear Bandits) - 专知论文

会员服务 ·

0

赌博机/老虎机 · 线性的 · 上下文赌博机/上下文老虎机 · 随机性策略 · CASES ·

2021 年 7 月 22 日

Design of Experiments for Stochastic Contextual Linear Bandits

翻译：斯托卡斯内地线性土匪实验设计设计

Andrea Zanette,Kefan Dong,Jonathan Lee,Emma Brunskill

from arxiv, Initial submission

In the stochastic linear contextual bandit setting there exist several minimax procedures for exploration with policies that are reactive to the data being acquired. In practice, there can be a significant engineering overhead to deploy these algorithms, especially when the dataset is collected in a distributed fashion or when a human in the loop is needed to implement a different policy. Exploring with a single non-reactive policy is beneficial in such cases. Assuming some batch contexts are available, we design a single stochastic policy to collect a good dataset from which a near-optimal policy can be extracted. We present a theoretical analysis as well as numerical experiments on both synthetic and real-world datasets.

翻译：在随机线性线性背景土匪设置中,有若干小型勘探程序,其政策对获取的数据具有反应作用,在实践中,可能存在大量的工程间接费用来部署这些算法,特别是当数据集是以分布式方式收集的,或者当需要某个人在循环中执行不同政策时。在这种情况下,采用单一的非反应政策进行探索是有益的。假设存在某些批量情况,我们设计一个单一的随机政策来收集好数据集,从中提取接近最佳的政策。我们提出了关于合成和真实世界数据集的理论分析以及数字实验。

0

相关内容

赌博机/老虎机

赌博机/老虎机

计算机理论顶会STOC 2021奖项出炉，滕尚华等华人学者获奖

专知会员服务

8+阅读 · 2021年7月22日

【DeepMind】基于模型的强化学习，174页ppt，Model-Based Reinforcement Learning

【DeepMind】基于模型的强化学习，174页ppt，Model-Based Reinforcement Learning

专知会员服务

89+阅读 · 2021年1月12日

不可错过！UIUC最新《统计强化学习》课程！

专知会员服务

54+阅读 · 2020年9月7日

【课程推荐】人工智能导论：Introduction to Articial Intelligence

【课程推荐】人工智能导论：Introduction to Articial Intelligence

专知会员服务

103+阅读 · 2019年12月20日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

LibRec 精选：AutoML for Contextual Bandits

LibRec 精选：AutoML for Contextual Bandits

LibRec智能推荐

7+阅读 · 2019年9月19日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

RL 真经

CreateAMind

5+阅读 · 2018年12月28日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

carla 学习笔记

carla 学习笔记

CreateAMind

9+阅读 · 2018年2月7日

Acceleration based PSO for Multi-UAV Source-Seeking

Acceleration based PSO for Multi-UAV Source-Seeking

Arxiv

0+阅读 · 2021年9月23日

The Seven-League Scheme: Deep learning for large time step Monte Carlo simulations of stochastic differential equations

The Seven-League Scheme: Deep learning for large time step Monte Carlo simulations of stochastic differential equations

Arxiv

0+阅读 · 2021年9月23日

Temporal Scale Estimation for Oversampled Network Cascades: Theory, Algorithms, and Experiment

Arxiv

0+阅读 · 2021年9月22日

Minimax Rates and Adaptivity in Combining Experimental and Observational Data

Arxiv

0+阅读 · 2021年9月22日

Online Deterministic Annealing for Classification and Clustering

Arxiv

0+阅读 · 2021年9月21日

Policy Gradient Bayesian Robust Optimization for Imitation Learning

Arxiv

5+阅读 · 2021年6月11日

Hyperparameter Selection for Imitation Learning

Arxiv

7+阅读 · 2021年5月25日

Hierarchical Adaptive Contextual Bandits for Resource Constraint based Recommendation

Arxiv

5+阅读 · 2020年4月2日

Accelerated Methods for Deep Reinforcement Learning

Accelerated Methods for Deep Reinforcement Learning

Arxiv

6+阅读 · 2019年1月10日

Learning to Adapt: Meta-Learning for Model-Based Control

Arxiv

9+阅读 · 2018年3月30日

VIP会员

文章信息

相关主题

赌博机/老虎机

上下文赌博机/上下文老虎机

随机性策略

相关VIP内容

计算机理论顶会STOC 2021奖项出炉，滕尚华等华人学者获奖

专知会员服务

8+阅读 · 2021年7月22日

【DeepMind】基于模型的强化学习，174页ppt，Model-Based Reinforcement Learning

【DeepMind】基于模型的强化学习，174页ppt，Model-Based Reinforcement Learning

专知会员服务

89+阅读 · 2021年1月12日

不可错过！UIUC最新《统计强化学习》课程！

专知会员服务

54+阅读 · 2020年9月7日

【课程推荐】人工智能导论：Introduction to Articial Intelligence

【课程推荐】人工智能导论：Introduction to Articial Intelligence

专知会员服务

103+阅读 · 2019年12月20日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【NeurIPS 2025】稳定电影度量：面向专业视频生成的结构化分类与评测体系

战场AI决策支持系统

【博士论文】面向排序与扩散模型的安全、高效与鲁棒强化学习

面向 AI 生成图像的安全与鲁棒水印：全面综述

相关资讯

LibRec 精选：AutoML for Contextual Bandits

LibRec 精选：AutoML for Contextual Bandits

LibRec智能推荐

7+阅读 · 2019年9月19日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

RL 真经

CreateAMind

5+阅读 · 2018年12月28日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

carla 学习笔记

carla 学习笔记

CreateAMind

9+阅读 · 2018年2月7日

相关论文

Acceleration based PSO for Multi-UAV Source-Seeking

Acceleration based PSO for Multi-UAV Source-Seeking

Arxiv

0+阅读 · 2021年9月23日

The Seven-League Scheme: Deep learning for large time step Monte Carlo simulations of stochastic differential equations

The Seven-League Scheme: Deep learning for large time step Monte Carlo simulations of stochastic differential equations

Arxiv

0+阅读 · 2021年9月23日

Temporal Scale Estimation for Oversampled Network Cascades: Theory, Algorithms, and Experiment

Arxiv

0+阅读 · 2021年9月22日

Minimax Rates and Adaptivity in Combining Experimental and Observational Data

Arxiv

0+阅读 · 2021年9月22日

Online Deterministic Annealing for Classification and Clustering

Arxiv

0+阅读 · 2021年9月21日

Policy Gradient Bayesian Robust Optimization for Imitation Learning

Arxiv

5+阅读 · 2021年6月11日

Hyperparameter Selection for Imitation Learning

Arxiv

7+阅读 · 2021年5月25日

Hierarchical Adaptive Contextual Bandits for Resource Constraint based Recommendation

Arxiv

5+阅读 · 2020年4月2日

Accelerated Methods for Deep Reinforcement Learning

Accelerated Methods for Deep Reinforcement Learning

Arxiv

6+阅读 · 2019年1月10日

Learning to Adapt: Meta-Learning for Model-Based Control

Arxiv

9+阅读 · 2018年3月30日

微信扫码咨询专知VIP会员