We propose a new diffusion-asymptotic analysis for sequentially randomized experiments, including those that arise in solving multi-armed bandit problems. In an experiment with $n$ time steps, we let the mean reward gaps between actions scale to the order $1/\sqrt{n}$ so as to preserve the difficulty of the learning task as $n$ grows. In this regime, we show that the behavior of a class of sequentially randomized Markov experiments converges to a diffusion limit, given as the solution of a stochastic differential equation. The diffusion limit thus enables us to derive refined, instance-specific characterizations of the stochastic dynamics of adaptive experiments. As an application of this framework, we use the diffusion limit to obtain several new insights into the regret and belief evolution of Thompson sampling. We show that a version of Thompson sampling with an asymptotically uninformative prior variance achieves near-optimal, instance-specific regret scaling when the reward gaps are relatively large. We also demonstrate that, in this regime, the posterior beliefs underlying Thompson sampling are highly unstable over time.
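To make the scaling regime concrete, the following is a minimal illustrative sketch (not the paper's implementation) of Gaussian Thompson sampling on a two-armed bandit whose mean reward gap shrinks as $c/\sqrt{n}$ with the horizon $n$. The specific constants, the fixed prior variance, and the function name are assumptions made purely for illustration.

```python
# Illustrative sketch, assuming a two-armed Gaussian bandit with known noise
# variance and a conjugate Gaussian prior on each arm's mean. The gap between
# the arms is scaled to c / sqrt(n), matching the diffusion-asymptotic regime
# described above. Parameter choices are hypothetical, not from the paper.
import numpy as np


def thompson_sampling_diffusion_regime(n=10_000, c=2.0, noise_sd=1.0,
                                       prior_var=10.0, seed=0):
    rng = np.random.default_rng(seed)
    gap = c / np.sqrt(n)              # mean reward gap of order 1/sqrt(n)
    means = np.array([gap, 0.0])      # arm 0 is better by `gap`

    reward_sum = np.zeros(2)          # per-arm sufficient statistics
    pulls = np.zeros(2)
    regret = 0.0

    for _ in range(n):
        # Conjugate posterior for each arm's mean under a N(0, prior_var)
        # prior and known noise variance noise_sd**2.
        post_prec = 1.0 / prior_var + pulls / noise_sd**2
        post_var = 1.0 / post_prec
        post_mean = post_var * (reward_sum / noise_sd**2)

        # Thompson step: sample a mean for each arm, play the argmax.
        draws = rng.normal(post_mean, np.sqrt(post_var))
        arm = int(np.argmax(draws))

        reward = rng.normal(means[arm], noise_sd)
        reward_sum[arm] += reward
        pulls[arm] += 1
        regret += means.max() - means[arm]

    return regret, pulls


if __name__ == "__main__":
    regret, pulls = thompson_sampling_diffusion_regime()
    print(f"cumulative regret: {regret:.3f}, arm pulls: {pulls}")
```

Because the gap is only of order $1/\sqrt{n}$, the per-step regret is small but the identification problem stays hard as $n$ grows, which is the intuition behind keeping the learning task's difficulty fixed in the diffusion limit.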