The model selection problem in the pure exploration linear bandit setting is introduced and studied in both the fixed confidence and fixed budget settings. The model selection problem considers a nested sequence of hypothesis classes of increasing complexity. Our goal is to automatically adapt to the instance-dependent complexity measure of the smallest hypothesis class containing the true model, rather than suffering the complexity measure associated with the largest hypothesis class. We provide evidence that a standard doubling trick over the dimension fails to achieve the optimal instance-dependent sample complexity. Our algorithms define a new optimization problem based on experimental design that leverages the geometry of the action set to efficiently identify a near-optimal hypothesis class. Our fixed budget algorithm uses a novel application of a selection-validation trick in bandits, which also yields a new method for the understudied fixed budget setting in linear bandits, even without the added challenge of model selection. We further generalize the model selection problem to the misspecified regime, adapting our algorithms in both the fixed confidence and fixed budget settings.
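To make the experimental-design ingredient concrete, the following is a minimal illustrative sketch, not the paper's algorithm: it computes a classical G-optimal design over a finite action set via Frank-Wolfe iterations, the standard way such geometry-aware allocations are approximated in linear bandits. The function name `g_optimal_design` and all parameters are hypothetical choices for this example.

```python
import numpy as np

def g_optimal_design(X, n_iters=1000):
    """Frank-Wolfe iterations for the G-optimal (equivalently D-optimal) design:
    maximize log det(sum_i lam_i x_i x_i^T) over the probability simplex.
    X: (K, d) array of actions; returns design weights lam of shape (K,)."""
    K, d = X.shape
    lam = np.full(K, 1.0 / K)            # start from the uniform design
    for t in range(n_iters):
        A = X.T @ (lam[:, None] * X)     # information matrix A(lam)
        A_inv = np.linalg.pinv(A)        # pseudo-inverse guards against rank deficiency
        # leverage scores ||x_i||^2_{A(lam)^{-1}}, the per-action prediction variances
        g = np.einsum('ij,jk,ik->i', X, A_inv, X)
        i = int(np.argmax(g))            # Frank-Wolfe vertex: action with largest variance
        gamma = 2.0 / (t + 2)            # standard Frank-Wolfe step size
        lam = (1 - gamma) * lam
        lam[i] += gamma
    return lam

# Example usage on a random action set in R^4:
X = np.random.randn(20, 4)
lam = g_optimal_design(X)
# By the Kiefer-Wolfowitz equivalence theorem, the max leverage score of an
# optimal design equals d, so the value below should be close to 4.
print(np.max(np.einsum('ij,jk,ik->i', X, np.linalg.pinv(X.T @ (lam[:, None] * X)), X)))
```

Sampling actions in proportion to such a design controls estimation variance uniformly over the action set, which is the basic role experimental design plays in pure exploration linear bandits; the paper's contribution is a new design-based optimization problem tailored to identifying a near-optimal hypothesis class.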