提高线性后勤模型和对线性强盗应用的信心 (Improved Confidence Bounds for the Linear Logistic Model and Applications to Linear Bandits) - 专知论文

会员服务 ·

0

置信度 · 赌博机/老虎机 · 线性的 · state-of-the-art · 对率损失 ·

2021 年 3 月 18 日

Improved Confidence Bounds for the Linear Logistic Model and Applications to Linear Bandits

翻译：提高线性后勤模型和对线性强盗应用的信心

Kwang-Sung Jun,Lalit Jain,Blake Mason,Houssam Nassif

We propose improved fixed-design confidence bounds for the linear logistic model. Our bounds significantly improve upon the state-of-the-art bound by Li et al. (2017) via recent developments of the self-concordant analysis of the logistic loss (Faury et al., 2020). Specifically, our confidence bound avoids a direct dependence on $1/\kappa$, where $\kappa$ is the minimal variance over all arms' reward distributions. In general, $1/\kappa$ scales exponentially with the norm of the unknown linear parameter $\theta^*$. Instead of relying on this worst-case quantity, our confidence bound for the reward of any given arm depends directly on the variance of that arm's reward distribution. We present two applications of our novel bounds to pure exploration and regret minimization logistic bandits improving upon state-of-the-art performance guarantees. For pure exploration, we also provide a lower bound highlighting a dependence on $1/\kappa$ for a family of instances.

翻译：我们建议改善线性后勤模式的固定设计信任度。我们通过最近对后勤损失进行自我协调分析(Foury等人,2020年),大大改进了Li等人(2017年)所约束的最新技术水平(2017年),具体地说,我们的信任度避免直接依赖1美元/卡帕(Kappa)美元,因为Kappa美元是所有军备奖励分配的最小差异。一般而言,1美元/卡帕(Kappa)美元与未知线性参数的规范($\theta ⁇ $)成倍增长。我们对任何特定手臂的奖赏所约束的信任直接取决于该手臂报酬分配的差异。我们提出了我们两个新的界限,以纯粹勘探为目的,并遗憾最大限度地减少利用最先进的业绩保证的后勤匪徒。关于纯度的勘探,我们还提供了一个家庭对1美元/卡帕($)的依赖度较低。

0

相关内容

置信度

1800页33章数学方法精要笔记 —深入数学建模，机器学习和深度学习的数学基础

专知会员服务

249+阅读 · 2020年7月3日

【经典书】贝叶斯编程，378页pdf，Bayesian Programming

【经典书】贝叶斯编程，378页pdf，Bayesian Programming

专知会员服务

250+阅读 · 2020年5月18日

【剑桥大学】统计因果关系的决策理论基础，Decision-theoretic foundations for statistical causality

【剑桥大学】统计因果关系的决策理论基础，Decision-theoretic foundations for statistical causality

专知会员服务

48+阅读 · 2020年5月5日

【伯克利】元学习的元基线，A New Meta-Baseline for Few-Shot Learning

【伯克利】元学习的元基线，A New Meta-Baseline for Few-Shot Learning

专知会员服务

67+阅读 · 2020年3月28日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【图机器学习论文】图摘要方法与应用综述（Graph Summarization Methods and Applications: A Survey）

【图机器学习论文】图摘要方法与应用综述（Graph Summarization Methods and Applications: A Survey）

专知会员服务

42+阅读 · 2019年12月16日

【AAAI2020论文】隐私保留GBDT（Privacy-Preserving Gradient Boosting Decision Trees）

【AAAI2020论文】隐私保留GBDT（Privacy-Preserving Gradient Boosting Decision Trees）

专知会员服务

36+阅读 · 2019年11月15日

【课程】普林斯顿大学19年春季学期《机器学习优化》课程讲义

【课程】普林斯顿大学19年春季学期《机器学习优化》课程讲义

专知会员服务

85+阅读 · 2019年10月29日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

已删除

将门创投

3+阅读 · 2020年8月3日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Jointly Improving Summarization and Sentiment Classification

Jointly Improving Summarization and Sentiment Classification

黑龙江大学自然语言处理实验室

3+阅读 · 2018年6月12日

蒙特卡罗方法(Monte Carlo Methods)

蒙特卡罗方法(Monte Carlo Methods)

数据挖掘入门与实战

6+阅读 · 2018年4月22日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

逻辑回归（Logistic Regression）模型简介

逻辑回归（Logistic Regression）模型简介

全球人工智能

5+阅读 · 2017年11月1日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

Logistic回归第一弹——二项Logistic Regression

Logistic回归第一弹——二项Logistic Regression

机器学习深度学习实战原创交流

3+阅读 · 2015年10月22日

Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature

Arxiv

0+阅读 · 2021年5月13日

Convergence bounds for empirical nonlinear least-squares

Arxiv

0+阅读 · 2021年5月11日

Search Algorithms and Loss Functions for Bayesian Clustering

Arxiv

0+阅读 · 2021年5月10日

New Approximations and Hardness Results for Submodular Partitioning Problems

Arxiv

0+阅读 · 2021年5月10日

Approximation Theory Based Methods for RKHS Bandits

Arxiv

0+阅读 · 2021年5月10日

Impact of Representation Learning in Linear Bandits

Arxiv

0+阅读 · 2021年5月5日

Information Complexity and Generalization Bounds

Arxiv

0+阅读 · 2021年5月4日

Regret Bounds for Stochastic Shortest Path Problems with Linear Function Approximation

Arxiv

0+阅读 · 2021年5月4日

Variational Bayesian Reinforcement Learning with Regret Bounds

Arxiv

3+阅读 · 2018年7月25日

A fast algorithm with minimax optimal guarantees for topic models with an unknown number of topics

Arxiv

7+阅读 · 2018年6月12日

VIP会员

文章信息

相关主题

赌博机/老虎机

state-of-the-art

相关VIP内容

1800页33章数学方法精要笔记 —深入数学建模，机器学习和深度学习的数学基础

专知会员服务

249+阅读 · 2020年7月3日

【经典书】贝叶斯编程，378页pdf，Bayesian Programming

【经典书】贝叶斯编程，378页pdf，Bayesian Programming

专知会员服务

250+阅读 · 2020年5月18日

【剑桥大学】统计因果关系的决策理论基础，Decision-theoretic foundations for statistical causality

【剑桥大学】统计因果关系的决策理论基础，Decision-theoretic foundations for statistical causality

专知会员服务

48+阅读 · 2020年5月5日

【伯克利】元学习的元基线，A New Meta-Baseline for Few-Shot Learning

【伯克利】元学习的元基线，A New Meta-Baseline for Few-Shot Learning

专知会员服务

67+阅读 · 2020年3月28日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【图机器学习论文】图摘要方法与应用综述（Graph Summarization Methods and Applications: A Survey）

【图机器学习论文】图摘要方法与应用综述（Graph Summarization Methods and Applications: A Survey）

专知会员服务

42+阅读 · 2019年12月16日

【AAAI2020论文】隐私保留GBDT（Privacy-Preserving Gradient Boosting Decision Trees）

【AAAI2020论文】隐私保留GBDT（Privacy-Preserving Gradient Boosting Decision Trees）

专知会员服务

36+阅读 · 2019年11月15日

【课程】普林斯顿大学19年春季学期《机器学习优化》课程讲义

【课程】普林斯顿大学19年春季学期《机器学习优化》课程讲义

专知会员服务

85+阅读 · 2019年10月29日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

《美空军条令出版物：战略打击》最新条令

《高能激光武器》22页slides

军事前沿模型

《面向小型无人机或无人飞行器的创新雷达探测与人工智能分类技术》263页

相关资讯

已删除

将门创投

3+阅读 · 2020年8月3日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Jointly Improving Summarization and Sentiment Classification

Jointly Improving Summarization and Sentiment Classification

黑龙江大学自然语言处理实验室

3+阅读 · 2018年6月12日

蒙特卡罗方法(Monte Carlo Methods)

蒙特卡罗方法(Monte Carlo Methods)

数据挖掘入门与实战

6+阅读 · 2018年4月22日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

逻辑回归（Logistic Regression）模型简介

逻辑回归（Logistic Regression）模型简介

全球人工智能

5+阅读 · 2017年11月1日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

Logistic回归第一弹——二项Logistic Regression

Logistic回归第一弹——二项Logistic Regression

机器学习深度学习实战原创交流

3+阅读 · 2015年10月22日

相关论文

Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature

Arxiv

0+阅读 · 2021年5月13日

Convergence bounds for empirical nonlinear least-squares

Arxiv

0+阅读 · 2021年5月11日

Search Algorithms and Loss Functions for Bayesian Clustering

Arxiv

0+阅读 · 2021年5月10日

New Approximations and Hardness Results for Submodular Partitioning Problems

Arxiv

0+阅读 · 2021年5月10日

Approximation Theory Based Methods for RKHS Bandits

Arxiv

0+阅读 · 2021年5月10日

Impact of Representation Learning in Linear Bandits

Arxiv

0+阅读 · 2021年5月5日

Information Complexity and Generalization Bounds

Arxiv

0+阅读 · 2021年5月4日

Regret Bounds for Stochastic Shortest Path Problems with Linear Function Approximation

Arxiv

0+阅读 · 2021年5月4日

Variational Bayesian Reinforcement Learning with Regret Bounds

Arxiv

3+阅读 · 2018年7月25日

A fast algorithm with minimax optimal guarantees for topic models with an unknown number of topics

Arxiv

7+阅读 · 2018年6月12日

微信扫码咨询专知VIP会员