Thresholded Lasso Bandit - Zhuanzhi Papers


Bandits · Estimation/Estimators · Thresholding · Reward function · Vectorization

June 19, 2022

Thresholded Lasso Bandit


Kaito Ariu, Kenshi Abe, Alexandre Proutière

From arXiv; International Conference on Machine Learning (ICML 2022), Proceedings of Machine Learning Research

In this paper, we revisit the regret minimization problem in sparse stochastic contextual linear bandits, where feature vectors may be of large dimension $d$, but where the reward function depends on a few, say $s_0\ll d$, of these features only. We present Thresholded Lasso bandit, an algorithm that (i) estimates the vector defining the reward function as well as its sparse support, i.e., significant feature elements, using the Lasso framework with thresholding, and (ii) selects an arm greedily according to this estimate projected on its support. The algorithm does not require prior knowledge of the sparsity index $s_0$ and can be parameter-free under some symmetric assumptions. For this simple algorithm, we establish non-asymptotic regret upper bounds scaling as $\mathcal{O}( \log d + \sqrt{T} )$ in general, and as $\mathcal{O}( \log d + \log T)$ under the so-called margin condition (a probabilistic condition on the separation of the arm rewards). The regret of previous algorithms scales as $\mathcal{O}( \log d + \sqrt{T \log (d T)})$ and $\mathcal{O}( \log T \log d)$ in the two settings, respectively. Through numerical experiments, we confirm that our algorithm outperforms existing methods.
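
The two steps described in the abstract (Lasso estimation with thresholding, then greedy arm selection using the estimate restricted to its support) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the regularization schedule, the threshold, or the solver, so the schedule `lam0 * sqrt(2 log(t+1) log(d) / t)`, the threshold `4 * lam`, the least-squares refit on the estimated support, the ISTA solver, and the Gaussian contexts are all hypothetical choices made for illustration. All function names are likewise hypothetical.

```python
import numpy as np


def lasso_ista(X, y, lam, n_iter=200):
    """Minimise (1/n)*||y - X @ theta||^2 + lam*||theta||_1 by proximal gradient (ISTA)."""
    n, d = X.shape
    theta = np.zeros(d)
    # Lipschitz constant of the gradient of the smooth part.
    lipschitz = 2.0 * np.linalg.norm(X, 2) ** 2 / n
    if lipschitz == 0.0:
        return theta
    for _ in range(n_iter):
        grad = 2.0 * X.T @ (X @ theta - y) / n
        z = theta - grad / lipschitz
        # Soft-thresholding is the proximal operator of the L1 penalty.
        theta = np.sign(z) * np.maximum(np.abs(z) - lam / lipschitz, 0.0)
    return theta


def thresholded_estimate(X, y, lam, thresh):
    """Lasso fit, then keep coordinates above `thresh` and refit on them.

    The least-squares refit is an assumption of this sketch.
    """
    theta0 = lasso_ista(X, y, lam)
    support = np.flatnonzero(np.abs(theta0) > thresh)
    theta = np.zeros(X.shape[1])
    if support.size > 0:
        coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        theta[support] = coef
    return theta, support


def run_bandit(theta_star, n_arms=5, horizon=300, lam0=0.1, noise=0.1, seed=0):
    """Simulate a greedy bandit driven by the thresholded Lasso estimate."""
    rng = np.random.default_rng(seed)
    d = theta_star.size
    theta_hat = np.zeros(d)
    support = np.array([], dtype=int)
    X_hist, y_hist = [], []
    regret = 0.0
    for t in range(1, horizon + 1):
        contexts = rng.normal(size=(n_arms, d))       # assumed context distribution
        arm = int(np.argmax(contexts @ theta_hat))    # greedy selection
        mean_rewards = contexts @ theta_star
        reward = mean_rewards[arm] + noise * rng.normal()
        regret += mean_rewards.max() - mean_rewards[arm]
        X_hist.append(contexts[arm])
        y_hist.append(reward)
        # Assumed schedule: regularization shrinks roughly like sqrt(log(t) / t).
        lam = lam0 * np.sqrt(2.0 * np.log(t + 1) * np.log(d) / t)
        theta_hat, support = thresholded_estimate(
            np.array(X_hist), np.array(y_hist), lam, thresh=4.0 * lam
        )
    return regret, support, theta_hat


if __name__ == "__main__":
    theta_star = np.zeros(50)
    theta_star[:3] = 1.0                              # s_0 = 3 active features out of d = 50
    regret, support, _ = run_bandit(theta_star)
    print("cumulative regret:", regret)
    print("estimated support:", support.tolist())
```

Note that the sketch never takes the sparsity index $s_0$ as an input, which mirrors the abstract's claim that the algorithm does not need it; the support size is determined by the threshold alone.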

