Stochastic Contextual Dueling Bandits under Linear Stochastic Transitivity Models - Zhuanzhi (专知) Papers


February 9, 2022

Stochastic Contextual Dueling Bandits under Linear Stochastic Transitivity Models


Viktor Bengs,Aadirupa Saha,Eyke Hüllermeier

We consider the regret minimization task in a dueling bandits problem with context information. In every round of the sequential decision problem, the learner makes a context-dependent selection of two choice alternatives (arms) to be compared with each other and receives feedback in the form of noisy preference information. We assume that the feedback process is determined by a linear stochastic transitivity model with contextualized utilities (CoLST), and the learner's task is to include the best arm (the one with the highest latent context-dependent utility) in the duel. We propose a computationally efficient algorithm, $\texttt{CoLSTIM}$, which makes its choice by imitating the feedback process using perturbed context-dependent utility estimates of the underlying CoLST model. If each arm is associated with a $d$-dimensional feature vector, we show that $\texttt{CoLSTIM}$ achieves a regret of order $\tilde{O}(\sqrt{dT})$ after $T$ learning rounds. We also establish the optimality of $\texttt{CoLSTIM}$ by showing a lower bound for the weak regret that refines the existing average regret analysis. Our experiments demonstrate its superiority over state-of-the-art algorithms for special cases of CoLST models.
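The feedback model described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not reproduce $\texttt{CoLSTIM}$ itself. It assumes a logistic comparison function (one possible choice for a linear stochastic transitivity model), a hypothetical weight vector `theta`, hypothetical per-arm feature vectors `arms`, and one common definition of weak regret (the utility gap between the best arm and the better of the two dueling arms).

```python
import math
import random


def logistic(z):
    """Comparison function F. The logistic link is an assumption, not taken from the source."""
    return 1.0 / (1.0 + math.exp(-z))


def utility(theta, x):
    """Latent linear utility <theta, x> of an arm with feature vector x."""
    return sum(t * v for t, v in zip(theta, x))


def duel_probability(theta, x_i, x_j, link=logistic):
    """P(arm i beats arm j) = F(utility_i - utility_j)."""
    return link(utility(theta, x_i) - utility(theta, x_j))


def sample_duel(rng, theta, x_i, x_j, link=logistic):
    """Draw one noisy preference: 1 if arm i wins the duel, 0 otherwise."""
    return int(rng.random() < duel_probability(theta, x_i, x_j, link))


def weak_regret(theta, arms, i, j):
    """Gap between the best arm's utility and the better of the two chosen arms."""
    utilities = [utility(theta, x) for x in arms]
    return max(utilities) - max(utilities[i], utilities[j])


# Hypothetical example data (d = 3 features, 4 arms) for a single round.
theta = [1.0, -0.5, 0.25]
arms = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]

rng = random.Random(0)
p01 = duel_probability(theta, arms[0], arms[1])
p10 = duel_probability(theta, arms[1], arms[0])
outcome = sample_duel(rng, theta, arms[0], arms[1])
regret_without_best = weak_regret(theta, arms, 1, 2)
regret_with_best = weak_regret(theta, arms, 0, 3)
```

Under this definition the weak regret is zero whenever the best arm takes part in the duel, which matches the learner's task as stated in the abstract.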

