Bounded Memory Adversarial Bandits with Composite Anonymous Delayed Feedback


April 28, 2022


Zongqi Wan, Xiaoming Sun, Jialin Zhang

From arXiv, IJCAI 2022

We study the adversarial bandit problem with composite anonymous delayed feedback. In this setting, the loss of an action is split into $d$ components that are spread over consecutive rounds after the action is chosen. In each round, the algorithm observes only the aggregate of the loss components arriving from the latest $d$ rounds. Previous works focus on the oblivious adversarial setting, while we investigate the harder non-oblivious setting. We show that the non-oblivious setting incurs $\Omega(T)$ pseudo regret even when the loss sequence has bounded memory. However, we propose a wrapper algorithm that enjoys $o(T)$ policy regret on many adversarial bandit problems under the assumption that the loss sequence has bounded memory. In particular, for the $K$-armed bandit and bandit convex optimization, we obtain an $\mathcal{O}(T^{2/3})$ policy regret bound. We also prove a matching lower bound for the $K$-armed bandit. Our lower bound holds even when the loss sequence is oblivious but the delay is non-oblivious. It answers the open problem proposed in \cite{wang2021adaptive}, showing that non-oblivious delay is enough to incur $\tilde{\Omega}(T^{2/3})$ regret.
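The feedback model described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm or code: the function names, the even split of each loss into $d$ parts, and the uniformly random action choice are all assumptions made for illustration, and the loss table here is oblivious (fixed in advance), which is simpler than the non-oblivious setting the paper studies. The sketch only shows how the value observed in round $t$ aggregates one component from each of the latest $d$ rounds.

```python
import random


def observed_feedback(components, actions, t):
    """Return the aggregate loss observed at round t.

    components[s][a] is a list of d parts of the loss of action a at round s;
    part j of that loss arrives at round s + j (hypothetical layout).
    actions[s] is the action actually played at round s.
    """
    d = len(components[0][0])
    total = 0.0
    for j in range(d):
        s = t - j  # round whose j-th component arrives now
        if s >= 0:
            total += components[s][actions[s]][j]
    return total


def simulate(T=6, K=2, d=3, seed=0):
    """Simulate T rounds with K arms and delay window d (all assumed values)."""
    rng = random.Random(seed)
    components = []
    for _ in range(T):
        row = []
        for _ in range(K):
            loss = rng.random()          # loss in [0, 1)
            row.append([loss / d] * d)   # assumption: even split into d parts
        components.append(row)
    # Assumption: a placeholder learner that picks arms uniformly at random.
    actions = [rng.randrange(K) for _ in range(T)]
    feedback = [observed_feedback(components, actions, t) for t in range(T)]
    return components, actions, feedback


if __name__ == "__main__":
    _, actions, feedback = simulate()
    for t, (a, f) in enumerate(zip(actions, feedback)):
        print(f"round {t}: played arm {a}, observed aggregate {f:.3f}")
```

Because the learner sees only the sum, it cannot tell which earlier action produced which part of the observed value; that is the "anonymous" part of the feedback.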

