We consider the development of adaptive, instance-dependent algorithms for interactive decision making (bandits, reinforcement learning, and beyond) that, rather than only performing well in the worst case, adapt to favorable properties of real-world instances for improved performance. We aim for instance-optimality, a strong notion of adaptivity which asserts that, on any particular problem instance, the algorithm under consideration outperforms all consistent algorithms. Instance-optimality enjoys a rich asymptotic theory originating from the work of \citet{lai1985asymptotically,graves1997asymptotically}, but non-asymptotic guarantees have remained elusive outside of certain special cases. Even for problems as simple as tabular reinforcement learning, existing algorithms do not attain instance-optimal performance until the number of rounds of interaction is doubly exponential in the number of states. In this paper, we take the first step toward developing a non-asymptotic theory of instance-optimal decision making with general function approximation. We introduce a new complexity measure, the Allocation-Estimation Coefficient (AEC), and provide a new algorithm, $\mathsf{AE}^2$, which attains non-asymptotic instance-optimal performance at a rate controlled by the AEC. Our results recover the best known guarantees for well-studied problems such as finite-armed and linear bandits and, when specialized to tabular reinforcement learning, attain the first instance-optimal regret bounds with polynomial dependence on all problem parameters, improving exponentially over prior work. We complement these results with lower bounds showing that i) existing notions of statistical complexity are insufficient to derive non-asymptotic guarantees, and ii) under certain technical conditions, boundedness of the AEC is necessary to learn an instance-optimal allocation of decisions in finite time.
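As background for the notion of instance-optimality referenced above, a standard statement of the classical asymptotic benchmark of \citet{lai1985asymptotically} for the finite-armed stochastic bandit (under suitable regularity conditions on the reward distributions) is sketched below. The notation is introduced here for illustration and is not taken from this abstract: $\nu_a$ denotes the reward distribution of arm $a$, $\mu_a$ its mean, $\mu^{\star}$ the optimal mean, $\Delta_a = \mu^{\star} - \mu_a$ the gap, and $\mathrm{KL}$ the Kullback--Leibler divergence. Any consistent algorithm satisfies
\[
  % Classical instance-dependent lower bound (Lai & Robbins, 1985):
  % the asymptotic regret of any consistent algorithm is bounded below,
  % instance by instance, in terms of the gaps and KL divergences.
  \liminf_{T \to \infty} \frac{\mathbb{E}[\mathrm{Reg}(T)]}{\log T}
  \;\geq\; \sum_{a \,:\, \Delta_a > 0} \frac{\Delta_a}{\mathrm{KL}(\nu_a, \nu_{a^{\star}})},
\]
and an instance-optimal algorithm matches this instance-dependent benchmark on every problem instance, rather than only in the worst case.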