高山进程土匪的 " 冷漠和良好行动识别 " (Lenient Regret and Good-Action Identification in Gaussian Process Bandits) - 专知论文

会员服务 ·

0

赌博机/老虎机 · Processing（编程语言） · 阈值 · 泛函 · 优化器 ·

2021 年 5 月 26 日

Lenient Regret and Good-Action Identification in Gaussian Process Bandits

翻译：高山进程土匪的 " 冷漠和良好行动识别 "

Xu Cai,Selwyn Gomes,Jonathan Scarlett

from arxiv, ICML 2021

In this paper, we study the problem of Gaussian process (GP) bandits under relaxed optimization criteria stating that any function value above a certain threshold is "good enough". On the theoretical side, we study various {\em lenient regret} notions in which all near-optimal actions incur zero penalty, and provide upper bounds on the lenient regret for GP-UCB and an elimination algorithm, circumventing the usual $O(\sqrt{T})$ term (with time horizon $T$) resulting from zooming extremely close towards the function maximum. In addition, we complement these upper bounds with algorithm-independent lower bounds. On the practical side, we consider the problem of finding a single "good action" according to a known pre-specified threshold, and introduce several good-action identification algorithms that exploit knowledge of the threshold. We experimentally find that such algorithms can often find a good action faster than standard optimization-based approaches.

翻译：在本文中,我们研究了高山进程(GP)匪徒的问题,根据宽松的优化标准,指出任何超过某一阈值的功能值都“足够好 ” 。在理论方面,我们研究了所有接近最佳的行动都受到零处罚的各种概念,并为GP-UCB和清除算法提供了宽度遗憾的上限,绕过通常的美元(以时间范围计)条件(用时间范围计$T$),因为放大到功能上限非常近。此外,我们用依赖算法的下限来补充这些上限。在实际方面,我们考虑根据已知的预设阈值寻找单一“良好行动”的问题,并引入几种利用阈值知识的良好行动识别算法。我们实验性地发现,这种算法往往能找到比标准的优化方法更快的好行动。

0

相关内容

赌博机/老虎机

赌博机/老虎机

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

专知会员服务

69+阅读 · 2021年3月27日

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

回顾机器学习公平的数学框架，Review of Mathematical frameworks for Fairness in Machine Learning

回顾机器学习公平的数学框架，Review of Mathematical frameworks for Fairness in Machine Learning

专知会员服务

38+阅读 · 2020年5月30日

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

专知会员服务

77+阅读 · 2020年2月8日

UC.Berkeley CS189讲义教材:《机器学习全面指南》，185页pdf

专知会员服务

162+阅读 · 2020年1月16日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

245+阅读 · 2019年10月21日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

LibRec 精选：AutoML for Contextual Bandits

LibRec 精选：AutoML for Contextual Bandits

LibRec智能推荐

7+阅读 · 2019年9月19日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

集成学习入门

集成学习入门

论智

8+阅读 · 2018年3月5日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Gaussian Process Boosting

Arxiv

0+阅读 · 2021年7月16日

Towards a Dimension-Free Understanding of Adaptive Linear Control

Arxiv

0+阅读 · 2021年7月15日

Independence weights for causal inference with continuous exposures

Arxiv

1+阅读 · 2021年7月15日

Fundamental limits and algorithms for sparse linear regression with sublinear sparsity

Arxiv

0+阅读 · 2021年7月14日

Sublinear Regret for Learning POMDPs

Arxiv

0+阅读 · 2021年7月14日

A Modern Introduction to Online Learning

A Modern Introduction to Online Learning

Arxiv

21+阅读 · 2019年12月31日

Variational Bayesian Reinforcement Learning with Regret Bounds

Arxiv

3+阅读 · 2018年7月25日

A Tour of Reinforcement Learning: The View from Continuous Control

Arxiv

6+阅读 · 2018年6月25日

The Search Problem in Mixture Models

Arxiv

3+阅读 · 2018年2月24日

Inverse Reinforcement Learning via Deep Gaussian Process

Arxiv

3+阅读 · 2017年5月4日

VIP会员

文章信息

相关主题

赌博机/老虎机

Processing（编程语言）

相关VIP内容

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

专知会员服务

69+阅读 · 2021年3月27日

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

回顾机器学习公平的数学框架，Review of Mathematical frameworks for Fairness in Machine Learning

回顾机器学习公平的数学框架，Review of Mathematical frameworks for Fairness in Machine Learning

专知会员服务

38+阅读 · 2020年5月30日

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

【新书：机器学习简介】《A Concise Introduction to Machine Learning》by A.C. Faul (CRC 2019)

专知会员服务

77+阅读 · 2020年2月8日

UC.Berkeley CS189讲义教材:《机器学习全面指南》，185页pdf

专知会员服务

162+阅读 · 2020年1月16日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

245+阅读 · 2019年10月21日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

小规模训练指南：打造世界级大语言模型的关键方法

无人机编队飞行：复杂环境中作战的策略、挑战与应用

大模型APP，AI时代第一个爆款

从数据中心视角出发的高效大语言模型训练综述

相关资讯

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

LibRec 精选：AutoML for Contextual Bandits

LibRec 精选：AutoML for Contextual Bandits

LibRec智能推荐

7+阅读 · 2019年9月19日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

集成学习入门

集成学习入门

论智

8+阅读 · 2018年3月5日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

Gaussian Process Boosting

Arxiv

0+阅读 · 2021年7月16日

Towards a Dimension-Free Understanding of Adaptive Linear Control

Arxiv

0+阅读 · 2021年7月15日

Independence weights for causal inference with continuous exposures

Arxiv

1+阅读 · 2021年7月15日

Fundamental limits and algorithms for sparse linear regression with sublinear sparsity

Arxiv

0+阅读 · 2021年7月14日

Sublinear Regret for Learning POMDPs

Arxiv

0+阅读 · 2021年7月14日

A Modern Introduction to Online Learning

A Modern Introduction to Online Learning

Arxiv

21+阅读 · 2019年12月31日

Variational Bayesian Reinforcement Learning with Regret Bounds

Arxiv

3+阅读 · 2018年7月25日

A Tour of Reinforcement Learning: The View from Continuous Control

Arxiv

6+阅读 · 2018年6月25日

The Search Problem in Mixture Models

Arxiv

3+阅读 · 2018年2月24日

Inverse Reinforcement Learning via Deep Gaussian Process

Arxiv

3+阅读 · 2017年5月4日

微信扫码咨询专知VIP会员