Sequential- and Parallel- Constrained Max-value Entropy Search via Information Lower Bound

November 24, 2021

Shion Takeno, Tomoyuki Tamura, Kazuki Shitara, Masayuki Karasuyama

From arXiv, 39 pages, 10 figures

Max-value entropy search (MES) is one of the state-of-the-art approaches in Bayesian optimization (BO). In this paper, we propose a novel variant of MES for constrained problems, called Constrained MES via Information lower BOund (CMES-IBO), that is based on a Monte Carlo (MC) estimator of a lower bound of the mutual information (MI). We first define the MI so that the max-value it involves can incorporate uncertainty with respect to feasibility. Then, we derive a lower bound of the MI that guarantees non-negativity, whereas a constrained counterpart of conventional MES can be negative. We further provide a theoretical analysis that assures the low variability of our estimator, a property that has never been investigated for any existing information-theoretic BO method. Moreover, using the conditional MI, we extend CMES-IBO to the parallel setting while maintaining these desirable properties. We demonstrate the effectiveness of CMES-IBO on several benchmark functions and a real-world problem.
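To make the non-negativity claim concrete, here is a minimal sketch; it is not the paper's exact definition. It assumes the bound is estimated by averaging -log(1 - p) over sampled max-values, where p is the probability that a candidate point both exceeds the sampled max-value and satisfies every constraint. It also assumes independent Gaussian posteriors for the objective and for each constraint. The function names, arguments, and the clamping constant are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lower_bound_acquisition(mu_f, sigma_f, constraints, max_value_samples):
    """Monte Carlo average of -log(1 - p) over sampled max-values.

    mu_f, sigma_f      -- assumed Gaussian posterior of the objective at x
    constraints        -- list of (mu_g, sigma_g, threshold); a constraint is
                          treated as satisfied when g(x) >= threshold
    max_value_samples  -- sampled max-values f* (how they are drawn is
                          outside the scope of this sketch)
    """
    # Probability that every constraint holds, assuming independence.
    p_feasible = 1.0
    for mu_g, sigma_g, threshold in constraints:
        p_feasible *= 1.0 - normal_cdf((threshold - mu_g) / sigma_g)

    total = 0.0
    for f_star in max_value_samples:
        # Probability that the objective at x reaches the sampled max-value.
        p_exceed = 1.0 - normal_cdf((f_star - mu_f) / sigma_f)
        # Clamp away from 1 so the logarithm stays finite (illustrative choice).
        p_joint = min(p_exceed * p_feasible, 1.0 - 1e-12)
        total += -math.log1p(-p_joint)
    return total / len(max_value_samples)


if __name__ == "__main__":
    value = lower_bound_acquisition(
        mu_f=0.0,
        sigma_f=1.0,
        constraints=[(0.0, 1.0, 0.0)],
        max_value_samples=[0.0, 0.5, 1.0],
    )
    print(value)
```

Because p lies in [0, 1), every term -log(1 - p) is non-negative, so the average is non-negative for any set of sampled max-values. This mirrors the property the abstract attributes to the lower bound, under the assumptions stated above.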

