Constrained partially observable Markov decision processes (CPOMDPs) have been used to model various real-world phenomena. However, they are notoriously difficult to solve to optimality, and there exist only a few approximation methods for obtaining high-quality solutions. In this study, we use grid-based approximations in combination with linear programming (LP) models to generate approximate policies for CPOMDPs. We consider five CPOMDP problem instances and conduct a detailed numerical study of both their finite and infinite horizon formulations. We first establish the quality of the approximate unconstrained POMDP policies through a comparative analysis with exact solution methods. We then show the performance of the LP-based CPOMDP solution approaches for varying budget levels (i.e., cost limits) for different problem instances. Finally, we show the flexibility of LP-based approaches by applying deterministic policy constraints, and investigate the impact that these constraints have on collected rewards and CPU run time. Our analysis demonstrates that LP models can effectively generate approximate policies for both finite and infinite horizon problems, while providing the flexibility to incorporate various additional constraints into the underlying model.