Approximate Information States for Worst-Case Control and Learning in Uncertain Systems - 专知 (Zhuanzhi) Papers

January 12, 2023

Approximate Information States for Worst-Case Control and Learning in Uncertain Systems

Aditya Dave, Nishanth Venkatesh, Andreas A. Malikopoulos

From arXiv. Preliminary results related to this article were reported in arXiv:2203.15271.

In this paper, we investigate discrete-time decision-making problems in uncertain systems with partially observed states. We consider a non-stochastic model, where uncontrolled disturbances acting on the system take values in bounded sets with unknown distributions. We present a general framework for decision-making in such problems by developing the notions of information states and approximate information states. In our definition of an information state, we introduce conditions under which an uncertain variable is sufficient to construct a dynamic program (DP) that computes an optimal strategy. We show that many information states from the literature on worst-case control actions, e.g., the conditional range, are examples of our more general definition. Next, we relax these conditions to define approximate information states using only output variables, which can be learned from output data without knowledge of system dynamics. We use this notion to formulate an approximate DP that yields a strategy with a bounded performance loss. Finally, we illustrate the application of our results in control and reinforcement learning using numerical examples.
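To make the idea of an information state concrete, the following is a minimal sketch of a worst-case (minimax) dynamic program that uses the conditional range as its information state. It is not the authors' formulation: the toy system, the sets `STATES`, `ACTIONS`, `DISTURBANCES`, `NOISES`, the terminal-cost objective, and all function names are hypothetical choices made only for illustration.

```python
from functools import lru_cache
import math

# Hypothetical toy problem (assumed for illustration, not taken from the paper).
STATES = range(5)        # finite state space {0, ..., 4}
ACTIONS = (-1, 0, 1)     # admissible control inputs
DISTURBANCES = (0, 1)    # bounded disturbance set, no distribution assumed
NOISES = (0, 1)          # bounded observation noise set
HORIZON = 3              # number of decision steps
TARGET = 2               # terminal cost is the distance to this state


def step(x, u, w):
    """State update x' = clip(x + u + w) onto the state space."""
    return min(max(x + u + w, 0), 4)


def predict(rng, u):
    """Set of next states reachable from the range `rng` under action u."""
    return frozenset(step(x, u, w) for x in rng for w in DISTURBANCES)


def feasible_observations(pred):
    """Observations y = x + n that some predicted state could produce."""
    return sorted({x + n for x in pred for n in NOISES})


def update(pred, y):
    """Conditional range: predicted states consistent with observation y."""
    return frozenset(x for x in pred if any(x + n == y for n in NOISES))


@lru_cache(maxsize=None)
def value(rng, t):
    """Worst-case cost-to-go when the state is only known to lie in `rng`."""
    if t == HORIZON:
        return max(abs(x - TARGET) for x in rng)
    best = math.inf
    for u in ACTIONS:
        pred = predict(rng, u)
        # The adversary picks the observation leading to the worst range.
        worst = max(value(update(pred, y), t + 1)
                    for y in feasible_observations(pred))
        best = min(best, worst)
    return best


if __name__ == "__main__":
    print(value(frozenset(STATES), 0))  # worst-case cost with no prior knowledge
```

The recursion runs over sets of states rather than over the full observation history, which is the role an information state plays in this sketch. An approximate information state in the sense of the abstract would replace the exact range with a learned, compressed quantity. That step is not shown here.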
