POMDPs 活性状态轨迹估计和模糊的光滑元件 (Smoother Entropy for Active State Trajectory Estimation and Obfuscation in POMDPs) - 专知论文

会员服务 ·

0

估计/估计量 · 易处理的 · 部分可观测马尔可夫决策过程 · Processing（编程语言） · 可辨认的 ·

2021 年 8 月 19 日

Smoother Entropy for Active State Trajectory Estimation and Obfuscation in POMDPs

翻译：POMDPs 活性状态轨迹估计和模糊的光滑元件

Timothy L. Molloy,Girish N. Nair

from arxiv, 41 pages, 2 figures, submitted (under review)

We study the problem of controlling a partially observed Markov decision process (POMDP) to either aid or hinder the estimation of its state trajectory by optimising the conditional entropy of the state trajectory given measurements and controls, a quantity we dub the smoother entropy. Our consideration of the smoother entropy contrasts with previous active state estimation and obfuscation approaches that instead resort to measures of marginal (or instantaneous) state uncertainty due to tractability concerns. By establishing novel expressions of the smoother entropy in terms of the usual POMDP belief state, we show that our active estimation and obfuscation problems can be reformulated as Markov decision processes (MDPs) that are fully observed in the belief state. Surprisingly, we identify belief-state MDP reformulations of both active estimation and obfuscation with concave cost and cost-to-go functions, which enables the use of standard POMDP techniques to construct tractable bounded-error (approximate) solutions. We show in simulations that optimisation of the smoother entropy leads to superior trajectory estimation and obfuscation compared to alternative approaches.

翻译：我们研究了控制部分观察到的Markov决定过程(POMDP)的问题,以便通过优化测量和控制,优化国家轨迹的有条件环流,从而帮助或阻碍对其国家轨迹的估计,我们研究了控制部分观测到的Markov决定过程的问题。我们对光滑的环流的考虑与先前的积极国家估计和模糊的方法形成对比,而后者则诉诸于边际(或即时)状态的不确定性措施,因为可移性问题。我们根据POMDP通常的信仰状态,对光滑的环流进行新的表达,我们表明,我们的积极估计和模糊问题可以作为在信仰状态中充分观察到的Markov决定过程(MDPs)重新拟订。令人惊讶的是,我们发现,对主动估计和迷惑两种方法的信念-状态的重新拟订,与连锁成本和成本-成本-go函数相比,使得能够使用标准的POMDP技术来构建可拉动的捆绑的(近)解决方案。我们在模拟中显示,对光滑的诱导器的优化可导致高级的轨迹估计和反转的替代方法。

0

相关内容

估计/估计量

估计/估计量

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

专知会员服务

69+阅读 · 2021年3月27日

【经典书】计算最优传输，209页pdf，Computational Optimal Transport

【经典书】计算最优传输，209页pdf，Computational Optimal Transport

专知会员服务

75+阅读 · 2021年1月10日

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

机器学习与物理科学（Machine learning and the physical sciences），附44页pdf

机器学习与物理科学（Machine learning and the physical sciences），附44页pdf

专知会员服务

51+阅读 · 2019年12月10日

【AAAI 2019 Tutorial】城市交通控制的规划与调度方法（Planning and Scheduling Approaches for Urban Traffic Control），Scott Sanner，Mauro Vallati，Stephen F. Smith

【AAAI 2019 Tutorial】城市交通控制的规划与调度方法（Planning and Scheduling Approaches for Urban Traffic Control），Scott Sanner，Mauro Vallati，Stephen F. Smith

专知会员服务

8+阅读 · 2019年11月18日

【目标检测 | 2019最新综述】目标检测的最新进展，附40页PDF，Recent Advances in Deep Learning for Object Detection

【目标检测 | 2019最新综述】目标检测的最新进展，附40页PDF，Recent Advances in Deep Learning for Object Detection

专知会员服务

85+阅读 · 2019年11月15日

【ICCV 2019 Workshop】Complete Dictionary Learning via L4-Norm Maximization over the Orthogonal Grou，加州大学伯克利分校马毅

【ICCV 2019 Workshop】Complete Dictionary Learning via L4-Norm Maximization over the Orthogonal Grou，加州大学伯克利分校马毅

专知会员服务

16+阅读 · 2019年10月31日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

95 FPS！超快速3D目标检测网络开源了！SFA3D：基于LiDAR的实时、准确的3D目标检测模型

95 FPS！超快速3D目标检测网络开源了！SFA3D：基于LiDAR的实时、准确的3D目标检测模型

CVer

4+阅读 · 2020年11月14日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

2018机器学习开源资源盘点

2018机器学习开源资源盘点

专知

6+阅读 · 2019年2月2日

【泡泡一分钟】LIMO：激光和单目相机融合的视觉里程计

【泡泡一分钟】LIMO：激光和单目相机融合的视觉里程计

泡泡机器人SLAM

12+阅读 · 2019年1月16日

TCN v2 + 3Dconv 运动信息

TCN v2 + 3Dconv 运动信息

CreateAMind

4+阅读 · 2019年1月8日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

人体姿态估计资源大列表（Human Pose Estimation）

人体姿态估计资源大列表（Human Pose Estimation）

专知

9+阅读 · 2018年10月6日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

The Geometry of Memoryless Stochastic Policy Optimization in Infinite-Horizon POMDPs

The Geometry of Memoryless Stochastic Policy Optimization in Infinite-Horizon POMDPs

Arxiv

0+阅读 · 2021年10月15日

Almost Optimal Batch-Regret Tradeoff for Batch Linear Contextual Bandits

Arxiv

0+阅读 · 2021年10月15日

Anomaly Detection in Multi-Agent Trajectories for Automated Driving

Arxiv

0+阅读 · 2021年10月15日

Identifiable Energy-based Representations: An Application to Estimating Heterogeneous Causal Effects

Arxiv

0+阅读 · 2021年10月15日

A Proximal Algorithm for Sampling from Non-smooth Potentials

Arxiv

0+阅读 · 2021年10月9日

Hitting the Target: Stopping Active Learning at the Cost-Based Optimum

Arxiv

0+阅读 · 2021年10月7日

A Step Towards Efficient Evaluation of Complex Perception Tasks in Simulation

Arxiv

0+阅读 · 2021年9月28日

Active inference, Bayesian optimal design, and expected utility

Arxiv

0+阅读 · 2021年9月21日

Decomposed Mutual Information Estimation for Contrastive Representation Learning

Arxiv

11+阅读 · 2021年6月25日

Signal Processing and Piecewise Convex Estimation

Arxiv

4+阅读 · 2018年3月14日

VIP会员

文章信息

相关主题

估计/估计量

部分可观测马尔可夫决策过程

Processing（编程语言）

相关VIP内容

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

专知会员服务

69+阅读 · 2021年3月27日

【经典书】计算最优传输，209页pdf，Computational Optimal Transport

【经典书】计算最优传输，209页pdf，Computational Optimal Transport

专知会员服务

75+阅读 · 2021年1月10日

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

机器学习与物理科学（Machine learning and the physical sciences），附44页pdf

机器学习与物理科学（Machine learning and the physical sciences），附44页pdf

专知会员服务

51+阅读 · 2019年12月10日

【AAAI 2019 Tutorial】城市交通控制的规划与调度方法（Planning and Scheduling Approaches for Urban Traffic Control），Scott Sanner，Mauro Vallati，Stephen F. Smith

【AAAI 2019 Tutorial】城市交通控制的规划与调度方法（Planning and Scheduling Approaches for Urban Traffic Control），Scott Sanner，Mauro Vallati，Stephen F. Smith

专知会员服务

8+阅读 · 2019年11月18日

【目标检测 | 2019最新综述】目标检测的最新进展，附40页PDF，Recent Advances in Deep Learning for Object Detection

【目标检测 | 2019最新综述】目标检测的最新进展，附40页PDF，Recent Advances in Deep Learning for Object Detection

专知会员服务

85+阅读 · 2019年11月15日

【ICCV 2019 Workshop】Complete Dictionary Learning via L4-Norm Maximization over the Orthogonal Grou，加州大学伯克利分校马毅

【ICCV 2019 Workshop】Complete Dictionary Learning via L4-Norm Maximization over the Orthogonal Grou，加州大学伯克利分校马毅

专知会员服务

16+阅读 · 2019年10月31日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

热门VIP内容

开通专知VIP会员享更多权益服务

【牛津博士论文】零样本强化学习综述

《美军条令：陆军指挥官与规划人员地理空间指南》60页

战术边缘指挥控制：防务面临的核心挑战

迈向开放世界检测：综述

相关资讯

95 FPS！超快速3D目标检测网络开源了！SFA3D：基于LiDAR的实时、准确的3D目标检测模型

95 FPS！超快速3D目标检测网络开源了！SFA3D：基于LiDAR的实时、准确的3D目标检测模型

CVer

4+阅读 · 2020年11月14日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

2018机器学习开源资源盘点

2018机器学习开源资源盘点

专知

6+阅读 · 2019年2月2日

【泡泡一分钟】LIMO：激光和单目相机融合的视觉里程计

【泡泡一分钟】LIMO：激光和单目相机融合的视觉里程计

泡泡机器人SLAM

12+阅读 · 2019年1月16日

TCN v2 + 3Dconv 运动信息

TCN v2 + 3Dconv 运动信息

CreateAMind

4+阅读 · 2019年1月8日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

人体姿态估计资源大列表（Human Pose Estimation）

人体姿态估计资源大列表（Human Pose Estimation）

专知

9+阅读 · 2018年10月6日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

The Geometry of Memoryless Stochastic Policy Optimization in Infinite-Horizon POMDPs

The Geometry of Memoryless Stochastic Policy Optimization in Infinite-Horizon POMDPs

Arxiv

0+阅读 · 2021年10月15日

Almost Optimal Batch-Regret Tradeoff for Batch Linear Contextual Bandits

Arxiv

0+阅读 · 2021年10月15日

Anomaly Detection in Multi-Agent Trajectories for Automated Driving

Arxiv

0+阅读 · 2021年10月15日

Identifiable Energy-based Representations: An Application to Estimating Heterogeneous Causal Effects

Arxiv

0+阅读 · 2021年10月15日

A Proximal Algorithm for Sampling from Non-smooth Potentials

Arxiv

0+阅读 · 2021年10月9日

Hitting the Target: Stopping Active Learning at the Cost-Based Optimum

Arxiv

0+阅读 · 2021年10月7日

A Step Towards Efficient Evaluation of Complex Perception Tasks in Simulation

Arxiv

0+阅读 · 2021年9月28日

Active inference, Bayesian optimal design, and expected utility

Arxiv

0+阅读 · 2021年9月21日

Decomposed Mutual Information Estimation for Contrastive Representation Learning

Arxiv

11+阅读 · 2021年6月25日

Signal Processing and Piecewise Convex Estimation

Arxiv

4+阅读 · 2018年3月14日

微信扫码咨询专知VIP会员