In this paper we investigate the problem of controlling a partially observed stochastic dynamical system such that its state is difficult to infer using a (fixed-interval) Bayesian smoother. This problem arises naturally in applications in which it is desirable to keep the entire state trajectory of a system concealed. We pose our smoothing-averse control problem as the problem of maximising the (joint) entropy of smoother state estimates (i.e., the joint conditional entropy of the state trajectory given the history of measurements and controls). We show that the entropy of Bayesian smoother estimates for general nonlinear state-space models can be expressed as the sum of entropies of marginal state estimates given by Bayesian filters. This novel additive form allows us to reformulate the smoothing-averse control problem as a fully observed stochastic optimal control problem in terms of the usual concept of the information (or belief) state, and solve the resulting problem via dynamic programming. We illustrate the applicability of smoothing-averse control to privacy in cloud-based control and covert robotic navigation.
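To make the additive structure concrete, the following minimal sketch checks one standard chain-rule decomposition numerically on a small finite-state hidden Markov model. It is an illustration under stated assumptions, not necessarily the exact expression derived in the paper: controls are omitted, the model is discrete, and the decomposition used is $H(X_{0:T} \mid Y_{1:T}) = H(X_0) + \sum_{k=1}^{T} \left[ H(X_k \mid X_{k-1}) + H(X_k \mid Y_{1:k}) - H(X_k \mid Y_{1:k-1}) \right]$, in which every term can be evaluated from the Bayesian filter (belief state) and its one-step prediction. The names `PRIOR`, `TRANS`, `OBS`, `HORIZON` and all numerical values are hypothetical.

```python
import itertools
import math

# Hypothetical two-state, two-output hidden Markov model (illustrative numbers only).
PRIOR = [0.6, 0.4]                  # P(X_0 = i)
TRANS = [[0.9, 0.1], [0.3, 0.7]]    # TRANS[i][j] = P(X_k = j | X_{k-1} = i)
OBS = [[0.8, 0.2], [0.25, 0.75]]    # OBS[j][y] = P(Y_k = y | X_k = j)
HORIZON = 3                         # T


def entropy(probs):
    """Shannon entropy (in nats) of an iterable of probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def smoother_entropy_brute_force():
    """H(X_{0:T} | Y_{1:T}) = H(X_{0:T}, Y_{1:T}) - H(Y_{1:T}), by enumeration."""
    n, m = len(PRIOR), len(OBS[0])
    joint, marginal_y = [], {}
    for xs in itertools.product(range(n), repeat=HORIZON + 1):
        for ys in itertools.product(range(m), repeat=HORIZON):
            p = PRIOR[xs[0]]
            for k in range(1, HORIZON + 1):
                p *= TRANS[xs[k - 1]][xs[k]] * OBS[xs[k]][ys[k - 1]]
            joint.append(p)
            marginal_y[ys] = marginal_y.get(ys, 0.0) + p
    return entropy(joint) - entropy(marginal_y.values())


def _expected_stage_terms(belief, weight, k):
    """Expected sum of stage terms from time k onward, starting from a belief.

    `belief` is the filter P(X_{k-1} | y_{1:k-1}) and `weight` is P(y_{1:k-1}).
    """
    if k > HORIZON:
        return 0.0
    n, m = len(PRIOR), len(OBS[0])
    # One-step prediction P(X_k | y_{1:k-1}).
    predicted = [sum(belief[i] * TRANS[i][j] for i in range(n)) for j in range(n)]
    # Contribution to H(X_k | X_{k-1}), averaged over the belief at time k-1.
    total = weight * sum(belief[i] * entropy(TRANS[i]) for i in range(n))
    # Contribution to -H(X_k | Y_{1:k-1}).
    total -= weight * entropy(predicted)
    for y in range(m):
        p_y = sum(predicted[j] * OBS[j][y] for j in range(n))
        if p_y == 0.0:
            continue
        # Bayesian filter update P(X_k | y_{1:k}).
        filtered = [predicted[j] * OBS[j][y] / p_y for j in range(n)]
        # Contribution to H(X_k | Y_{1:k}), then recurse on the new belief.
        total += weight * p_y * entropy(filtered)
        total += _expected_stage_terms(filtered, weight * p_y, k + 1)
    return total


def smoother_entropy_additive():
    """Same quantity, accumulated stage by stage from filter beliefs only."""
    return entropy(PRIOR) + _expected_stage_terms(PRIOR, 1.0, 1)


print(smoother_entropy_brute_force(), smoother_entropy_additive())
```

The two functions should agree up to floating-point error. The recursion in `_expected_stage_terms` depends on the measurement history only through the current belief, which is the property that allows a stage-additive cost of this kind to be handled by dynamic programming over the belief state.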