Discrete-time stochastic optimal control remains a challenging problem for general nonlinear systems under significant uncertainty, with practical solvers typically relying on the certainty equivalence assumption, replanning, and/or extensive regularization. Control as inference is an approach that frames stochastic control as an equivalent inference problem, and it has demonstrated desirable qualities over existing methods, namely in exploration and regularization. We look specifically at the input inference for control (i2c) algorithm and derive three key characteristics that enable advanced trajectory optimization: an 'expert' linear Gaussian controller that combines the benefits of open-loop optima and closed-loop variance reduction when optimizing for nonlinear systems; inherent adaptive risk sensitivity from the inference formulation; and covariance control functionality with only a minor algorithmic adjustment.