This paper addresses a continuous-time, continuous-space chance-constrained stochastic optimal control (SOC) problem via a Hamilton-Jacobi-Bellman (HJB) partial differential equation (PDE). Through Lagrangian relaxation, we convert the chance-constrained (risk-constrained) SOC problem into a risk-minimizing SOC problem whose cost function possesses the time-additive Bellman structure. We show that the risk-minimizing control synthesis is equivalent to solving an HJB PDE whose boundary condition can be tuned appropriately to achieve a desired level of safety. Furthermore, we show that the proposed risk-minimizing control problem can be viewed as a generalization of the problem of estimating the risk associated with a given control policy. We explore two numerical techniques, namely the path integral method and the finite difference method (FDM), to solve a class of risk-minimizing SOC problems whose associated HJB equation is linearizable via the Cole-Hopf transformation. Using a 2D robot navigation example, we validate the proposed control synthesis framework and compare the solutions obtained with the path integral method and the FDM.
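As an illustrative sketch of the linearization referred to above, consider the standard path-integral control setting: control-affine dynamics $dx = \bigl(f(x) + G(x)u\bigr)\,dt + \Sigma(x)\,dw$, quadratic control cost $\tfrac{1}{2}u^{\top}Ru$, state cost $q(x)$, terminal cost $\phi(x)$, and the compatibility condition $\Sigma\Sigma^{\top} = \lambda\, G R^{-1} G^{\top}$ for some $\lambda > 0$. This notation is assumed for illustration and is not necessarily the paper's exact formulation. Under these assumptions, the exponential (Cole-Hopf) transformation $\psi(t,x) = \exp\!\bigl(-V(t,x)/\lambda\bigr)$ of the value function $V$ turns the nonlinear HJB equation into a linear backward PDE:
\[
  -\partial_t \psi(t,x) \;=\; -\frac{q(x)}{\lambda}\,\psi(t,x)
  \;+\; f(x)^{\top}\nabla_x \psi(t,x)
  \;+\; \tfrac{1}{2}\,\operatorname{tr}\!\bigl(\Sigma(x)\Sigma(x)^{\top}\,\nabla_x^{2}\psi(t,x)\bigr),
  \qquad \psi(T,x) \;=\; \exp\!\bigl(-\phi(x)/\lambda\bigr).
\]
This linear equation can be solved backward in time either by a Feynman-Kac/Monte Carlo (path integral) estimate of $\psi$ or by a finite-difference discretization, corresponding to the two numerical techniques compared in the paper.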