The lack of stability guarantees restricts the practical use of learning-based methods in core control problems in robotics. We develop new methods for learning neural control policies and neural Lyapunov critic functions in the model-free reinforcement learning (RL) setting. We use sample-based approaches and the Almost Lyapunov function conditions to estimate the region of attraction and invariance properties through the learned Lyapunov critic functions. The methods enhance the stability of neural controllers for various nonlinear systems, including automobile and quadrotor control.
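As a rough illustration of what a sample-based region-of-attraction estimate can look like, the sketch below checks the Lyapunov decrease condition on random state samples and reports the largest sublevel set of V that contains no sampled violation. It is not the method of this work: the closed-loop dynamics, the quadratic candidate V(x) = ||x||^2 (standing in for a learned Lyapunov critic), the sampling box, and all function names are hypothetical choices made only for this example.

```python
import numpy as np


def dynamics(x):
    """Toy closed-loop vector field x_dot = -x + ||x||^2 * x (assumed for illustration)."""
    r2 = np.sum(x**2, axis=1, keepdims=True)
    return -x + r2 * x


def lyapunov(x):
    """Hypothetical Lyapunov candidate V(x) = ||x||^2, a stand-in for a learned critic."""
    return np.sum(x**2, axis=1)


def lie_derivative(x):
    """V_dot(x) = grad V(x) . f(x), with grad V(x) = 2x for the quadratic candidate."""
    return np.sum(2.0 * x * dynamics(x), axis=1)


def estimate_roa_level(samples):
    """Return (level, violation_fraction).

    level is the smallest V value among samples that violate V_dot < 0, so the
    sublevel set {V < level} contains no sampled violation. violation_fraction
    is the share of all samples where the decrease condition fails.
    """
    v = lyapunov(samples)
    violating = lie_derivative(samples) >= 0.0
    level = v[violating].min() if violating.any() else v.max()
    return level, violating.mean()


rng = np.random.default_rng(0)
samples = rng.uniform(-2.0, 2.0, size=(20000, 2))  # sampling box is an assumption
level, violation_fraction = estimate_roa_level(samples)
print(f"estimated sublevel set: V(x) < {level:.3f}")
print(f"fraction of samples violating decrease: {violation_fraction:.3f}")
```

For this toy system the decrease condition holds exactly when ||x|| < 1, so the estimated level should land close to 1. The result is only a statistical estimate over the drawn samples, not a certificate. Conditions of the Almost Lyapunov type are meant to tolerate violations on small sets rather than requiring the decrease condition everywhere.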