We present an approach for feedback motion planning of systems with unknown dynamics that provides guarantees on safety, reachability, and stability about the goal. Given a learned control-affine approximation of the true dynamics, we estimate the Lipschitz constant of the difference between the true and learned dynamics to determine a trusted domain for the learned model. Provided the system has at least as many controls as states, we further derive the conditions under which a one-step feedback law exists. This allows for a small bound on the tracking error when the trajectory is executed on the real system. Our method imposes the check for the existence of the feedback law as constraints in a sampling-based planner, which returns a feedback policy ensuring that, under the true dynamics, the goal is reachable, the path is safe in execution, and the closed-loop system is invariant in a small set about the goal. We demonstrate our approach by planning with learned models of a 6D quadrotor and a 7DOF Kuka arm. We show that a baseline that plans with the same learned dynamics, without considering the error bound or the existence of the feedback law, can fail to stabilize around the plan and become unsafe.
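To make the role of the Lipschitz constant concrete, the following is a minimal illustrative sketch, not the estimator used in this work. It assumes we have sampled states `points` and the corresponding model errors `errors` (true minus learned dynamics); both names and both helper functions are hypothetical. It takes the largest pairwise slope as a crude estimate of the Lipschitz constant (in general a lower bound on the true constant), then bounds the model error at a query state from the samples.

```python
import numpy as np


def estimate_lipschitz(points, errors):
    """Largest pairwise slope ||e_i - e_j|| / ||x_i - x_j|| over the samples.

    Assumption: a simple empirical estimate; it can underestimate the
    true Lipschitz constant of the error function.
    """
    points = np.asarray(points, dtype=float)
    points = points.reshape(len(points), -1)
    errors = np.asarray(errors, dtype=float).reshape(len(points), -1)
    best = 0.0
    for i in range(len(points)):
        dx = np.linalg.norm(points[i + 1:] - points[i], axis=1)
        de = np.linalg.norm(errors[i + 1:] - errors[i], axis=1)
        mask = dx > 1e-12  # skip coincident samples
        if mask.any():
            best = max(best, float(np.max(de[mask] / dx[mask])))
    return best


def error_bound(query, points, errors, lipschitz):
    """Bound the error norm at `query` by min_i (||e_i|| + L * ||query - x_i||)."""
    points = np.asarray(points, dtype=float)
    points = points.reshape(len(points), -1)
    errors = np.asarray(errors, dtype=float).reshape(len(points), -1)
    query = np.asarray(query, dtype=float).reshape(-1)
    dists = np.linalg.norm(points - query, axis=1)
    return float(np.min(np.linalg.norm(errors, axis=1) + lipschitz * dists))
```

Under these assumptions, a trusted domain could be taken as the set of states where `error_bound` stays below a chosen threshold; a planner would then reject samples outside it. The bound is only as reliable as the Lipschitz estimate it is given.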