This paper addresses the online motion planning problem of mobile robots under complex high-level tasks. The robot motion is modeled as an uncertain Markov Decision Process (MDP) due to limited initial knowledge, while the task is specified as a Linear Temporal Logic (LTL) formula. The proposed framework enables the robot to explore and update the system model in a Bayesian way, while simultaneously optimizing the asymptotic cost of satisfying the complex temporal task. Theoretical guarantees are provided for the synthesized outgoing policy and safety policy. More importantly, instead of greedy exploration under the classic ergodicity assumption, a safe-return requirement is enforced such that the robot can always return to home states with high probability. The overall method is validated by numerical simulations.