This paper presents a method for learning logical task specifications and cost functions from demonstrations. Linear temporal logic (LTL) formulas are widely used to express complex objectives and constraints for autonomous systems, yet such specifications can be difficult to construct by hand. Instead, we consider demonstrated task executions, whose temporal logic structure and transition costs must be inferred by an autonomous agent. We employ a spectral learning approach to extract a weighted finite automaton (WFA) that approximates the unknown logic structure of the task. We then define a product between the WFA, which provides high-level task guidance, and a labeled Markov decision process (L-MDP), which models low-level control, and optimize a cost function that matches the demonstrator's behavior. We demonstrate that our method can generalize the execution of the inferred task specification to new environment configurations.
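To make the spectral learning step concrete, the following is a minimal sketch of the standard Hankel-matrix construction for recovering a WFA, not the paper's implementation. The alphabet, the `target` scoring function (a stand-in for scores estimated from demonstrations, here mimicking the LTL task "eventually a"), the prefix/suffix basis, and the rank are all assumptions chosen for illustration.

```python
import itertools
import numpy as np

ALPHABET = ("a", "b")  # hypothetical label set, not taken from the paper


def target(word):
    """Hypothetical task score: 1.0 if label 'a' ever occurs (like LTL 'F a')."""
    return 1.0 if "a" in word else 0.0


def words_up_to(max_len):
    """All words over ALPHABET with length <= max_len, starting with the empty word."""
    out = [""]
    for k in range(1, max_len + 1):
        out += ["".join(t) for t in itertools.product(ALPHABET, repeat=k)]
    return out


def spectral_wfa(f, prefixes, suffixes, rank):
    """Recover WFA parameters from Hankel blocks of f via a truncated SVD."""
    # Hankel matrix: H[p, s] = f(p + s)
    H = np.array([[f(p + s) for s in suffixes] for p in prefixes])
    _, _, Vt = np.linalg.svd(H, full_matrices=False)
    V = Vt[:rank].T                      # top right singular vectors, shape (|S|, rank)
    F = H @ V                            # forward factor, so that H ~= F @ V.T
    F_pinv = np.linalg.pinv(F)
    alpha0 = F[prefixes.index("")]       # initial weights: row of the empty prefix
    alpha_inf = F_pinv @ H[:, suffixes.index("")]  # final weights: empty-suffix column
    transitions = {}
    for sym in ALPHABET:
        # Shifted Hankel matrix: H_sym[p, s] = f(p + sym + s)
        H_sym = np.array([[f(p + sym + s) for s in suffixes] for p in prefixes])
        transitions[sym] = F_pinv @ H_sym @ V
    return alpha0, transitions, alpha_inf


def evaluate(wfa, word):
    """Compute alpha0^T A_{x1} ... A_{xk} alpha_inf for word = x1...xk."""
    alpha0, transitions, alpha_inf = wfa
    v = alpha0
    for sym in word:
        v = v @ transitions[sym]
    return float(v @ alpha_inf)


basis = words_up_to(1)  # prefixes and suffixes: "", "a", "b"
wfa = spectral_wfa(target, basis, basis, rank=2)
```

In this toy setting the Hankel matrix has rank 2, so a two-state WFA reproduces `target` on words outside the basis as well, up to floating-point error. With scores estimated from noisy demonstrations, the rank would instead be a tuning choice and the recovered automaton only an approximation.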