Data-driven control algorithms use observations of system dynamics to construct an implicit model for the purpose of control. In practice, however, data-driven techniques often require large sample sizes, which may be infeasible to collect in real-world scenarios where only limited observations of the system are available. Furthermore, purely data-driven methods often neglect useful a priori knowledge, such as approximate models of the system dynamics. We present a method to incorporate such prior knowledge into data-driven control algorithms using kernel embeddings, a nonparametric machine learning technique grounded in the theory of reproducing kernel Hilbert spaces. Our approach incorporates prior knowledge of the system dynamics as a bias term in the kernel learning problem. We formulate the biased learning problem as a least-squares problem with a regularization term informed by the dynamics, which admits an efficiently computable closed-form solution. Through numerical experiments, we empirically demonstrate that our approach improves sample efficiency and out-of-sample generalization over a purely data-driven baseline. We apply our method to control in a target tracking problem with nonholonomic dynamics, and to state prediction for a spring-mass-damper system and an F-16 aircraft.
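To make the idea of a bias term in a regularized least-squares problem concrete, the following is a minimal sketch, not the authors' implementation. It assumes one standard formulation: minimize (1/n) Σ ‖y_i − f(x_i)‖² + λ ‖f − f0‖² over an RKHS, where f0 is an approximate prior model. Under that assumption the closed-form solution is f(x) = f0(x) + k(x, X)(K + nλI)⁻¹(Y − f0(X)). The Gaussian kernel, the names `rbf_kernel` and `fit_biased_krr`, and the two-state linear system used as a stand-in for a spring-mass-damper are all hypothetical choices made for illustration.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def fit_biased_krr(X, Y, prior, lam=1e-2, sigma=1.0):
    """Kernel ridge regression regularized toward a prior model.

    Assumed objective: (1/n) sum ||y_i - f(x_i)||^2 + lam * ||f - prior||^2.
    The minimizer is the prior plus a kernel correction fit to the residuals.
    """
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    residual = Y - prior(X)                      # what the prior fails to explain
    alpha = np.linalg.solve(K + lam * n * np.eye(n), residual)

    def predict(X_new):
        return prior(X_new) + rbf_kernel(X_new, X, sigma) @ alpha

    return predict

# Hypothetical discrete-time linear system and a slightly wrong prior model.
A_true = np.array([[1.0, 0.1], [-0.2, 0.95]])
A_prior = np.array([[1.0, 0.1], [-0.15, 0.90]])
true_step = lambda X: X @ A_true.T
prior_step = lambda X: X @ A_prior.T
zero_prior = lambda X: np.zeros_like(X)          # purely data-driven baseline

rng = np.random.default_rng(0)
X_train = rng.uniform(-1.0, 1.0, size=(10, 2))   # deliberately few samples
Y_train = true_step(X_train)
X_test = rng.uniform(-2.0, 2.0, size=(200, 2))   # includes out-of-sample points

biased = fit_biased_krr(X_train, Y_train, prior_step)
baseline = fit_biased_krr(X_train, Y_train, zero_prior)

err = lambda f: np.mean(np.linalg.norm(f(X_test) - true_step(X_test), axis=1))
print("biased error:  ", err(biased))
print("baseline error:", err(baseline))
```

With a zero prior this reduces to ordinary kernel ridge regression. Far from the training data the kernel correction vanishes, so the biased estimate falls back to the prior model rather than to zero.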