Learning Stabilizing Controllers for Unstable Linear Quadratic Regulators from a Single Trajectory

The principal task in controlling dynamical systems is to ensure their stability. When the system is unknown, robust approaches are promising since they aim to stabilize a large set of plausible systems simultaneously. We study linear controllers under a quadratic cost model, also known as linear quadratic regulators (LQR). We present two different semi-definite programs (SDPs), each of which results in a controller that stabilizes all systems within an ellipsoidal uncertainty set. We further show that the feasibility conditions of the proposed SDPs are \emph{equivalent}. Using the derived robust controller syntheses, we propose an efficient data-dependent algorithm -- \textsc{eXploration} -- that with high probability quickly identifies a stabilizing controller. Our approach can be used to initialize existing algorithms that require a stabilizing controller as an input, while adding only a constant to the regret. We further propose different heuristics which empirically reduce the number of steps taken by \textsc{eXploration} and reduce the cost suffered while searching for a stabilizing controller.
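To make the term "stabilizing controller" concrete, the following is a minimal sketch, not the paper's SDP-based synthesis or the \textsc{eXploration} algorithm. It assumes discrete-time dynamics x_{t+1} = A x_t + B u_t with a linear feedback u_t = -K x_t, and it assumes A and B are known, which is exactly what the setting above does not grant. The matrices A, B, Q, R and the helper names `lqr_gain` and `spectral_radius` are hypothetical choices made for illustration. Under these assumptions, K is stabilizing when the spectral radius of A - B K is below 1.

```python
import numpy as np


def spectral_radius(M):
    """Largest eigenvalue magnitude of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(M))))


def lqr_gain(A, B, Q, R, iters=500):
    """Discrete-time LQR gain K (for u = -K x) via Riccati fixed-point iteration.

    Assumes (A, B) is stabilizable and Q is positive definite, so the
    iteration converges; this is an illustrative solver, not the paper's method.
    """
    P = Q.copy()
    for _ in range(iters):
        BtP = B.T @ P
        # K = (R + B^T P B)^{-1} B^T P A
        K = np.linalg.solve(R + BtP @ B, BtP @ A)
        # Riccati update: P <- Q + A^T P (A - B K)
        P = Q + A.T @ P @ (A - B @ K)
    return K


# Hypothetical open-loop unstable system (eigenvalues 1.2 and 1.1).
A = np.array([[1.2, 0.5],
              [0.0, 1.1]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)  # state cost (assumed)
R = np.eye(1)  # input cost (assumed)

K = lqr_gain(A, B, Q, R)
rho_open = spectral_radius(A)
rho_closed = spectral_radius(A - B @ K)

print("open-loop spectral radius:  ", rho_open)
print("closed-loop spectral radius:", rho_closed)
print("stabilizing:", rho_closed < 1.0)
```

In the unknown-system setting of the abstract, A and B are replaced by an uncertainty set estimated from data, and the controller must keep the closed-loop spectral radius below 1 for every system in that set rather than for a single nominal pair.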
