Changing conditions or environments can cause system dynamics to vary over time. To ensure optimal control performance, controllers should adapt to these changes. When the underlying cause and time of change are unknown, this adaptation must rely on online data. In this paper, we use time-varying Bayesian optimization (TVBO) to tune controllers online in changing environments, using appropriate prior knowledge of the control objective and its changes. Two properties are characteristic of many online controller tuning problems: First, the objective changes incrementally and lastingly because the system dynamics change, e.g., through wear and tear. Second, the optimization problem is convex in the tuning parameters. Current TVBO methods do not explicitly account for these properties; they over-explore the parameter space, which results in poor tuning performance and many unstable controllers. We propose a novel TVBO forgetting strategy using Uncertainty-Injection (UI), which incorporates the assumption of incremental and lasting changes. The control objective is modeled as a spatio-temporal Gaussian process (GP) with UI through a Wiener process in the temporal domain. Further, we explicitly model the convexity assumptions in the spatial dimension through GP models with linear inequality constraints. In numerical experiments, we show that our model outperforms the state-of-the-art method in TVBO, exhibiting reduced regret and fewer unstable parameter configurations.
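To make the idea of forgetting through Uncertainty-Injection concrete, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes a one-dimensional tuning parameter, a squared-exponential spatial kernel, and a temporal kernel of the form 1 + sigma_w2 * min(t, t'), which corresponds to an initial objective plus a Wiener-process drift. The function names, the hyperparameter values, and this specific kernel form are illustrative assumptions; the convexity constraints are not included.

```python
import numpy as np


def rbf_kernel(x1, x2, lengthscale=0.5):
    """Squared-exponential kernel over the (1-D) tuning parameter."""
    diff = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)


def wiener_kernel(t1, t2, sigma_w2=0.1):
    """Assumed temporal kernel: constant part plus Wiener-process covariance."""
    return 1.0 + sigma_w2 * np.minimum(t1[:, None], t2[None, :])


def posterior(x_obs, t_obs, y_obs, x_new, t_new, noise=1e-4):
    """GP posterior mean and variance under the product kernel k_S * k_T."""
    K = rbf_kernel(x_obs, x_obs) * wiener_kernel(t_obs, t_obs)
    K += noise * np.eye(len(x_obs))
    k_star = rbf_kernel(x_new, x_obs) * wiener_kernel(t_new, t_obs)
    k_ss = rbf_kernel(x_new, x_new) * wiener_kernel(t_new, t_new)
    mean = k_star @ np.linalg.solve(K, y_obs)
    cov = k_ss - k_star @ np.linalg.solve(K, k_star.T)
    return mean, np.diag(cov)


# One observation at x = 0, t = 0; predict at the same x now and 10 steps later.
mean, var = posterior(
    x_obs=np.array([0.0]),
    t_obs=np.array([0.0]),
    y_obs=np.array([1.0]),
    x_new=np.array([0.0, 0.0]),
    t_new=np.array([0.0, 10.0]),
)
print("posterior mean:", mean)
print("posterior variance:", var)
```

In this sketch the posterior mean at the observed parameter is the same at both prediction times, while the posterior variance grows with elapsed time. That is the behavior the assumption of incremental and lasting changes asks for: old data is trusted less as time passes, but the model does not revert to the prior mean.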