We present a discretized design that expounds an algorithm recently introduced in Gagliardi and Russo (2021) for synthesizing control policies from examples for constrained, possibly stochastic and nonlinear, systems. The example data may be noisy, need not fulfill the constraints, and may be collected from a system different from the one under control. For this discretized design, we discuss a number of properties and give a design pipeline. The design, which we term discrete fully probabilistic design, is benchmarked numerically on an example that involves controlling an inverted pendulum with actuation constraints, starting from data collected from a physically different pendulum that does not satisfy the system-specific actuation constraints.