Equipping Black-Box Policies with Model-Based Advice for Stable Nonlinear Control

Machine-learned black-box policies are ubiquitous for nonlinear control problems. Meanwhile, crude model information is often available for these problems from, e.g., linear approximations of nonlinear dynamics. We study the problem of equipping a black-box control policy with model-based advice for nonlinear control on a single trajectory. We first show a general negative result that a naive convex combination of a black-box policy and a linear model-based policy can lead to instability, even if the two policies are both stabilizing. We then propose an adaptive $\lambda$-confident policy, with a coefficient $\lambda$ indicating the confidence in a black-box policy, and prove its stability. With bounded nonlinearity, in addition, we show that the adaptive $\lambda$-confident policy achieves a bounded competitive ratio when a black-box policy is near-optimal. Finally, we propose an online learning approach to implement the adaptive $\lambda$-confident policy and verify its efficacy in case studies about the CartPole problem and a real-world electric vehicle (EV) charging problem with data bias due to COVID-19.
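The negative result about naive convex combinations can be illustrated with a toy linear example. The sketch below is not the paper's construction: the system matrices `A` and `B`, the two gains `K_model` and `K_blackbox`, and the helper names are all hypothetical values chosen only to show that the set of stabilizing linear gains is not convex. The "black-box" policy is stood in for by a linear gain purely for illustration, and the mixing coefficient is held fixed. The paper's adaptive rule for choosing $\lambda$ is not given in the abstract and is not reproduced here.

```python
import numpy as np

def spectral_radius(M):
    """Largest eigenvalue magnitude; below 1 means x_{t+1} = M x_t is stable."""
    return float(np.max(np.abs(np.linalg.eigvals(M))))

# Hypothetical toy system x_{t+1} = A x_t + B u_t (not taken from the paper).
A = np.zeros((2, 2))
B = np.eye(2)

# Two hypothetical linear policies u = -K x, each stabilizing on its own.
K_model = -np.array([[0.0, 3.0], [0.0, 0.0]])     # stands in for model-based advice
K_blackbox = -np.array([[0.0, 0.0], [3.0, 0.0]])  # stands in for a black-box policy

def blended_gain(lam):
    """Naive convex combination with a fixed confidence lam in the black-box gain."""
    return lam * K_blackbox + (1.0 - lam) * K_model

def closed_loop_radius(lam):
    """Spectral radius of the closed-loop matrix A - B K(lam)."""
    return spectral_radius(A - B @ blended_gain(lam))

# Closed-loop spectral radius at lam = 0 (model only), 0.5 (blend), 1 (black box only).
radii = {lam: closed_loop_radius(lam) for lam in (0.0, 0.5, 1.0)}
for lam, r in radii.items():
    print(f"lambda = {lam:.1f}: spectral radius = {r:.3f}")
```

In this toy case each policy alone gives a nilpotent closed loop (spectral radius 0), while the equal-weight blend gives the closed-loop matrix [[0, 1.5], [1.5, 0]], whose eigenvalues are ±1.5, so the blended system diverges. This is the kind of failure that motivates choosing the confidence coefficient adaptively rather than fixing it in advance.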


Related content

Black box


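In science, computing, and engineering, a black box is a device, system, or object that can be viewed in terms of its inputs and outputs (or transfer characteristics) without any knowledge of its internal workings. Its implementation is "opaque" (black). Almost anything can be referred to as a black box: a transistor, an engine, an algorithm, the human brain, an institution, or a government. To analyze something modeled as an open system with a typical "black-box approach", only the stimulus/response behavior is considered, in order to infer the (unknown) contents of the box. The usual representation of such a black-box system is a data flow diagram centered on the box. The opposite of a black box is a system whose inner components or logic are available for inspection, commonly called a white box (sometimes also a "clear box" or a "glass box").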
