We propose a demonstration-efficient strategy to compress a computationally expensive Model Predictive Controller (MPC) into a more computationally efficient representation based on a deep neural network and Imitation Learning (IL). By generating a Robust Tube variant (RTMPC) of the MPC and leveraging properties of the tube, we introduce a data augmentation method that achieves high demonstration-efficiency and is capable of compensating for the distribution shifts typically encountered in IL. Our approach opens the possibility of zero-shot transfer from a single demonstration collected in a nominal domain, such as a simulation or a robot in a lab/controlled environment, to a domain with bounded model errors/perturbations. Numerical and experimental evaluations performed on a trajectory tracking MPC for a quadrotor show that our method outperforms strategies commonly employed in IL, such as DAgger and Domain Randomization, in terms of demonstration-efficiency and robustness to perturbations unseen during training.