Model-free techniques have been successful at optimal control of complex systems, at the expense of copious amounts of data and computation. However, it is often desirable to obtain a control policy in a short period of time with minimal data use and computational burden. To this end, we use the Neural Fitted Q Iteration (NFQ) algorithm for steering position control of a golf cart, both on real hardware and in a simulated environment built from real-world interaction. The controller learns to apply a sequence of voltage signals in the presence of environmental uncertainties and inherent non-linearities that challenge the control task. We were able to increase the rate of successful control in under four minutes in simulation and in under 11 minutes on real hardware.