Because they are highly efficient and less dependent on weather, autonomous greenhouses offer an ideal way to meet the growing demand for fresh food. However, managers struggle to find appropriate control strategies for crop growth, because the decision space of the greenhouse control problem is astronomically large. An intelligent closed-loop control framework that generates control policies automatically is therefore highly desirable. As a powerful tool for optimal control, reinforcement learning (RL) algorithms can outperform human decision-making and can be integrated seamlessly into such a closed-loop framework. In complex real-world scenarios such as agricultural automation control, however, interaction with the environment is time-consuming and expensive, and applying RL algorithms raises two main challenges: sample efficiency and safety. Although model-based RL methods can greatly mitigate the sample efficiency problem in greenhouse control, the safety problem has received little attention. In this paper, we present a model-based robust RL framework for autonomous greenhouse control that addresses both challenges. Specifically, our framework uses an ensemble of environment models as a simulator to assist policy optimization, thereby addressing the low sample efficiency problem. To address the safety concern, we propose a sample dropout module that focuses training on worst-case samples, which helps improve the adaptability of the greenhouse planting policy in extreme cases. Experimental results demonstrate that our approach learns a more effective and more robust greenhouse planting policy than existing methods.
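The two ingredients named above, an ensemble of environment models and a step that keeps only worst-case samples, can be illustrated with a minimal sketch. This is not the paper's implementation: the one-dimensional toy dynamics, the set point of 1.0, the proportional policy, the `keep_ratio` parameter, and all function names are assumptions made purely for illustration.

```python
import math

def make_toy_model(gain):
    # Hypothetical stand-in for one learned environment model.
    # The reward penalizes distance from an assumed set point of 1.0.
    def step(state, action):
        next_state = state + gain * action
        reward = -abs(next_state - 1.0)
        return next_state, reward
    return step

def rollout_return(model, policy, state, horizon):
    # Sum of rewards when the policy is simulated inside one model.
    total = 0.0
    for _ in range(horizon):
        state, reward = model(state, policy(state))
        total += reward
    return total

def worst_case_indices(returns, keep_ratio):
    # Keep the lowest-return fraction of rollouts and drop the rest
    # (an assumed reading of "focusing on worst-case samples").
    n_keep = max(1, math.ceil(keep_ratio * len(returns)))
    order = sorted(range(len(returns)), key=lambda i: returns[i])
    return order[:n_keep]

# Ensemble of models that disagree about the actuator gain.
models = [make_toy_model(g) for g in (0.5, 0.8, 1.0, 1.2)]
policy = lambda s: 1.0 - s  # simple proportional controller

returns = [rollout_return(m, policy, 0.0, 5) for m in models]
kept = worst_case_indices(returns, keep_ratio=0.5)

# Objective averaged over the pessimistic subset only.
robust_objective = sum(returns[i] for i in kept) / len(kept)
mean_objective = sum(returns) / len(returns)
```

Under these assumptions, a policy optimizer would maximize `robust_objective` rather than `mean_objective`, so the policy is judged by the ensemble members in which it performs worst instead of by its average behavior.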