In this paper, genetic programming reinforcement learning (GPRL) is used to generate human-interpretable control policies for a Chylla-Haase polymerization reactor. Such continuous stirred-tank reactors (CSTRs) with jacket cooling are widely used in the chemical industry, in the production of fine chemicals, pigments, polymers, and medical products. Although CSTRs appear rather simple, controlling them in real-world applications is a challenging problem. GPRL uses already existing data from the reactor to generate, fully automatically, a set of optimized, simple control strategies, so-called policies, from which the domain expert can choose. These policies are white-box models of low complexity, which makes them easy to validate and implement in the target control system, e.g., SIMATIC PCS 7. Despite its low complexity, the automatically generated policy yields high performance in terms of reactor temperature control deviation, which we evaluate empirically on the original reactor template.