Artificial intelligence techniques are increasingly being applied to solve control problems, but often rely on black-box methods without transparent output generation. To improve the interpretability and transparency in control systems, models can be defined as white-box symbolic policies described by mathematical expressions. For better performance in partially observable and volatile environments, the symbolic policies are extended with memory represented by continuous-time latent variables, governed by differential equations. Genetic programming is used for optimisation, resulting in interpretable policies consisting of symbolic expressions. Our results show that symbolic policies with memory compare with black-box policies on a variety of control tasks. Furthermore, the benefit of the memory in symbolic policies is demonstrated on experiments where memory-less policies fall short. Overall, we present a method for evolving high-performing symbolic policies that offer interpretability and transparency, which lacks in black-box models.
翻译:暂无翻译