We present novel techniques for neuro-symbolic concurrent stochastic games, a recently proposed modelling formalism that represents a set of probabilistic agents operating in a continuous-space environment using a combination of neural-network-based perception mechanisms and traditional symbolic methods. To date, only zero-sum variants of the model have been studied, which is too restrictive when agents have distinct objectives. We formalise notions of equilibria for these models and present algorithms to synthesise them. Focusing on the finite-horizon setting and on (global) social-welfare subgame-perfect optimality, we consider two distinct types of equilibrium: Nash equilibria and correlated equilibria. We first show that an exact solution based on backward induction may yield arbitrarily bad equilibria. We then propose an approximation algorithm called frozen subgame improvement, which proceeds by iteratively solving nonlinear programs. We develop a prototype implementation and demonstrate the benefits of our approach on two case studies: an automated car-parking system and an aircraft collision avoidance system.