We present novel techniques for neuro-symbolic concurrent stochastic games, a recently proposed modelling formalism for representing a set of agents operating in a probabilistic, continuous-space environment using a combination of neural-network-based perception mechanisms and traditional symbolic methods. To date, only zero-sum variants of the model have been studied, which is too restrictive when agents have distinct objectives. We formalise notions of equilibria for these models and present algorithms to synthesise them. Focusing on the finite-horizon setting and on (global) social welfare subgame-perfect optimality, we consider two distinct types of equilibria: Nash equilibria and correlated equilibria. We first show that an exact solution based on backward induction may yield arbitrarily bad equilibria. We then propose an approximation algorithm called frozen subgame improvement, which proceeds through the iterative solution of nonlinear programs. We develop a prototype implementation and demonstrate the benefits of our approach on two case studies: an automated car-parking system and an aircraft collision avoidance system.
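To make the notion of a social welfare optimal Nash equilibrium concrete, the following is a minimal sketch for a one-shot, two-player matrix game restricted to pure strategies. It is not the frozen subgame improvement algorithm and does not model the continuous-space, neural-perception setting described above; the function names and the payoff matrices `U1` and `U2` are hypothetical and chosen only for illustration.

```python
from itertools import product


def pure_nash_equilibria(u1, u2):
    """Return all pure-strategy Nash equilibria (row, col) of a bimatrix game.

    u1[i][j] and u2[i][j] are the payoffs of player 1 and player 2 when
    player 1 picks row i and player 2 picks column j.
    """
    rows, cols = len(u1), len(u1[0])
    equilibria = []
    for i, j in product(range(rows), range(cols)):
        # Player 1 cannot gain by switching rows while the column stays j.
        row_is_best = all(u1[i][j] >= u1[k][j] for k in range(rows))
        # Player 2 cannot gain by switching columns while the row stays i.
        col_is_best = all(u2[i][j] >= u2[i][l] for l in range(cols))
        if row_is_best and col_is_best:
            equilibria.append((i, j))
    return equilibria


def social_welfare_optimal(u1, u2):
    """Pick the pure equilibrium maximising the sum of both payoffs.

    Returns None if the game has no pure-strategy equilibrium.
    """
    equilibria = pure_nash_equilibria(u1, u2)
    if not equilibria:
        return None
    return max(equilibria, key=lambda e: u1[e[0]][e[1]] + u2[e[0]][e[1]])


# Hypothetical coordination game with two pure equilibria of different welfare.
U1 = [[2, 0], [0, 1]]
U2 = [[2, 0], [0, 1]]

all_equilibria = pure_nash_equilibria(U1, U2)
best = social_welfare_optimal(U1, U2)
```

In this example both `(0, 0)` and `(1, 1)` are equilibria, but only `(0, 0)` maximises the sum of payoffs, which is the selection criterion the abstract refers to as social welfare optimality. Mixed and correlated equilibria, as well as multi-stage games, would need an optimisation-based formulation rather than this enumeration.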