Neuro-symbolic approaches to artificial intelligence, which combine neural networks with classical symbolic techniques, are growing in prominence, necessitating formal approaches to reason about their correctness. We propose a novel modelling formalism called neuro-symbolic concurrent stochastic games (NS-CSGs), which comprise probabilistic finite-state agents interacting in a shared continuous-state environment, observed through perception mechanisms implemented as neural networks. Since the environment state space is continuous, we focus on the class of NS-CSGs with Borel state spaces. We consider the problem of zero-sum discounted cumulative rewards and prove the existence of the value of NS-CSGs under Borel measurability and piecewise-constant restrictions on the components of the model. From an algorithmic perspective, existing methods to compute values and optimal strategies for CSGs focus on finite state spaces. We present, for the first time, implementable value iteration and policy iteration algorithms to solve a class of uncountable state space CSGs, namely NS-CSGs, and prove their convergence. Our approach works by exploiting the underlying game structures and then formulating piecewise linear or constant representations of the value functions and strategies of NS-CSGs. We illustrate our approach by applying a prototype implementation of value iteration to a dynamic vehicle parking case study.
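For orientation, the finite-state setting that the abstract contrasts with can be sketched as follows. This is a minimal illustration of classical Shapley-style value iteration for a zero-sum discounted concurrent stochastic game, not the NS-CSG algorithm itself, which works on a continuous state space using piecewise linear or constant representations. The sketch assumes exactly two actions per player so that each stage game has a closed-form value; all names (`matrix_game_value`, `value_iteration`, `states`, `reward`, `trans`) and the toy game are hypothetical.

```python
def matrix_game_value(m):
    """Value of a 2x2 zero-sum matrix game; the row player maximises."""
    (a, b), (c, d) = m
    maximin = max(min(a, b), min(c, d))  # best payoff the row player can guarantee with a pure action
    minimax = min(max(a, c), max(b, d))  # best cap the column player can guarantee with a pure action
    if maximin == minimax:
        return float(maximin)            # pure saddle point
    # No saddle point: both players mix, and the denominator is non-zero.
    return (a * d - b * c) / (a + d - b - c)


def value_iteration(states, reward, trans, beta=0.9, tol=1e-10, max_iter=10_000):
    """Iterate the Shapley operator until successive iterates differ by less than tol.

    reward[s][i][j] is the stage reward for row action i and column action j;
    trans[s][i][j] maps successor states to probabilities.
    """
    v = {s: 0.0 for s in states}
    for _ in range(max_iter):
        new_v = {}
        for s in states:
            # Stage game: immediate reward plus discounted expected continuation value.
            stage = [
                [
                    reward[s][i][j]
                    + beta * sum(p * v[t] for t, p in trans[s][i][j].items())
                    for j in range(2)
                ]
                for i in range(2)
            ]
            new_v[s] = matrix_game_value(stage)
        if max(abs(new_v[s] - v[s]) for s in states) < tol:
            return new_v
        v = new_v
    return v


# Hypothetical toy game: state "x" repeats a coordination-style stage game forever,
# and "sink" is an absorbing state with zero reward.
states = ["x", "sink"]
reward = {
    "x": [[2, 0], [0, 2]],
    "sink": [[0, 0], [0, 0]],
}
trans = {
    "x": [[{"x": 1.0}, {"x": 1.0}], [{"x": 1.0}, {"x": 1.0}]],
    "sink": [[{"sink": 1.0}, {"sink": 1.0}], [{"sink": 1.0}, {"sink": 1.0}]],
}

values = value_iteration(states, reward, trans, beta=0.9)
```

In this toy game the stage game at "x" has value 1 under uniform mixing, so the discounted value converges toward 1 / (1 - beta) = 10. With more than two actions per player, the closed-form stage-game solver would have to be replaced by a linear program.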