Neuro-symbolic approaches to artificial intelligence, which combine neural networks with classical symbolic techniques, are growing in prominence, necessitating formal approaches to reason about their correctness. We propose a novel modelling formalism called neuro-symbolic concurrent stochastic games (NS-CSGs), which comprise a set of probabilistic finite-state agents interacting in a shared continuous-state environment, observed through perception mechanisms implemented as neural networks. Since the environment state space is continuous, we focus on the class of NS-CSGs with Borel state spaces. We consider the problem of zero-sum discounted cumulative rewards and prove the existence of the value of NS-CSGs under Borel measurability and piecewise-constant restrictions on the components of the model. From an algorithmic perspective, existing methods to compute values and optimal strategies for CSGs focus on finite state spaces. We present, for the first time, implementable value iteration and policy iteration algorithms to solve a class of uncountable state space CSGs, namely NS-CSGs, and prove their convergence. Our approach works by exploiting the underlying game structures and then formulating piecewise linear or constant representations of the value functions and strategies of NS-CSGs. We validate the value iteration algorithm with a prototype implementation applied to a dynamic vehicle parking example.
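To make the value iteration idea concrete, here is a minimal sketch for the much simpler finite-state case that the abstract contrasts with. It is not the NS-CSG algorithm, which works with piecewise linear or constant value functions over a continuous state space. The sketch assumes two actions per player in every state, so that each stage matrix game can be solved in closed form. The function names, the data layout, and the two-state example game are all hypothetical and chosen only for illustration.

```python
from typing import List

Matrix = List[List[float]]


def matrix_game_value_2x2(m: Matrix) -> float:
    """Value of a 2x2 zero-sum matrix game in which the row player maximises."""
    (a, b), (c, d) = m
    maximin = max(min(a, b), min(c, d))  # best guaranteed payoff for the row player
    minimax = min(max(a, c), max(b, d))  # best guaranteed cap for the column player
    if maximin == minimax:
        # A pure saddle point exists, so no mixing is needed.
        return maximin
    # No saddle point: both players mix, and the value has a closed form.
    return (a * d - b * c) / (a + d - b - c)


def value_iteration(
    rewards: List[Matrix],
    transitions: List[List[List[List[float]]]],
    gamma: float,
    tol: float = 1e-10,
    max_iters: int = 10_000,
) -> List[float]:
    """Discounted zero-sum value iteration on a finite-state concurrent game.

    rewards[s][a1][a2] is the stage reward paid to player 1.
    transitions[s][a1][a2] is a probability distribution over next states.
    """
    n = len(rewards)
    values = [0.0] * n
    for _ in range(max_iters):
        new_values = []
        for s in range(n):
            # Stage game: immediate reward plus discounted expected continuation value.
            q = [
                [
                    rewards[s][a1][a2]
                    + gamma * sum(p * v for p, v in zip(transitions[s][a1][a2], values))
                    for a2 in range(2)
                ]
                for a1 in range(2)
            ]
            new_values.append(matrix_game_value_2x2(q))
        # Stop once the update changes the values by less than tol in sup norm.
        if max(abs(x - y) for x, y in zip(new_values, values)) < tol:
            return new_values
        values = new_values
    return values


# Hypothetical two-state game: state 0 is matching pennies and always moves to
# state 1; state 1 is absorbing and pays 1 regardless of the actions chosen.
rewards = [
    [[1.0, -1.0], [-1.0, 1.0]],
    [[1.0, 1.0], [1.0, 1.0]],
]
to_state_1 = [0.0, 1.0]
transitions = [
    [[to_state_1, to_state_1], [to_state_1, to_state_1]],
    [[to_state_1, to_state_1], [to_state_1, to_state_1]],
]
values = value_iteration(rewards, transitions, gamma=0.5)
```

With a discount factor of 0.5, the absorbing state is worth 1 / (1 - 0.5) = 2, and the matching-pennies state is worth 0.5 * 2 = 1, because its stage game has value 0 on top of the discounted continuation. In the continuous-state setting of the abstract, the per-state loop above is not available, which is why finite representations of the value function are needed.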