We propose a learning-based methodology to reconstruct private information held by a population of interacting agents in order to predict an exact outcome of the underlying multi-agent interaction process, here identified as a stationary action profile. We envision a scenario where an external observer, endowed with a learning procedure, is allowed to make queries and observe the agents' reactions through private action-reaction mappings, whose collective fixed point corresponds to a stationary profile. By adopting a query process that iteratively collects informative data and updates parametric estimates, we establish sufficient conditions on the asymptotic behavior of the proposed methodology under which, if it converges, it can only converge to a stationary action profile. This result has two main consequences: i) learning locally-exact surrogates of the action-reaction mappings allows the external observer to succeed in its prediction task, and ii) because we work under assumptions so general that a stationary profile is not even guaranteed to exist, the established sufficient conditions also act as certificates for the existence of such a profile. Extensive numerical simulations involving typical competitive multi-agent control and decision-making problems illustrate the practical effectiveness of the proposed learning-based approach.
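The query-and-update loop described above can be illustrated with a minimal sketch. It is not the method of the paper: the two scalar reaction maps in `react` are hypothetical, the affine surrogates and their secant-style update are assumptions chosen for brevity, and the closed-form fixed point only covers two agents. The point is only to show an observer that never sees the agents' mappings, yet stops exactly when the observed reactions reproduce the queried profile.

```python
import numpy as np


def react(x):
    """Hypothetical private reactions: agent i responds to the other agent's action."""
    return np.array([1.0 + 0.5 * np.tanh(x[1]), 0.5 - 0.3 * x[0]])


def predict_stationary_profile(query, x0=(0.0, 0.0), tol=1e-10, max_queries=50):
    """Observer loop: query, refit affine surrogates, move to their fixed point."""
    x_prev = np.asarray(x0, dtype=float)
    y_prev = query(x_prev)                 # first observed reactions
    slope = np.zeros(2)                    # surrogate i: y_i = slope_i * x_other + icpt_i
    icpt = y_prev - slope * x_prev[::-1]   # surrogate matches the first observation
    for n in range(1, max_queries + 1):
        # Fixed point of the two affine surrogates (closed form for two agents).
        x1 = (slope[0] * icpt[1] + icpt[0]) / (1.0 - slope[0] * slope[1])
        x = np.array([x1, slope[1] * x1 + icpt[1]])
        y = query(x)                       # agents react to the proposed profile
        if np.max(np.abs(y - x)) < tol:    # reactions reproduce the query: stationary
            return x, n
        # Secant update of each slope from the last two observations.
        dx = (x - x_prev)[::-1]            # change in the other agent's action
        ok = np.abs(dx) > 1e-12            # skip the update if the query barely moved
        slope[ok] = (y - y_prev)[ok] / dx[ok]
        icpt = y - slope * x[::-1]         # surrogate is exact at the latest query
        x_prev, y_prev = x, y
    raise RuntimeError("no convergence within the query budget")


x_star, n_queries = predict_stationary_profile(react)
print(x_star, n_queries)
```

Because each surrogate is refit to pass through the latest observation, it is exact at the current query point, which is a simple stand-in for the "locally-exact" property mentioned in the abstract. In this toy setting a stationary profile exists by construction; the sketch says nothing about the general case, where the paper's sufficient conditions are what certify existence.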