Sequential decision-making techniques hold great promise for improving the performance of many real-world systems, but computational complexity hampers their principled application. Influence-based abstraction aims to gain leverage by modeling local subproblems together with the 'influence' that the rest of the system exerts on them. While computing exact representations of such influence may be intractable, learning approximate representations offers a promising approach to enable scalable solutions. This paper investigates the performance of such approaches from a theoretical perspective. The primary contribution is the derivation of sufficient conditions on approximate influence representations that guarantee solutions with small value loss. In particular, we show that neural networks trained with the cross-entropy loss are well suited to learning approximate influence representations. Moreover, we provide a sample-based formulation of the bounds, which narrows the gap to practical applications. Finally, driven by our theoretical insights, we propose approximation-error estimators, which we show empirically correlate well with the value loss.
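To make the cross-entropy claim concrete, the following is a minimal illustrative sketch, not the paper's implementation: a small network trained with the cross-entropy loss to approximate a categorical influence distribution conditioned on a local-history feature vector. All names, dimensions, and the data are hypothetical placeholders.

```python
# Illustrative sketch (assumptions: discretized influence source variables,
# hypothetical feature/label data); not the architecture used in the paper.
import torch
import torch.nn as nn

class InfluencePredictor(nn.Module):
    """Maps a local-history feature vector to logits over the
    (discretized) influence source variables."""
    def __init__(self, history_dim: int, num_influence_values: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(history_dim, 64),
            nn.ReLU(),
            nn.Linear(64, num_influence_values),  # unnormalized logits
        )

    def forward(self, local_history: torch.Tensor) -> torch.Tensor:
        return self.net(local_history)

# Hypothetical training data: local-history features and the influence
# source values observed when simulating the global system.
histories = torch.randn(1024, 16)        # placeholder feature vectors
targets = torch.randint(0, 5, (1024,))   # placeholder influence labels

model = InfluencePredictor(history_dim=16, num_influence_values=5)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# Minimizing cross-entropy minimizes the KL divergence to the empirical
# influence distribution (up to a constant).
loss_fn = nn.CrossEntropyLoss()

for epoch in range(10):
    optimizer.zero_grad()
    loss = loss_fn(model(histories), targets)
    loss.backward()
    optimizer.step()
```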