Deep Reinforcement Learning (RL) powered by neural net approximation of the Q function has had enormous empirical success. While the theory of RL has traditionally focused on linear function approximation (or, more generally, eluder dimension) approaches, little is known about nonlinear RL with neural net approximation of the Q function. This is the focus of this work, where we study function approximation with two-layer neural networks (considering both ReLU and polynomial activation functions). Our first result is a computationally and statistically efficient algorithm in the generative model setting under completeness for two-layer neural networks. Our second result considers the same setting, but under only realizability of the neural net function class. Here, assuming deterministic dynamics, the sample complexity scales linearly in the algebraic dimension. In all cases, our results significantly improve upon what can be attained with linear (or eluder dimension) methods.
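For concreteness, a minimal sketch of the two-layer function class in question; the parameterization below is a standard one and should be read as an assumption, not as a definition taken from this abstract:

$$Q_\theta(s,a) \;=\; \sum_{i=1}^{m} b_i\, \sigma\!\big(w_i^\top \phi(s,a)\big), \qquad \theta = \{(b_i, w_i)\}_{i=1}^{m},$$

where $\sigma(u) = \max(u,0)$ in the ReLU case or $\sigma(u) = u^p$ in the polynomial case, and $\phi(s,a)$ is a feature embedding of the state-action pair.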