Value Function Papers - 专知

Value Functions

Provably Efficient Offline Reinforcement Learning with Trajectory-Wise Reward

Arxiv

0+ reads · April 19, 2023

Value Functions Factorization with Latent State Information Sharing in Decentralized Multi-Agent Policy Gradients

Arxiv

0+ reads · April 18, 2023

The error and perturbation bounds for the absolute value equations with some applications

Arxiv

0+ reads · April 19, 2023

Impossibility of Characterizing Distribution Learning -- a simple solution to a long-standing problem

Arxiv

0+ reads · April 18, 2023

Feasible Policy Iteration

Arxiv

0+ reads · April 18, 2023

Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization

Arxiv

0+ reads · April 17, 2023

Ensemble Value Functions for Efficient Exploration in Multi-Agent Reinforcement Learning

Arxiv

0+ reads · April 16, 2023

Simulating Gaussian vectors via randomized dimension reduction and PCA

Arxiv

0+ reads · April 14, 2023

Strategy Synthesis for Zero-Sum Neuro-Symbolic Concurrent Stochastic Games

Arxiv

0+ reads · April 12, 2023

Importance Sampling BRDF Derivatives

Arxiv

0+ reads · April 8, 2023

Optimal Goal-Reaching Reinforcement Learning via Quasimetric Learning

Arxiv

0+ reads · April 6, 2023

On the approximation of vector-valued functions by samples

Arxiv

0+ reads · April 6, 2023

Convergence of Batch Asynchronous Stochastic Approximation With Applications to Reinforcement Learning

Arxiv

0+ reads · April 3, 2023

Optimal Goal-Reaching Reinforcement Learning via Quasimetric Learning

Arxiv

0+ reads · April 3, 2023

Optimal Goal-Reaching Reinforcement Learning via Quasimetric Learning

Arxiv

0+ reads · April 5, 2023
