Surrogate-based quantification of policy uncertainty in generative flow networks

Generative flow networks sample complex, high-reward objects by sequential construction, according to a reward function. However, such reward functions are often estimated only approximately from noisy data, which leads to epistemic uncertainty in the learnt policy. We present an approach to quantify this uncertainty by constructing a surrogate model, a polynomial chaos expansion fitted to a small ensemble of trained flow networks. The surrogate learns the relationship between reward functions, parametrised in a low-dimensional space, and the probability distributions over actions at each step along a trajectory of the flow network. It can then be used for inexpensive Monte Carlo sampling to estimate the uncertainty in the policy given uncertain rewards. We illustrate the performance of our approach on discrete and continuous grid-worlds, symbolic regression, and a Bayesian structure learning task.
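To make the surrogate idea concrete, here is a minimal sketch under strong simplifying assumptions; it is not the authors' implementation. It assumes a single reward parameter `xi` with a standard normal distribution, and it replaces the ensemble of trained flow networks with a hypothetical closed-form function `toy_policy` that returns action probabilities at one state. A probabilists' Hermite basis is used because it is the standard polynomial chaos basis for Gaussian inputs; the names `fit_pce` and `predict_pce` are illustrative only.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermevander


def toy_policy(xi):
    """Hypothetical stand-in for a trained flow network.

    Maps reward parameter(s) xi to probabilities over 3 actions at one state.
    """
    xi = np.atleast_1d(xi)
    logits = np.stack([0.5 * xi, -0.3 * xi, np.zeros_like(xi)], axis=-1)
    logits -= logits.max(axis=-1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=-1, keepdims=True)


def fit_pce(xi_train, probs_train, degree):
    """Least-squares fit of Hermite expansion coefficients, one column per action."""
    design = hermevander(xi_train, degree)  # shape (n_train, degree + 1)
    coeffs, *_ = np.linalg.lstsq(design, probs_train, rcond=None)
    return coeffs  # shape (degree + 1, n_actions)


def predict_pce(coeffs, xi):
    """Evaluate the fitted expansion at new reward parameters."""
    return hermevander(xi, coeffs.shape[0] - 1) @ coeffs


rng = np.random.default_rng(0)
degree = 4

# Small "ensemble": 20 reward parameters, each paired with a policy output.
xi_train = rng.standard_normal(20)
coeffs = fit_pce(xi_train, toy_policy(xi_train), degree)

# Inexpensive Monte Carlo on the surrogate instead of retraining networks.
xi_mc = rng.standard_normal(100_000)
surrogate_probs = predict_pce(coeffs, xi_mc)
mean_probs = surrogate_probs.mean(axis=0)  # expected action probabilities
std_probs = surrogate_probs.std(axis=0)    # spread induced by reward uncertainty

print("mean action probabilities:", mean_probs)
print("std of action probabilities:", std_probs)
```

In this toy setting each training point costs almost nothing, whereas in the setting the abstract describes each point would correspond to a separately trained flow network, which is why a surrogate that needs only a small ensemble is attractive. Note that a polynomial surrogate does not by itself constrain its outputs to lie in [0, 1]; the sketch ignores that issue.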

