This paper builds bridges between two families of probabilistic algorithms: (hierarchical) variational inference (VI), which is typically used to model distributions over continuous spaces, and generative flow networks (GFlowNets), which have been used for distributions over discrete structures such as graphs. We demonstrate that, in certain cases, VI algorithms are equivalent to special cases of GFlowNets in the sense of equality of expected gradients of their learning objectives. We then point out the differences between the two families and show how these differences emerge experimentally. Notably, GFlowNets, which borrow ideas from reinforcement learning, are more amenable than VI to off-policy training without the cost of high gradient variance induced by importance sampling. We argue that this property of GFlowNets can provide advantages for capturing diversity in multimodal target distributions.
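To make the sense of "equality of expected gradients" concrete, here is a minimal sketch of the kind of identity involved, assuming the standard trajectory balance (TB) objective and GFlowNet notation (forward policy $P_F$, backward policy $P_B$, reward $R$, learned normalizer $Z_\theta$), none of which is defined in this abstract; the precise conditions and constant factors are those stated in the paper's formal results. With

$$\mathcal{L}_{\mathrm{TB}}(\tau;\theta) \;=\; \left(\log \frac{Z_\theta \,\prod_{t} P_F(s_{t+1}\mid s_t;\theta)}{R(x)\,\prod_{t} P_B(s_t\mid s_{t+1})}\right)^{\!2},$$

the claimed correspondence takes the form that, under on-policy sampling of trajectories $\tau \sim P_F(\cdot\,;\theta)$,

$$\mathbb{E}_{\tau\sim P_F(\cdot;\theta)}\!\left[\nabla_\theta\, \mathcal{L}_{\mathrm{TB}}(\tau;\theta)\right] \;\propto\; \nabla_\theta\, D_{\mathrm{KL}}\!\big(P_F(\tau;\theta)\,\big\|\,P_B(\tau)\big),$$

where $P_B(\tau) \propto R(x)\prod_{t} P_B(s_t\mid s_{t+1})$ is the reward-weighted backward trajectory distribution, i.e., the on-policy TB gradient matches the gradient of a reverse-KL variational objective. The advantage highlighted above is that the left-hand side remains a valid, low-variance learning signal when $\tau$ is instead drawn from an off-policy exploratory distribution, whereas estimating the right-hand side off-policy would require importance weights.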