Variational approximations are increasingly based on gradient-based optimization of expectations estimated by sampling. Handling discrete latent variables is then challenging because the sampling process is not differentiable. Continuous relaxations, such as the Gumbel-Softmax for categorical distributions, enable gradient-based optimization, but do not define a valid probability mass for discrete observations. In practice, selecting the amount of relaxation is difficult, and one needs to optimize an objective that does not align with the desired one, causing problems especially for models with strong, meaningful priors. We provide an alternative differentiable reparameterization for the categorical distribution by composing it as a mixture of discrete normalizing flows. It defines a proper discrete distribution, allows directly optimizing the evidence lower bound, and is less sensitive to the hyperparameter controlling relaxation.
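To make the relaxation issue concrete, the following is a minimal sketch of Gumbel-Softmax sampling; the function name and example logits are illustrative, not from the paper. Adding Gumbel(0, 1) noise to the logits and taking the argmax yields an exact categorical sample, while replacing the argmax with a temperature-tau softmax gives a differentiable but relaxed sample that lies inside the simplex rather than at a one-hot vertex:

```python
import numpy as np

def gumbel_softmax_sample(logits, tau, rng):
    """Draw a relaxed one-hot sample from a categorical distribution.

    argmax(logits + g), with g ~ Gumbel(0, 1), is an exact categorical
    sample; the softmax with temperature tau relaxes that argmax into a
    differentiable point on the probability simplex.
    """
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))  # Gumbel(0, 1) noise
    y = (logits + g) / tau
    y = np.exp(y - y.max())  # numerically stable softmax
    return y / y.sum()

rng = np.random.default_rng(0)
logits = np.log(np.array([0.2, 0.5, 0.3]))
sample = gumbel_softmax_sample(logits, tau=0.5, rng=rng)
# sample sums to 1 but is generally not exactly one-hot, so it is not a
# valid draw from the discrete distribution itself -- the mismatch the
# abstract refers to.
```

As tau decreases, samples approach one-hot vectors but gradients become high-variance; as tau increases, gradients are smoother but the sample strays further from any valid discrete outcome, which is why tuning this hyperparameter is delicate.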