Many applications of generative models rely on the marginalization of their high-dimensional output probability distributions. Normalization functions that yield sparse probability distributions can make exact marginalization more computationally tractable. However, sparse normalization functions usually require alternative loss functions for training, since the log-likelihood is undefined for sparse probability distributions. Furthermore, sparse normalization functions often collapse the multimodality of distributions. In this work, we present $\textit{ev-softmax}$, a sparse normalization function that preserves the multimodality of probability distributions. We derive its properties, including its closed-form gradient, and introduce a continuous family of approximations to $\textit{ev-softmax}$ that have full support and can be trained with probabilistic loss functions such as negative log-likelihood and Kullback-Leibler divergence. We evaluate our method on a variety of generative models, including variational autoencoders and auto-regressive architectures. Our method outperforms existing dense and sparse normalization techniques in distributional accuracy. We demonstrate that $\textit{ev-softmax}$ successfully reduces the dimensionality of probability distributions while maintaining multimodality.
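To make the training issue concrete, the following minimal NumPy sketch illustrates a generic threshold-based sparse normalization (the `thresholded_softmax` helper and its mean-valued threshold are illustrative assumptions, not the paper's exact $\textit{ev-softmax}$ definition): entries below a data-dependent threshold receive exactly zero probability, so the negative log-likelihood of any target outside the resulting support is infinite, which is why sparse normalization functions typically need alternative training losses.

```python
import numpy as np

def thresholded_softmax(z, threshold_fn=np.mean):
    """Illustrative threshold-based sparse normalization (not the paper's
    exact ev-softmax rule): entries of z below a data-dependent threshold
    get exactly zero probability; the rest are renormalized via softmax."""
    z = np.asarray(z, dtype=float)
    keep = z >= threshold_fn(z)               # support of the sparse distribution
    exp_z = np.where(keep, np.exp(z - z.max()), 0.0)
    return exp_z / exp_z.sum()

# Sparse outputs break the log-likelihood: if the target class falls outside
# the support, log p(target) = log 0 is undefined (negative infinity).
p = thresholded_softmax([2.0, 1.5, -3.0, 0.1])
print(p)                                       # some entries are exactly 0
target = int(np.argmin(p))                     # a class outside the support
with np.errstate(divide="ignore"):
    print(-np.log(p[target]))                  # inf: NLL cannot be used directly
```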