Estimating the gradients of stochastic nodes is one of the crucial research questions in the deep generative modeling community, since it enables gradient descent optimization of neural network parameters. This estimation problem becomes more complex when the stochastic nodes are discrete, because pathwise derivative techniques cannot be applied. Hence, stochastic gradient estimation for discrete distributions requires either a score function method or a continuous relaxation of the discrete random variables. This paper proposes a general version of the Gumbel-Softmax estimator with continuous relaxation, which can relax more diverse types of discrete distributions beyond the categorical and Bernoulli cases. In detail, we utilize the truncation of discrete random variables and the Gumbel-Softmax trick with a linear transformation for the relaxed reparameterization. The proposed approach enables a relaxed discrete random variable to be reparameterized and backpropagated through a large-scale stochastic computational graph. Our experiments consist of (1) synthetic data analyses, which show the efficacy of our methods; and (2) applications to a VAE and a topic model, which demonstrate the value of the proposed estimation in practice.
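The abstract builds on the standard Gumbel-Softmax trick for relaxing a categorical variable. As a minimal sketch (this illustrates only the basic trick, not the paper's generalization via truncation and linear transformation), a relaxed one-hot sample is obtained by perturbing the category logits with Gumbel noise and applying a temperature-controlled softmax:

```python
import numpy as np

def gumbel_softmax_sample(logits, tau=0.5, rng=None):
    """Draw one relaxed categorical sample via the Gumbel-Softmax trick.

    logits : unnormalized log-probabilities of the categories.
    tau    : temperature; smaller values push samples closer to one-hot.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Gumbel(0, 1) noise: g = -log(-log(u)), u ~ Uniform(0, 1)
    u = rng.uniform(size=np.shape(logits))
    g = -np.log(-np.log(u))
    # Softmax of the perturbed logits; differentiable w.r.t. the logits,
    # which is what allows backpropagation through the stochastic node.
    z = (np.asarray(logits) + g) / tau
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

sample = gumbel_softmax_sample(np.log([0.2, 0.3, 0.5]))
```

Because the sample is a deterministic, differentiable function of the logits given the noise, gradients flow to the logits while the output approaches a one-hot vector as `tau` shrinks.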