We propose a general and scalable approximate sampling strategy for probabilistic models with discrete variables. Our approach uses gradients of the likelihood function with respect to its discrete inputs to propose updates in a Metropolis-Hastings sampler. We show empirically that this approach outperforms generic samplers in a number of difficult settings including Ising models, Potts models, restricted Boltzmann machines, and factorial hidden Markov models. We also demonstrate the use of our improved sampler for training deep energy-based models on high-dimensional discrete data. This approach outperforms variational auto-encoders and existing energy-based models. Finally, we give bounds showing that our approach is near-optimal in the class of samplers that propose local updates.
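The core idea can be sketched concretely: for binary variables, a first-order Taylor expansion of the log-likelihood estimates the change from flipping each bit, and these estimates weight a single-flip proposal inside a Metropolis-Hastings correction. Below is a minimal illustrative sketch on a toy quadratic log-probability; the model, its parameters, and all function names are hypothetical examples, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy log-probability f(x) = 0.5 * x^T W x + b^T x over x in {0,1}^d.
# W and b are illustrative parameters chosen at random.
d = 8
W = rng.normal(scale=0.5, size=(d, d))
W = (W + W.T) / 2          # symmetric couplings
np.fill_diagonal(W, 0.0)   # no self-interaction
b = rng.normal(size=d)

def f(x):
    return 0.5 * x @ W @ x + b @ x

def grad_f(x):
    # Exact gradient of the quadratic f (W is symmetric).
    return W @ x + b

def flip_scores(x):
    # First-order estimate of f(x with bit i flipped) - f(x):
    # flipping bit i changes x_i by (1 - 2*x_i).
    return (1.0 - 2.0 * x) * grad_f(x)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def gradient_mh_step(x):
    # Propose flipping one bit, weighted by the Taylor estimates.
    q_fwd = softmax(flip_scores(x) / 2.0)
    i = rng.choice(d, p=q_fwd)
    x_new = x.copy()
    x_new[i] = 1.0 - x_new[i]
    q_rev = softmax(flip_scores(x_new) / 2.0)
    # Metropolis-Hastings acceptance corrects the approximation.
    log_alpha = f(x_new) - f(x) + np.log(q_rev[i]) - np.log(q_fwd[i])
    if np.log(rng.random()) < log_alpha:
        return x_new
    return x

x = rng.integers(0, 2, size=d).astype(float)
for _ in range(1000):
    x = gradient_mh_step(x)
```

Because the proposal concentrates on bits whose flip is estimated to raise the likelihood, the chain tends to accept far more moves than a uniform single-flip sampler, while the acceptance step keeps the target distribution exact.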