Several approximate inference algorithms have been proposed to minimize an alpha-divergence between an approximating distribution and a target distribution. Many of these algorithms introduce bias, the magnitude of which becomes problematic in high dimensions. Other algorithms are unbiased. These often seem to suffer from high variance, but little is rigorously known. In this work we study unbiased methods for alpha-divergence minimization through the Signal-to-Noise Ratio (SNR) of the gradient estimator. We study several representative scenarios where strong analytical results are possible, such as fully-factorized or Gaussian distributions. We find that when alpha is not zero, the SNR worsens exponentially in the dimensionality of the problem. This casts doubt on the practicality of these methods. We empirically confirm these theoretical results.
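To make the SNR notion concrete, the following is a minimal sketch on a toy problem that is an assumption of this example, not one of the paper's own scenarios: the target is p = N(0, I_d), the approximation is q = N(m·1, I_d) with a single shared mean parameter m, and the objective is E_q[(p/q)^alpha] with 0 < alpha < 1, differentiated with the reparameterization z = m + eps. The function names (`grad_samples`, `empirical_snr`, `exact_snr`) are hypothetical, and the closed form in `exact_snr` was derived for this toy case only. It contains the factor exp(-alpha² m² d), so the SNR here eventually decays exponentially in the dimension d.

```python
import numpy as np


def grad_samples(d, m=1.0, alpha=0.5, n=50_000, seed=0):
    """Reparameterized gradient samples of E_q[(p/q)^alpha] w.r.t. m.

    Toy assumption: p = N(0, I_d), q = N(m * 1, I_d), z = m + eps.
    """
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal((n, d))
    s = eps.sum(axis=1)
    # log p(z) - log q(z) evaluated at z = m + eps
    log_w = -m * s - 0.5 * d * m**2
    # d/dm of exp(alpha * log_w), taken through the reparameterization
    return np.exp(alpha * log_w) * (-alpha) * (s + d * m)


def empirical_snr(d, **kwargs):
    """Monte Carlo estimate of |E[g]| / sqrt(Var[g])."""
    g = grad_samples(d, **kwargs)
    return abs(g.mean()) / g.std()


def exact_snr(d, m=1.0, alpha=0.5):
    """Closed-form SNR for this toy problem only (derived for the sketch)."""
    # r = E[g]^2 / E[g^2]; note the exp(-alpha^2 m^2 d) factor
    r = ((1 - alpha) ** 2 * d * m**2 * np.exp(-(alpha**2) * m**2 * d)
         / (1 + d * m**2 * (1 - 2 * alpha) ** 2))
    return np.sqrt(r / (1 - r))


if __name__ == "__main__":
    for d in (1, 4, 16, 64):
        print(d, exact_snr(d), empirical_snr(d))
```

In this toy setting the Monte Carlo estimate tracks the closed form at small d. At larger d the gradient's variance is dominated by very rare samples, so the empirical variance tends to be underestimated and the empirical SNR should be read with caution.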