We propose a fast method with statistical guarantees for learning an exponential family density model where the natural parameter is in a reproducing kernel Hilbert space, and may be infinite-dimensional. The model is learned by fitting the derivative of the log density, the score, thus avoiding the need to compute a normalization constant. Our approach improves the computational efficiency of an earlier solution by using a low-rank, Nystr\"om-like solution. The new solution retains the consistency and convergence rates of the full-rank solution (exactly in Fisher distance, and nearly in other distances), with guarantees on the degree of cost and storage reduction. We evaluate the method in experiments on density estimation and in the construction of an adaptive Hamiltonian Monte Carlo sampler. Compared to an existing score learning approach using a denoising autoencoder, our estimator is empirically more data-efficient when estimating the score, runs faster, and has fewer parameters (which can be tuned in a principled and interpretable way), in addition to providing statistical guarantees.
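The core idea the abstract relies on, score matching (fitting the derivative of the log density so the normalizer never appears), can be illustrated in a toy one-dimensional case. This is a minimal sketch, not the paper's kernel/Nyström estimator: the model score is simply linear, s(x) = a*x + b, which corresponds to a Gaussian-family fit, and the score-matching objective is minimized in closed form.

```python
# Toy sketch of score matching (Hyvarinen-style objective): fit the score
# s(x) = d/dx log p(x) with a linear model s(x) = a*x + b, avoiding any
# normalization constant. NOT the paper's infinite-dimensional estimator.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 2.0, 1.5
x = rng.normal(mu, sigma, size=100_000)

# Score-matching objective: J(a, b) = E[ 0.5 * s(x)^2 + s'(x) ]
#                                   = E[ 0.5 * (a*x + b)^2 + a ].
# Setting the gradients w.r.t. a and b to zero gives a linear system:
#   a * E[x^2] + b * E[x] = -1
#   a * E[x]   + b        =  0
m1, m2 = x.mean(), (x ** 2).mean()
a, b = np.linalg.solve([[m2, m1], [m1, 1.0]], [-1.0, 0.0])

# For N(mu, sigma^2) the true score is -(x - mu) / sigma^2, so the fit
# should recover a close to -1/sigma^2 and b close to mu/sigma^2.
print(a, b)
```

With enough samples the closed-form solution recovers the true Gaussian score parameters; the paper's contribution is making the analogous fit tractable and statistically guaranteed when the natural parameter lives in an (infinite-dimensional) RKHS, via a low-rank Nyström-like solution.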