In many areas of applied statistics and machine learning, generating an arbitrary number of independent and identically distributed (i.i.d.) samples from a given distribution is a key task. When the distribution is known only through evaluations of the density, current methods either scale badly with the dimension or require very involved implementations. Instead, we take a two-step approach by first modeling the probability distribution and then sampling from that model. We use the recently introduced class of positive semi-definite (PSD) models, which have been shown to be efficient for approximating probability densities. We show that these models can approximate a large class of densities concisely using few evaluations, and present a simple algorithm to effectively sample from these models. We also present preliminary empirical results to illustrate our assertions.
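The two-step approach described in the abstract (model the density, then sample from the model) can be illustrated with a minimal one-dimensional sketch. It assumes a Gaussian-kernel PSD model of the form f(x) = Σ_ij A_ij k(x, x_i) k(x, x_j) with A positive semi-definite, which is the form we understand PSD models to take; the abstract itself gives no formulas. Everything else here is a hypothetical simplification chosen for illustration and is not the algorithm the abstract refers to: the function names, the example target density, the rank-one fit A = c cᵀ obtained by regressing on the square root of the density, and the grid-based inverse-CDF sampler.

```python
import math
import numpy as np

erf = np.vectorize(math.erf)  # NumPy has no erf; wrap the standard-library one


def gauss_kernel(x, centers, eta):
    """k(x, x_i) = exp(-eta * (x - x_i)^2) for every x and every center."""
    return np.exp(-eta * (x[:, None] - centers[None, :]) ** 2)


def fit_rank_one_psd(p, centers, eta, ridge=1e-8):
    """Step 1 (simplified): fit g ~ sqrt(p) at the centers, then A = c c^T."""
    K = gauss_kernel(centers, centers, eta)
    c = np.linalg.solve(K + ridge * np.eye(len(centers)), np.sqrt(p(centers)))
    return np.outer(c, c)  # rank one, hence positive semi-definite


def psd_model(x, A, centers, eta):
    """f(x) = k(x)^T A k(x), non-negative whenever A is PSD."""
    Kx = gauss_kernel(x, centers, eta)
    return np.einsum("ni,ij,nj->n", Kx, A, Kx)


def psd_cdf(t, A, centers, eta):
    """Closed-form normalised CDF of the 1-D Gaussian PSD model."""
    diff = centers[:, None] - centers[None, :]
    mid = 0.5 * (centers[:, None] + centers[None, :])
    # k(x, xi) k(x, xj) = exp(-eta/2 (xi - xj)^2) * exp(-2 eta (x - mid_ij)^2)
    W = A * np.exp(-0.5 * eta * diff ** 2)
    partial = 0.5 * (1.0 + erf(math.sqrt(2.0 * eta) * (t[:, None, None] - mid)))
    # The common factor sqrt(pi / (2 eta)) cancels in the normalisation.
    return np.einsum("ij,nij->n", W, partial) / W.sum()


def sample_psd(n, A, centers, eta, rng, lo=-10.0, hi=10.0, grid_size=1001):
    """Step 2 (simplified): inverse-CDF sampling on a tabulated CDF."""
    grid = np.linspace(lo, hi, grid_size)
    cdf = np.maximum.accumulate(psd_cdf(grid, A, centers, eta))
    cdf = (cdf - cdf[0]) / (cdf[-1] - cdf[0])
    cdf, idx = np.unique(cdf, return_index=True)  # strictly increasing for interp
    return np.interp(rng.random(n), cdf, grid[idx])


def target(x):
    """Hypothetical unnormalised bimodal density, known only via evaluations."""
    return np.exp(-0.5 * ((x + 2.0) / 0.7) ** 2) + 0.5 * np.exp(-0.5 * ((x - 2.0) / 0.7) ** 2)


centers = np.linspace(-5.0, 5.0, 17)  # points where the density is evaluated
eta = 2.0                             # kernel bandwidth parameter (assumed)
A = fit_rank_one_psd(target, centers, eta)
rng = np.random.default_rng(0)
samples = sample_psd(2000, A, centers, eta, rng)
```

The sketch uses only 17 density evaluations to build the model, after which any number of i.i.d. samples can be drawn without touching the target again. The closed-form CDF relies on the product of two Gaussian kernels being another Gaussian, which is what makes integrals of this kind of model tractable; in higher dimensions a different sampling scheme would be needed.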