We study two adaptive importance sampling schemes for estimating the probability of a rare event in the high-dimensional regime $d \to \infty$, where $d$ denotes the dimension. The first scheme is the prominent cross-entropy (CE) method; the second, motivated by recent results, uses as auxiliary distribution a projection of the optimal auxiliary distribution onto a lower-dimensional subspace. Both schemes rely on two samples: the first to learn the auxiliary distribution, and the second, drawn from the learned distribution, to perform the final probability estimation. Contrary to the common belief that the sample size must grow exponentially in the dimension for the estimator to be consistent and to avoid the weight-degeneracy phenomenon, we find that a polynomial sample size in the first learning step suffices. We prove this result under the assumption that the sought probability is bounded away from 0. For CE, we provide insight into the polynomial growth rate, which remains implicit. In contrast, we study the second scheme in a simple computational framework in which samples from the conditional distribution are assumed to be available. This makes it possible to show that the sample size only needs to grow like $rd$, where $r$ is the effective dimension of the projection, which highlights the potential benefits of these projection methods.