We study two adaptive importance sampling schemes for estimating the probability of a rare event in the high-dimensional regime where the dimension $d$ tends to infinity. The first scheme, motivated by recent results, seeks to use as auxiliary distribution a projection of the optimal auxiliary distribution (optimal among Gaussian distributions, in the sense of the Kullback--Leibler divergence); the second is the well-known cross-entropy method. Both schemes use two samples: the first to learn the auxiliary distribution and the second, drawn from the learnt distribution, to perform the final probability estimation. Contrary to the common belief that the sample size must grow exponentially in the dimension for the estimator to be consistent and for the weight degeneracy phenomenon to be avoided, we find that a polynomial sample size in the first learning step suffices. We prove this result assuming that the sought probability is bounded away from $0$. For the first scheme, we show that the sample size need only grow like $rd$, where $r$ is the effective dimension of the projection; for the cross-entropy method, the polynomial growth rate remains implicit, although we provide insight into its value. In addition to consistency, we prove that in the regimes studied the importance sampling weights do not degenerate.
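To make the two-stage procedure concrete, here is a minimal sketch of the cross-entropy variant with the nominal distribution as initial proposal; the notation ($\varphi$ for the nominal density, $A$ for the rare event, $\{g_\theta\}$ for the parametric Gaussian family, $n_1$ and $n_2$ for the two sample sizes) is illustrative and not taken from the abstract. A first sample $Y_1, \dots, Y_{n_1} \overset{\text{iid}}{\sim} \varphi$ is used to fit the auxiliary distribution, and a second sample drawn from it yields the importance sampling estimate:
$$\hat\theta \in \operatorname*{arg\,max}_{\theta} \; \frac{1}{n_1} \sum_{j=1}^{n_1} \mathbf{1}_A(Y_j) \log g_\theta(Y_j), \qquad \hat p = \frac{1}{n_2} \sum_{i=1}^{n_2} \mathbf{1}_A(X_i)\, \frac{\varphi(X_i)}{g_{\hat\theta}(X_i)}, \quad X_i \overset{\text{iid}}{\sim} g_{\hat\theta}.$$
For a Gaussian family the maximizer is explicit: $g_{\hat\theta}$ has the empirical mean and covariance of the first-stage samples falling in $A$. In this notation, the claim above is that $n_1$ may grow only polynomially in $d$ (like $rd$ for the projection scheme) while $\hat p$ remains consistent and the weights $\varphi/g_{\hat\theta}$ do not degenerate.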