The extremal dependence structure of a regularly varying $d$-dimensional random vector can be described by its angular measure. The standard nonparametric estimator of this measure is the empirical measure of the observed angles of the $k$ random vectors with largest norm, for a suitably chosen number $k$. Due to the curse of dimensionality, for moderate or large $d$, this estimator is often inaccurate. If the angular measure is concentrated in the vicinity of a lower-dimensional subspace, then first projecting the data onto a lower-dimensional subspace obtained by a principal component analysis of the angles of the extreme observations can substantially improve the performance of the estimator. We derive the asymptotic behavior of such PCA projections and the resulting excess risk. In particular, it is shown that, under mild conditions, the excess risk (as a function of $k$) decreases much faster than suggested by the empirical risk bounds obtained in \cite{DS21}. Moreover, functional limit theorems for local empirical processes of the (empirical) reconstruction error of projections, uniformly over neighborhoods of the true optimal projection, are established. Based on these asymptotic results, we propose a data-driven method to select the dimension of the projection space. Finally, the finite-sample performance of the resulting estimators is examined in a simulation study.
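The estimation procedure summarized above can be sketched in a few lines: select the $k$ observations with largest norm, form their angles (unit vectors), and reconstruct those angles from their rank-$p$ PCA projection. The following Python sketch is purely illustrative; the function name, the choice of heavy-tailed toy data, and the specific values of $k$ and $p$ are assumptions, not the paper's implementation.

```python
import numpy as np

def pca_projected_angles(X, k, p):
    """Illustrative sketch (not the paper's code) of PCA-projected angles.

    X : (n, d) array of observations
    k : number of largest-norm observations treated as extreme
    p : dimension of the projection space
    Returns the (k, d) angles reconstructed from their rank-p PCA projection.
    """
    norms = np.linalg.norm(X, axis=1)
    idx = np.argsort(norms)[-k:]          # indices of the k largest norms
    angles = X[idx] / norms[idx, None]    # angles (unit vectors) of the extremes
    mean = angles.mean(axis=0)
    centered = angles - mean
    # principal components of the extreme angles via SVD
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    V = Vt[:p].T                          # (d, p) orthonormal basis
    # project onto the p-dimensional principal subspace and reconstruct
    return mean + centered @ V @ V.T

rng = np.random.default_rng(0)
X = rng.standard_cauchy((1000, 5))        # heavy-tailed toy sample (assumption)
recon = pca_projected_angles(X, k=50, p=2)
```

The empirical angular measure on the projected space is then the empirical measure of the rows of `recon` (after renormalization); the paper's data-driven method selects `p` from the reconstruction error.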