Among the variety of statistical intervals, highest-density regions (HDRs) stand out for their ability to effectively summarize a distribution or sample, unveiling its distinctive and salient features. An HDR represents the minimum size set that satisfies a certain probability coverage, and current methods for their computation require knowledge or estimation of the underlying probability distribution or density $f$. In this work, we illustrate a broader framework for computing HDRs, which generalizes the classical density quantile method introduced in the seminal paper of Hyndman (1996). The framework is based on neighbourhood measures, i.e., measures that preserve the order induced in the sample by $f$, and include the density $f$ as a special case. We explore a number of suitable distance-based measures, such as the $k$-nearest neighborhood distance, and some probabilistic variants based on copula models. An extensive comparison is provided, showing the advantages of the copula-based strategy, especially in those scenarios that exhibit complex structures (e.g., multimodalities or particular dependencies). Finally, we discuss the practical implications of our findings for estimating HDRs in real-world applications.
翻译:暂无翻译