Probability distributions for analog-to-target distances

Some properties of chaotic dynamical systems can be probed through features of recurrences, also called analogs. In practice, analogs are nearest neighbours of the state of a system, taken from a large database called the catalog. Analogs have been used in many atmospheric applications, including forecasting, downscaling, predictability estimation, and attribution of extreme events. The distances from the analogs to the target state determine the performance of analog applications. These distances can be viewed as random variables, and their probability distributions can be related to the catalog size and to properties of the system under study. A few studies have focused on the first moments of return-time statistics for the best analog, given a prescribed maximum distance from this analog to the target state. However, for practical use and to reduce estimation variance, applications usually require not just one but many analogs. In this paper, we evaluate, from a theoretical standpoint and with numerical experiments, the probability distributions of the $K$-best analog-to-target distances. We show that dimensionality affects both the catalog size needed to find good analogs and the relative means and variances of the $K$-best analog-to-target distances. Our results are based on recently developed tools from dynamical systems theory. These findings are illustrated with numerical simulations of a well-known chaotic dynamical system and with 10 m wind reanalysis data in north-west France. A practical application of our derivations to objective-based dimension reduction is shown using the same reanalysis data.
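As an illustration of the quantities discussed above, the following minimal sketch computes the $K$-best analog-to-target distances for a synthetic catalog. It is not the method of the paper: the Gaussian catalog, the Euclidean distance, the function name `k_best_analog_distances`, and the chosen dimensions and catalog sizes are all assumptions made for illustration only.

```python
import numpy as np


def k_best_analog_distances(catalog, target, k):
    """Return the k smallest Euclidean distances from target to catalog rows, in ascending order."""
    # Distance from the target state to every state stored in the catalog.
    dists = np.linalg.norm(catalog - target, axis=1)
    # np.partition places the k smallest values in the first k slots (unordered).
    nearest = np.partition(dists, k - 1)[:k]
    return np.sort(nearest)


if __name__ == "__main__":
    # Hypothetical setup: an i.i.d. Gaussian catalog stands in for a real
    # system trajectory, and the target is the origin.
    rng = np.random.default_rng(0)
    k = 5
    for dim in (2, 10):
        target = np.zeros(dim)
        for size in (1_000, 100_000):
            catalog = rng.standard_normal((size, dim))
            d = k_best_analog_distances(catalog, target, k)
            print(f"dim={dim:2d} size={size:6d} distances={np.round(d, 3)}")
```

Running the script for several dimensions and catalog sizes gives a rough, empirical feel for how the distances of the $K$ best analogs depend on both quantities; a real application would replace the synthetic catalog with states of the system under study.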
