The striking fractal geometry of strange attractors underscores the generative nature of chaos: like probability distributions, chaotic systems can be repeatedly measured to produce arbitrarily detailed information about the underlying attractor. Chaotic systems thus pose a unique challenge to modern statistical learning techniques, while retaining quantifiable mathematical properties that make them controllable and interpretable as benchmarks. Here, we present a growing database currently comprising 131 known chaotic dynamical systems spanning fields such as astrophysics, climatology, and biochemistry. Each system is paired with precomputed multivariate and univariate time series. Our dataset is comparable in scale to existing static time series databases; however, our systems can be re-integrated to produce additional datasets of arbitrary length and granularity. Our dataset is annotated with known mathematical properties of each system, and we perform feature analysis to broadly categorize the diverse dynamics present across the collection. Chaotic systems inherently challenge forecasting models, and across extensive benchmarks we correlate forecasting performance with the degree of chaos present. We also exploit the unique generative properties of our dataset in several proof-of-concept experiments: surrogate transfer learning to improve time series classification, importance sampling to accelerate model training, and benchmarking symbolic regression algorithms.
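The claim that a chaotic system can be re-integrated at arbitrary length and granularity can be illustrated with a minimal sketch. It assumes the classic Lorenz system with its standard parameters as a familiar stand-in, and uses a hand-written fourth-order Runge-Kutta integrator. The function names (`lorenz`, `rk4_step`, `make_trajectory`) and the `substeps` parameter are hypothetical and are not the database's own interface.

```python
# Illustrative sketch only: the Lorenz system is an assumed example, and
# these helper names are hypothetical, not part of the database's API.

def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations at the given state."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(f, state, h):
    """Advance `state` by one classical fourth-order Runge-Kutta step of size h."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(
        s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def make_trajectory(n_points, dt, initial=(1.0, 1.0, 1.0), substeps=10):
    """Return n_points samples spaced dt apart, starting from `initial`.

    Each sampling interval is covered by `substeps` internal RK4 steps,
    so the sampling granularity is decoupled from the integration step.
    """
    h = dt / substeps
    state = tuple(initial)
    trajectory = [state]
    for _ in range(n_points - 1):
        for _ in range(substeps):
            state = rk4_step(lorenz, state, h)
        trajectory.append(state)
    return trajectory


# The same system sampled coarsely and finely over roughly the same time span.
coarse = make_trajectory(n_points=200, dt=0.05)
fine = make_trajectory(n_points=2000, dt=0.005)
```

Because the equations rather than a fixed recording define the data, changing `n_points` or `dt` yields a new series of any desired length or resolution, which a static time series database cannot offer.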