Performance benchmarking is a crucial component of time series classification (TSC) algorithm design, and a fast-growing number of datasets have been established for empirical benchmarking. However, empirical benchmarks are costly and do not guarantee statistical optimality. This study proposes benchmarking the optimality of TSC algorithms in distinguishing diffusion processes against the likelihood ratio test (LRT). The LRT is optimal in the sense of the Neyman-Pearson lemma: it has the smallest false positive rate among classifiers with a controlled false negative rate. The LRT requires the likelihood ratio of the time series to be computable. Diffusion processes defined by stochastic differential equations provide such time series and can be flexibly designed to generate linear or nonlinear time series. We demonstrate the benchmarking with three scalable state-of-the-art TSC algorithms: random forest, ResNet, and ROCKET. Test results show that they can achieve LRT optimality for univariate time series and multivariate Gaussian processes. However, these model-agnostic algorithms are suboptimal in classifying nonlinear multivariate time series from high-dimensional stochastic interacting particle systems. Additionally, the LRT benchmark provides tools to analyze how classification accuracy depends on the time length, dimension, temporal sampling frequency, and randomness of the time series. Thus, the LRT with diffusion processes can systematically and efficiently benchmark the optimality of TSC algorithms and may guide their future improvements.
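To illustrate why diffusion processes make the likelihood ratio computable, the following is a minimal sketch, not the setup used in the study. It assumes two univariate Ornstein-Uhlenbeck processes that differ only in their drift, simulates them with the Euler-Maruyama scheme, and classifies each path by the sign of the discretized log-likelihood ratio. The function names, the drift coefficients, and all parameter values are hypothetical choices made for illustration; the drift functions could equally be nonlinear.

```python
import numpy as np


def simulate_paths(drift, sigma, x0, dt, n_steps, n_paths, rng):
    """Euler-Maruyama simulation of dX = drift(X) dt + sigma dW."""
    x = np.empty((n_paths, n_steps + 1))
    x[:, 0] = x0
    for k in range(n_steps):
        noise = rng.standard_normal(n_paths)
        x[:, k + 1] = x[:, k] + drift(x[:, k]) * dt + sigma * np.sqrt(dt) * noise
    return x


def log_likelihood_ratio(x, drift0, drift1, sigma, dt):
    """Log of p1(path) / p0(path) under Gaussian Euler-Maruyama transitions.

    Each increment is N(drift_i(x_k) dt, sigma^2 dt) under hypothesis i, so the
    per-step log ratio is [(b1 - b0) dx - 0.5 (b1^2 - b0^2) dt] / sigma^2.
    """
    dx = np.diff(x, axis=1)
    xk = x[:, :-1]
    b0, b1 = drift0(xk), drift1(xk)
    return np.sum((b1 - b0) * dx - 0.5 * (b1**2 - b0**2) * dt, axis=1) / sigma**2


def lrt_classify(x, drift0, drift1, sigma, dt, threshold=0.0):
    """Label a path 1 when its log-likelihood ratio exceeds the threshold."""
    return (log_likelihood_ratio(x, drift0, drift1, sigma, dt) > threshold).astype(int)


# Hypothetical example: two mean-reverting drifts with different rates.
def drift0(x):
    return -1.0 * x


def drift1(x):
    return -3.0 * x


sigma, dt, n_steps, n_paths = 1.0, 0.01, 1000, 2000
rng = np.random.default_rng(0)

paths0 = simulate_paths(drift0, sigma, 0.0, dt, n_steps, n_paths, rng)
paths1 = simulate_paths(drift1, sigma, 0.0, dt, n_steps, n_paths, rng)

paths = np.vstack([paths0, paths1])
labels = np.concatenate([np.zeros(n_paths, dtype=int), np.ones(n_paths, dtype=int)])

predictions = lrt_classify(paths, drift0, drift1, sigma, dt)
accuracy = float(np.mean(predictions == labels))
print(f"LRT accuracy on simulated paths: {accuracy:.3f}")
```

The threshold of zero corresponds to equal class priors; moving it trades false positives against false negatives. In a benchmark of this kind, the accuracy obtained here would serve as the reference that a trained classifier is compared against on the same simulated paths.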