In this paper, we build on Rank Energy (RE) [1], a recently proposed multivariate, distribution-free Goodness-of-Fit (GoF) test based on the theory of Optimal Transport (OT), and apply it to non-parametric, unsupervised Change Point Detection (CPD) in multivariate time series data. We show that using RE directly makes detection highly sensitive to very small changes in distribution (causing a high false alarm rate) and incurs high sample complexity and computational cost. To alleviate these drawbacks, we propose a new GoF test statistic, the soft-Rank Energy (sRE), which is based on entropy-regularized OT, and employ it for CPD. We discuss the advantages of sRE over RE and demonstrate that the proposed sRE-based CPD outperforms all existing methods in terms of Area Under the Curve (AUC) and F1-score on real and synthetic data sets.
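To make the idea of a rank-based statistic built on entropy-regularized OT concrete, the following is a minimal sketch, not the authors' implementation. It assumes that soft ranks are obtained as the row-wise barycentric projection of a Sinkhorn transport plan from the pooled samples to reference points drawn uniformly from the unit cube, and that the statistic is the energy distance between the soft ranks of the two samples. The function names (`soft_ranks`, `soft_rank_energy`), the regularization value `eps`, the iteration count, and the choice of reference points are all illustrative assumptions.

```python
import numpy as np


def _logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))


def soft_ranks(z, ref, eps=0.1, n_iter=200):
    """Soft ranks of the rows of z with respect to reference points ref.

    Runs log-domain Sinkhorn between the uniform measures on z and ref
    (squared Euclidean cost), then maps each sample to the weighted mean
    of the reference points under its row of the transport plan.
    """
    n, m = len(z), len(ref)
    cost = ((z[:, None, :] - ref[None, :, :]) ** 2).sum(axis=-1)
    log_a = np.full(n, -np.log(n))
    log_b = np.full(m, -np.log(m))
    g = np.zeros(m)
    for _ in range(n_iter):
        # Alternate dual updates so rows, then columns, match the marginals.
        f = eps * (log_a - _logsumexp((g[None, :] - cost) / eps, axis=1))
        g = eps * (log_b - _logsumexp((f[:, None] - cost) / eps, axis=0))
    # Row-normalized plan: the row potential f cancels in the softmax.
    logits = (g[None, :] - cost) / eps
    logits -= logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ ref


def _mean_dist(a, b):
    """Mean pairwise Euclidean distance between rows of a and rows of b."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1).mean()


def soft_rank_energy(x, y, eps=0.1, n_iter=200, seed=0):
    """Energy distance between the soft ranks of samples x and y."""
    rng = np.random.default_rng(seed)
    z = np.vstack([x, y])
    ref = rng.uniform(size=z.shape)  # assumed reference: uniform on [0, 1]^d
    r = soft_ranks(z, ref, eps=eps, n_iter=n_iter)
    rx, ry = r[: len(x)], r[len(x):]
    return 2 * _mean_dist(rx, ry) - _mean_dist(rx, rx) - _mean_dist(ry, ry)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    before = rng.normal(size=(100, 2))
    after = rng.normal(loc=3.0, size=(100, 2))
    print(soft_rank_energy(before, after))
```

In a CPD setting, a statistic of this kind would typically be evaluated on two adjacent sliding windows of the time series, with large values flagged as candidate change points. The regularization `eps` controls how smooth the soft ranks are and should be chosen relative to the scale of the data.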