Detecting changepoints in datasets with many variates is a data science challenge of increasing importance. Motivated by the problem of detecting changes in the incidence of terrorism from a global terrorism database, we propose a novel approach to multiple changepoint detection in multivariate time series. Our method, which we call SUBSET, is a model-based approach that uses a penalised likelihood to detect changes for a wide class of parametric settings. We provide theory that guides the choice of penalties to use for SUBSET, and that shows it has high power to detect changes regardless of whether only a few variates or many variates change. Empirical results show that SUBSET outperforms many existing approaches for detecting changes in mean in Gaussian data; additionally, unlike these alternative methods, it can be easily extended to non-Gaussian settings such as those appropriate for modelling counts of terrorist events.
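To make the idea of penalised-likelihood changepoint detection concrete, the following is a minimal sketch for the simplest case the abstract mentions: a single change in mean in multivariate Gaussian data. It is not the SUBSET algorithm. The function name `best_single_change`, the unit-variance assumption, the restriction to at most one changepoint, and the single aggregate penalty are all assumptions made for illustration; the penalty choices guided by the paper's theory are not reproduced here.

```python
import numpy as np


def best_single_change(x, penalty):
    """Illustrative single-changepoint test for a change in mean.

    x       : array of shape (n, d), n time points and d variates,
              assumed Gaussian with unit variance (an assumption of this sketch).
    penalty : threshold the summed likelihood saving must exceed
              (a hypothetical user-supplied value, not the paper's choice).

    Returns (tau, saving): tau is the number of observations before the
    estimated change, or None if no split beats the penalty.
    """
    n, d = x.shape
    # Cumulative sums with a leading row of zeros, so csum[t] = sum of x[:t].
    csum = np.vstack([np.zeros((1, d)), np.cumsum(x, axis=0)])
    total = csum[-1]

    taus = np.arange(1, n)        # candidate split points: change after tau observations
    left = csum[taus]             # per-variate sums of x[:tau]
    right = total - left          # per-variate sums of x[tau:]

    # Per-variate reduction in residual sum of squares from fitting
    # two means (before and after tau) instead of one common mean.
    saving = (
        left ** 2 / taus[:, None]
        + right ** 2 / (n - taus)[:, None]
        - total ** 2 / n
    )

    # Aggregate evidence across variates and pick the best split.
    summed = saving.sum(axis=1)
    k = int(np.argmax(summed))
    if summed[k] > penalty:
        return int(taus[k]), float(summed[k])
    return None, float(summed[k])
```

Summing the savings over all variates favours changes that affect many variates at once; a change confined to a few variates would call for a different aggregation and penalty, which is the regime trade-off the abstract refers to.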