In this paper, I propose a general procedure for multiple change point analysis built on multivariate, distribution-free nonparametric tests that use ranks defined through measure transportation. The procedure estimates both the number of change points and their locations within an observed multivariate time series. The change point problem is treated in a general setting in which both the underlying distribution and the number of change points are unknown, rather than assuming, as many works in this area do, that the observed time series follows a specific distribution or contains only one change point. The aim is to identify changes in distribution accurately while making as few assumptions as possible. The rank energy statistic used here is based on energy statistics and has the potential to detect any change in distribution. I present the properties of this new algorithm, which can be applied to a variety of data analysis tasks, including hierarchical clustering, testing multivariate normality, gene selection, and microarray data analysis. The algorithm is implemented in the R package recp, which is available on CRAN.
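To make the idea of a rank energy statistic concrete, the following is a minimal Python sketch, not the recp implementation. It rests on several assumptions that the abstract does not state: the transport ranks are obtained by optimally matching the pooled observations to a Halton sequence in the unit cube under squared Euclidean cost, the energy statistic is taken in its V-statistic form, and only a single best split is located, with no significance test and no recursion to further change points. All function names (`halton`, `assign`, `transport_ranks`, `energy`, `best_split`) are hypothetical.

```python
import math
import random

def halton(n, d):
    """First n points of the d-dimensional Halton sequence (d <= 6 in this sketch)."""
    def radical_inverse(i, base):
        f, r = 1.0, 0.0
        while i > 0:
            f /= base
            r += f * (i % base)
            i //= base
        return r
    primes = [2, 3, 5, 7, 11, 13][:d]
    return [[radical_inverse(i, b) for b in primes] for i in range(1, n + 1)]

def assign(cost):
    """Hungarian algorithm, O(n^3), for a square cost matrix.
    Returns match, where match[i] is the column assigned to row i."""
    n, INF = len(cost), float("inf")
    u, v = [0.0] * (n + 1), [0.0] * (n + 1)   # row and column potentials
    p, way = [0] * (n + 1), [0] * (n + 1)     # p[j]: row (1-based) matched to column j
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [INF] * (n + 1), [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:                              # flip the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    match = [0] * n
    for j in range(1, n + 1):
        match[p[j] - 1] = j - 1
    return match

def transport_ranks(data):
    """Assumed rank definition: match each observation to a Halton point so that
    the total squared Euclidean distance is minimal."""
    grid = halton(len(data), len(data[0]))
    cost = [[sum((a - b) ** 2 for a, b in zip(x, g)) for g in grid] for x in data]
    return [grid[j] for j in assign(cost)]

def energy(a, b):
    """Sample energy distance 2E|A-B| - E|A-A'| - E|B-B'| (V-statistic form)."""
    def mean_dist(p, q):
        return sum(math.dist(x, y) for x in p for y in q) / (len(p) * len(q))
    return 2 * mean_dist(a, b) - mean_dist(a, a) - mean_dist(b, b)

def best_split(data, min_size=5):
    """Scan all admissible split points and return the one with the largest
    scaled energy statistic computed on the pooled-sample ranks."""
    ranks = transport_ranks(data)
    n = len(ranks)
    scores = {t: (t * (n - t) / n) * energy(ranks[:t], ranks[t:])
              for t in range(min_size, n - min_size + 1)}
    t_hat = max(scores, key=scores.get)
    return t_hat, scores[t_hat]

# Synthetic example (illustrative data only): a mean shift after observation 30.
random.seed(0)
data = ([[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(30)]
        + [[random.gauss(4, 1), random.gauss(4, 1)] for _ in range(30)])
t_hat, score = best_split(data)
print(t_hat, score)
```

Because the ranks always form a permutation of the same fixed reference points, the null distribution of a statistic computed on them does not depend on the distribution of the data, which is the sense in which a rank-based test can be distribution-free. The pure-Python assignment solver is used only to keep the sketch dependency-free; an optimized solver would be preferable for series of realistic length.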