Three robust methods for clustering multivariate time series according to their generating processes are proposed. The procedures are robust versions of a fuzzy C-means model based on (i) estimates of the quantile cross-spectral density and (ii) classical principal component analysis. Robustness to outliers is achieved through the so-called metric, noise and trimmed approaches. The metric approach incorporates into the objective function a distance measure designed to neutralize the effect of outliers, the noise approach builds an artificial cluster expected to contain the outlying series, and the trimmed approach removes the most atypical series in the dataset. All the proposed techniques inherit the desirable properties of the quantile cross-spectral density, such as the ability to uncover general types of dependence. Results from a broad simulation study including multivariate linear, nonlinear and GARCH processes indicate that the algorithms cope effectively with outlying series (i.e., series exhibiting a dependence structure different from that of the majority), clearly outperforming alternative procedures. The usefulness of the proposed methods is illustrated by two applications involving financial and environmental series.
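To make the three robustness strategies concrete, the following minimal sketch shows how each one could modify the membership step of fuzzy C-means. The abstract gives no formulas, so this relies on common formulations from the robust fuzzy clustering literature (an exponential distance, a noise cluster at fixed distance, and trimming by cost), which may differ from the authors' exact definitions. Plain squared Euclidean distances on toy feature vectors stand in for the features derived from the quantile cross-spectral density and principal component analysis. All function names and the parameters `beta`, `delta` and `alpha` are illustrative assumptions.

```python
import numpy as np

def fcm_memberships(d2, m=2.0, eps=1e-12):
    """Standard fuzzy C-means memberships from squared distances (n x C)."""
    inv = (1.0 / np.maximum(d2, eps)) ** (1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def metric_memberships(d2, beta, m=2.0):
    """Metric approach (assumed form): bounded exponential distance."""
    # 1 - exp(-beta * d2) lies in [0, 1), so a far-away series cannot
    # contribute an arbitrarily large term to the objective function.
    return fcm_memberships(1.0 - np.exp(-beta * d2), m)

def noise_memberships(d2, delta, m=2.0):
    """Noise approach (assumed form): extra cluster at constant distance delta."""
    n = d2.shape[0]
    augmented = np.hstack([d2, np.full((n, 1), delta ** 2)])
    u = fcm_memberships(augmented, m)
    # Memberships in the regular clusters, and in the artificial noise cluster.
    return u[:, :-1], u[:, -1]

def trimmed_memberships(d2, alpha, m=2.0):
    """Trimmed approach (assumed form): drop the fraction alpha with highest cost."""
    u = fcm_memberships(d2, m)
    cost = (u ** m * d2).sum(axis=1)
    n_trim = int(np.floor(alpha * d2.shape[0]))
    keep = np.ones(d2.shape[0], dtype=bool)
    if n_trim > 0:
        keep[np.argsort(cost)[-n_trim:]] = False
    u[~keep] = 0.0  # trimmed series take no part in any cluster
    return u, keep

# Toy data: four regular feature vectors near two centroids, plus one outlier.
features = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0], [10.0, -10.0]])
centroids = np.array([[0.05, 0.0], [0.95, 1.0]])
d2 = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)

u_metric = metric_memberships(d2, beta=1.0)
u_noise, noise_share = noise_memberships(d2, delta=1.0)
u_trim, kept = trimmed_memberships(d2, alpha=0.25)
```

In this toy setup the last series is far from both centroids, so the noise cluster absorbs most of its membership and the trimming step discards it. A full algorithm would alternate these membership updates with centroid updates until convergence.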