Functional data clustering aims to identify heterogeneous morphological patterns in the continuous functions underlying discrete measurements or observations. Applications of functional data clustering have appeared in many publications across various fields of science, including but not limited to biology, (bio)chemistry, engineering, environmental science, medical science, psychology, and social science. The rapid growth in applications of functional data clustering indicates an urgent need for a systematic approach to developing efficient clustering methods and scalable algorithmic implementations. At the same time, there is abundant literature on the cluster analysis of time series, trajectory data, and spatio-temporal data, all of which are related to functional data. An overarching structure of existing functional data clustering methods would therefore enable the cross-pollination of ideas across research fields. Here we conduct a comprehensive review of original clustering methods for functional data. We propose a systematic taxonomy that explores the connections and differences among existing functional data clustering methods and relates them to conventional multivariate clustering methods. The taxonomy is built on three main attributes of a functional data clustering method and is therefore more reliable than existing categorizations. The review aims to bridge the gap between the functional data analysis community and the clustering community and to generate new principles for functional data clustering.