Functional data analysis deals with data that are recorded densely over time (or any other continuum), with one or more observed curves per subject. Conceptually, functional data are defined on a continuum, but in practice they are usually observed at discrete points. Among the different kinds of functional data analysis, cluster analysis aims to determine the underlying groups of curves in a dataset when there is no information on the group membership of each individual curve. In this work, we propose a new model-based approach for clustering and smoothing functional data simultaneously via variational inference. We derive a variational Bayes (VB) algorithm that approximates the posterior distribution of our model parameters by finding the variational distribution with the smallest Kullback-Leibler divergence to the posterior. Our VB algorithm is implemented as an R package, and its performance is evaluated using simulated data and publicly available datasets.
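The variational objective mentioned in the abstract can be written out as a brief sketch. The notation here is assumed for illustration and is not taken from the paper: $y$ denotes the observed curves, $\theta$ collects all unknown model quantities (for example cluster labels and smoothing coefficients), $q$ is a candidate variational distribution in a family $\mathcal{Q}$, and the divergence is taken in the direction $\mathrm{KL}(q \,\|\, p)$, which is the usual convention in variational Bayes.

```latex
% Assumed notation: y = data, \theta = unknowns, q = variational distribution.
\begin{align*}
\mathrm{KL}\bigl(q \,\|\, p(\cdot \mid y)\bigr)
  &= \int q(\theta)\, \log \frac{q(\theta)}{p(\theta \mid y)}\, d\theta \\
  &= \log p(y) - \underbrace{\Bigl( \mathbb{E}_{q}\bigl[\log p(y, \theta)\bigr]
     - \mathbb{E}_{q}\bigl[\log q(\theta)\bigr] \Bigr)}_{\mathrm{ELBO}(q)}, \\[4pt]
q^{*} &= \arg\min_{q \in \mathcal{Q}} \mathrm{KL}\bigl(q \,\|\, p(\cdot \mid y)\bigr)
       = \arg\max_{q \in \mathcal{Q}} \mathrm{ELBO}(q).
\end{align*}
```

Because $\log p(y)$ does not depend on $q$, minimizing the Kullback-Leibler divergence is equivalent to maximizing the evidence lower bound (ELBO), which avoids evaluating the intractable marginal likelihood. How $\mathcal{Q}$ is restricted (for example, a factorized mean-field form) is a modeling choice that this sketch leaves open.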