Functional data analysis deals with data recorded densely over time (or any other continuum) with one or more observed curves per subject. Conceptually, functional data are continuously defined, but in practice, they are usually observed at discrete points. Among different kinds of functional data analyses, clustering analysis aims to determine underlying groups of curves in the dataset when there is no information on the group membership of each individual curve. In this work, we propose a new model-based approach for clustering and smoothing functional data simultaneously via variational inference. We derive coordinate ascent mean-field variational Bayes algorithms to approximate the posterior distribution of our model parameters by finding the variational distribution with the smallest Kullback-Leibler divergence to the posterior. The performance of our proposed method is evaluated using simulated data and publicly available datasets.
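The variational objective described above can be written in a standard form. As a sketch only, let $y$ denote the observed curves, $\theta = (\theta_1, \dots, \theta_J)$ the unknown model quantities, and $q$ the variational distribution; this notation is assumed here for illustration and is not taken from the paper.

```latex
% Notation (y, \theta, q, J) is assumed for illustration; it is not the paper's own.
\log p(y)
  = \underbrace{\mathbb{E}_{q}\!\left[\log \frac{p(y,\theta)}{q(\theta)}\right]}_{\mathrm{ELBO}(q)}
  + \mathrm{KL}\!\left(q(\theta)\,\middle\|\,p(\theta \mid y)\right),
\qquad
q(\theta) = \prod_{j=1}^{J} q_j(\theta_j),
\qquad
q_j^{*}(\theta_j) \propto \exp\!\left\{\mathbb{E}_{q_{-j}}\!\left[\log p(y,\theta)\right]\right\}.
```

Because $\log p(y)$ does not depend on $q$, minimizing the Kullback-Leibler divergence is equivalent to maximizing the ELBO. Under the mean-field factorization, coordinate ascent cycles through the factors $q_j$, updating each in turn with the others held fixed, and the ELBO does not decrease at any step.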