The analysis and clustering of multivariate time-series data are attracting growing interest in immunological and clinical studies. In such applications, researchers want to cluster subjects based on potentially high-dimensional longitudinal features and to investigate how clinical covariates may affect the clustering results. These studies are often challenging because of the high dimensionality of the features and the sparse, irregular sampling along the time dimension. We propose a smoothed probabilistic PARAFAC model with covariates (SPACO) to tackle these two problems while utilizing auxiliary covariates of interest. We provide extensive simulations to test different aspects of SPACO and demonstrate its use on immunological data sets from two recent cohorts of SARS-CoV-2 patients.