Functional clustering methods for binary longitudinal data with temporal heterogeneity

In the analysis of binary longitudinal data, it is of interest to model the dynamic relationship between a response and covariates as a function of time, while also identifying similar patterns of time-dependent interactions. We present a novel generalized varying-coefficient model that accounts for within-subject variability and simultaneously clusters varying-coefficient functions, without fixing the number of clusters in advance and without overfitting the data. In the analysis of a heterogeneous series of binary data, the model extracts population-level fixed effects, cluster-level varying effects, and subject-level random effects. Simulation studies show the validity and utility of the proposed method in correctly identifying cluster-specific varying coefficients when the number of clusters is unknown. The proposed method is applied to a heterogeneous series of binary data from the German Socioeconomic Panel (GSOEP) study, where we identify three major clusters that show different age-varying effects of socioeconomic predictors on working status.
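
The abstract names three layers of effects but does not spell out the model form. As an illustration only, the sketch below simulates binary panel data under one plausible reading: a logistic link whose linear predictor adds a population-level fixed effect, a cluster-specific coefficient that varies over time, and a subject-level random intercept. The function name `simulate_binary_panel`, the three coefficient curves, the covariates `x` and `z`, and all numeric values are assumptions made for this example and are not taken from the paper. The code only generates data; it does not implement the authors' clustering or estimation procedure.

```python
import numpy as np


def simulate_binary_panel(n_subjects=60, n_times=20, seed=0):
    """Simulate binary longitudinal data with three layers of effects.

    Assumed (hypothetical) model for subject i at time t_j:
        logit P(y_ij = 1) = beta * x_ij + alpha_{c_i}(t_j) * z_ij + b_i
    where c_i is the cluster label of subject i.
    """
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_times)  # common, rescaled time grid

    # Hypothetical cluster-level varying-coefficient functions alpha_k(t).
    cluster_curves = [
        lambda s: 2.0 * np.sin(2.0 * np.pi * s),  # oscillating effect
        lambda s: 2.0 * (s - 0.5),                # linearly increasing effect
        lambda s: np.full_like(s, -1.0),          # constant effect
    ]

    beta = 0.5  # population-level fixed effect (assumed value)
    clusters = rng.integers(0, len(cluster_curves), size=n_subjects)
    b = rng.normal(0.0, 0.5, size=n_subjects)  # subject-level random intercepts

    x = rng.normal(size=(n_subjects, n_times))  # covariate with fixed effect
    z = rng.normal(size=(n_subjects, n_times))  # covariate with varying effect

    # Evaluate each subject's cluster curve on the time grid.
    alpha = np.stack([cluster_curves[c](t) for c in clusters])

    eta = beta * x + alpha * z + b[:, None]  # linear predictor
    p = 1.0 / (1.0 + np.exp(-eta))           # inverse logit
    y = rng.binomial(1, p)                   # binary responses
    return t, clusters, y, p


if __name__ == "__main__":
    t, clusters, y, p = simulate_binary_panel()
    print(y.shape, np.bincount(clusters, minlength=3))
```

Data generated this way could serve as a test bed for a clustering method: the true labels in `clusters` are known, so recovered groupings can be compared against them.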
