An important task in health research is to characterize time-to-event outcomes such as disease onset or mortality in terms of a potentially high-dimensional set of risk factors. For example, prospective cohort studies of Alzheimer's disease typically enroll older adults for observation over several decades to assess the long-term impact of genetic and other factors on cognitive decline and mortality. The accelerated failure time model is particularly well-suited to such studies, structuring covariate effects as "horizontal" changes to the survival quantiles that conceptually reflect shifts in the outcome distribution due to lifelong exposures. However, this modeling task is complicated by the enrollment of adults at differing ages, which induces left truncation, and by intermittent follow-up visits, which yield interval-censored outcome information. Moreover, genetic and clinical risk factors are not only high-dimensional but also characterized by an underlying grouping structure, such as by function or gene location. Such grouped high-dimensional covariates require shrinkage methods that directly acknowledge this structure to facilitate variable selection and estimation. In this paper, we address these considerations by proposing a Bayesian accelerated failure time model with a group-structured lasso penalty, designed for left-truncated and interval-censored time-to-event data. We develop a custom Markov chain Monte Carlo sampler for efficient estimation, and investigate the impact of various methods of penalty tuning and thresholding for variable selection. We present a simulation study examining the performance of this method relative to models with an ordinary lasso penalty, and apply the proposed method to identify groups of predictive genetic and clinical risk factors for Alzheimer's disease in the Religious Orders Study and Memory and Aging Project (ROSMAP) prospective cohort studies of Alzheimer's disease and dementia.