Precision medicine provides customized treatments to patients based on their characteristics and is a promising approach to improving treatment efficiency. Large-scale omics data are useful for patient characterization, but their measurements often change over time, which gives rise to longitudinal data. Random forest is one of the state-of-the-art machine learning methods for building prediction models and can play a crucial role in precision medicine. In this paper, we review extensions of the standard random forest method for longitudinal data analysis. The extensions are categorized according to the data structures for which they are designed. We consider both univariate and multivariate responses and further categorize the repeated measurements according to whether the time effect is relevant. Information on available software implementations of the reviewed extensions is also given. We conclude with a discussion of the limitations of our review and some directions for future research.