In this work, we introduce the Personalized Online Super Learner (POSL), an online ensembling algorithm for streaming data whose optimization procedure accommodates varying degrees of personalization. Namely, POSL optimizes predictions with respect to baseline covariates, so the degree of personalization can range from fully individualized (i.e., optimization with respect to subject ID as the baseline covariate) to pooled across many individuals (i.e., optimization with respect to baseline covariates that individuals share). As an online algorithm, POSL learns in real time. POSL can leverage a diverse set of candidate algorithms, including online algorithms with different training and update times, fixed algorithms that are never updated during the procedure, pooled algorithms that learn from many individuals' time-series, and individualized algorithms that learn from within a single time-series. How POSL ensembles this hybrid of base learning strategies depends on the amount of data collected, the stationarity of the time-series, and the mutual characteristics of a group of time-series. In essence, POSL decides whether to learn across samples, through time, or both, based on the underlying (unknown) structure in the data. For a wide range of simulations that reflect realistic forecasting scenarios, and in a medical data application, we examine the performance of POSL relative to other current ensembling and online learning methods. We show that POSL provides reliable predictions for time-series data and adjusts to changing data-generating environments. We further improve POSL's practicality by extending it to settings where time-series enter and exit dynamically over chronological time.
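The idea of optimizing an ensemble with respect to a baseline covariate can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simple discrete selector that, within each group, picks the candidate with the lowest cumulative squared-error loss so far. The class `OnlineSelector`, the two toy candidates, and the `group` key are all hypothetical names introduced for illustration. Setting `group` to the subject ID gives fully individualized selection, while setting it to a shared covariate value pools loss across individuals.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

# Hypothetical candidate interface: map a series' past values to a
# one-step-ahead forecast.
Candidate = Callable[[Sequence[float]], float]


def last_value(history: Sequence[float]) -> float:
    """Forecast the most recent observation (0.0 if no data yet)."""
    return history[-1] if history else 0.0


def running_mean(history: Sequence[float]) -> float:
    """Forecast the mean of all past observations (0.0 if no data yet)."""
    return sum(history) / len(history) if history else 0.0


class OnlineSelector:
    """Illustrative discrete online selector (an assumption, not POSL itself).

    Cumulative squared-error loss is tracked per (group, candidate); the
    group key controls how personalized the selection is.
    """

    def __init__(self, candidates: Dict[str, Candidate]) -> None:
        self.candidates = candidates
        # loss[group][candidate_name] -> cumulative squared error
        self.loss: Dict[Hashable, Dict[str, float]] = defaultdict(
            lambda: {name: 0.0 for name in self.candidates}
        )
        # history[series_id] -> observations seen so far for that series
        self.history: Dict[Hashable, List[float]] = defaultdict(list)

    def predict(self, group: Hashable, series_id: Hashable) -> Tuple[str, float]:
        """Return the currently best candidate for the group and its forecast."""
        losses = self.loss[group]
        best = min(losses, key=losses.get)
        return best, self.candidates[best](self.history[series_id])

    def update(self, group: Hashable, series_id: Hashable, y: float) -> None:
        """Score every candidate on the new observation, then store it."""
        past = self.history[series_id]
        for name, forecast in self.candidates.items():
            self.loss[group][name] += (forecast(past) - y) ** 2
        past.append(y)


# Individualized use: the group key is the subject ID itself.
selector = OnlineSelector({"last_value": last_value, "running_mean": running_mean})
for y in [1.0, 2.0, 3.0, 4.0, 5.0]:      # trending series for subject "a"
    selector.update(group="a", series_id="a", y=y)
for y in [1.0, -1.0, 1.0, -1.0]:         # oscillating series for subject "b"
    selector.update(group="b", series_id="b", y=y)
```

In this toy setup the selector settles on different candidates for the two subjects, because their series have different structure. Passing the same `group` value for several subjects would instead accumulate one shared loss table, which is the pooled end of the personalization range described above.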