Uncertainty quantification, whether through prediction intervals for regression (real-valued responses) or prediction sets for classification (categorical responses), is essential to studying complex machine learning methods. In this paper, we develop Ensemble Regularized Adaptive Prediction Set (ERAPS) to construct prediction sets for time series with categorical responses, building on the prior work of [Xu and Xie, 2021]. In particular, we allow unknown dependencies to exist within features and responses that arrive in sequence. Methodologically, ERAPS is a distribution-free, ensemble-based framework applicable to arbitrary classifiers. Theoretically, we bound the coverage gap without assuming data exchangeability and show asymptotic set convergence. Empirically, we demonstrate that ERAPS attains valid marginal and conditional coverage and tends to yield smaller prediction sets than competing methods.
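To make the notion of a prediction set for classification concrete, the following is a minimal sketch of a generic, non-randomized adaptive prediction set built from any classifier's predicted class probabilities. It is not the ERAPS procedure: it omits the ensemble and regularization components, and its calibration step is the standard split-conformal one, which relies on exchangeability that the abstract explicitly does not assume. The function names (`aps_scores`, `calibrate_threshold`, `prediction_set`) and the choice to always keep the top-ranked class are assumptions made for illustration.

```python
import numpy as np

def aps_scores(probs, labels):
    """Score = total probability of all classes ranked at or above the true label."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    order = np.argsort(-probs, axis=1, kind="stable")      # classes by descending probability
    cumulative = np.cumsum(np.take_along_axis(probs, order, axis=1), axis=1)
    ranks = np.argmax(order == labels[:, None], axis=1)    # position of the true label
    return cumulative[np.arange(len(labels)), ranks]

def calibrate_threshold(cal_probs, cal_labels, alpha):
    """Empirical quantile of calibration scores with the usual (n + 1) correction."""
    scores = np.sort(aps_scores(cal_probs, cal_labels))
    n = len(scores)
    k = min(n, int(np.ceil((n + 1) * (1 - alpha))))
    return scores[k - 1]

def prediction_set(probs_row, threshold):
    """Classes whose cumulative probability mass stays within the threshold."""
    probs_row = np.asarray(probs_row, dtype=float)
    order = np.argsort(-probs_row, kind="stable")
    cumulative = np.cumsum(probs_row[order])
    keep = cumulative <= threshold + 1e-12                 # small tolerance for float sums
    keep[0] = True                                         # assumption: never return an empty set
    return sorted(order[keep].tolist())

# Hypothetical calibration data: predicted probabilities and true labels.
cal_probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.25, 0.35, 0.4], [0.5, 0.4, 0.1]]
cal_labels = [0, 1, 0, 1]
tau = calibrate_threshold(cal_probs, cal_labels, alpha=0.5)
new_set = prediction_set([0.6, 0.3, 0.1], tau)
```

A larger threshold yields larger sets, so the calibration step trades set size against coverage; per the abstract, ERAPS aims to keep sets small while bounding the coverage gap under dependence, which this sketch does not attempt.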