Quantile regression is a fundamental problem in statistical learning, motivated by the need to quantify uncertainty in predictions, or to model a diverse population without being overly reductive. For instance, epidemiological forecasts, cost estimates, and revenue predictions all benefit from the ability to quantify the range of possible values accurately. As such, many models have been developed for this problem over many years of research in statistics, machine learning, and related fields. Rather than proposing yet another algorithm for quantile regression, we adopt a meta viewpoint: we investigate methods for aggregating any number of conditional quantile models, in order to improve accuracy and robustness. We consider weighted ensembles where weights may vary not only over individual models, but also over quantile levels and feature values. All of the models we consider in this paper can be fit using modern deep learning toolkits, and hence are widely accessible (from an implementation point of view) and scalable. To improve the accuracy of the predicted quantiles (or equivalently, prediction intervals), we develop tools for ensuring that quantiles remain monotonically ordered, and apply conformal calibration methods. These can be used without any modification of the original library of base models. We also review some basic theory surrounding quantile aggregation and related scoring rules, and contribute a few new results to this literature (for example, the fact that post hoc sorting or isotonic regression can only improve the weighted interval score). Finally, we provide an extensive suite of empirical comparisons across 34 data sets from two different benchmark repositories.
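As a minimal illustration of the monotonicity result mentioned above, the sketch below (not the paper's code; the quantile grid and data are assumed for demonstration) computes the average pinball loss, which is proportional to the weighted interval score, for a set of unordered quantile predictions and again after sorting them across quantile levels. The sorted predictions should never score worse.

```python
# Minimal sketch, assuming a synthetic setup: post-sorting predicted quantiles
# across levels does not increase the average pinball loss (and hence the
# weighted interval score, which is proportional to it).
import numpy as np

def pinball_loss(y, q, tau):
    """Pinball (quantile) loss of a predicted quantile q at level tau for outcome y."""
    return np.where(y >= q, tau * (y - q), (1.0 - tau) * (q - y))

def avg_pinball(y, quantile_preds, taus):
    """Average pinball loss over the quantile grid; proportional to the WIS."""
    return np.mean([pinball_loss(y, q, t) for q, t in zip(quantile_preds, taus)], axis=0)

rng = np.random.default_rng(0)
taus = np.array([0.1, 0.25, 0.5, 0.75, 0.9])        # assumed quantile grid
y = rng.normal(size=1000)                            # observed outcomes
# Unordered "predicted" quantiles, e.g. from an unconstrained ensemble of base models.
preds = y + rng.normal(scale=0.5, size=(len(taus), y.size))

before = avg_pinball(y, preds, taus).mean()
after = avg_pinball(y, np.sort(preds, axis=0), taus).mean()   # post-sorting across levels
print(f"avg pinball before sorting: {before:.4f}, after sorting: {after:.4f}")
# 'after' should be <= 'before', consistent with the result stated in the abstract.
```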