It is common in machine learning to estimate a response y given covariate information x. However, such point predictions do not by themselves quantify the uncertainty associated with them. One way to overcome this deficiency is conformal inference, which constructs a set containing the unobserved response y with a prescribed probability. Unfortunately, even with one-dimensional responses, conformal inference remains computationally expensive despite recent encouraging advances. In this paper, we explore the multidimensional response case within a regression setting, delivering exact derivations of conformal inference p-values when the predictive model can be described as a linear function of y. Additionally, we propose several efficient ways of approximating the conformal prediction region for non-linear predictors while preserving computational advantages. We also provide empirical justification for these approaches using a real-world data example.
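As a rough illustration of what a conformal p-value looks like for a multidimensional response, the sketch below evaluates one at a single candidate response. It is not the exact derivation developed in the paper. It assumes a (lightly regularised) least-squares predictor, whose fitted values are a linear function of the responses, and uses the Euclidean norm of the residual as the nonconformity score. The function name `conformal_pvalue`, the `ridge` parameter, and the simulated data are all hypothetical choices made for this example.

```python
import numpy as np


def conformal_pvalue(X, Y, x_new, y_cand, ridge=1e-8):
    """Full conformal p-value of a candidate response y_cand at x_new.

    Assumption for this sketch: the predictor is ridge-stabilised least
    squares, so fitted values equal H @ Y_aug and are linear in y_cand.
    """
    X_aug = np.vstack([X, x_new])    # (n + 1, p): training rows plus test row
    Y_aug = np.vstack([Y, y_cand])   # (n + 1, d): responses plus candidate
    p = X_aug.shape[1]
    # Hat matrix of the augmented design; it does not depend on y_cand.
    H = X_aug @ np.linalg.solve(X_aug.T @ X_aug + ridge * np.eye(p), X_aug.T)
    residuals = Y_aug - H @ Y_aug
    # Nonconformity score: Euclidean norm of each d-dimensional residual.
    scores = np.linalg.norm(residuals, axis=1)
    # Share of points at least as nonconforming as the candidate (itself included).
    return float(np.mean(scores >= scores[-1]))


# Simulated data: n = 100 points, p = 2 covariates, d = 2 response dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
B = np.array([[1.0, -0.5], [0.3, 2.0]])
Y = X @ B + 0.1 * rng.normal(size=(100, 2))
x_new = np.array([0.5, -0.5])

p_near = conformal_pvalue(X, Y, x_new, x_new @ B)              # plausible candidate
p_far = conformal_pvalue(X, Y, x_new, np.array([1e3, 1e3]))    # implausible candidate
```

A conformal prediction region at level 1 - alpha collects every candidate whose p-value exceeds alpha. Scanning candidates on a grid becomes expensive quickly as the response dimension grows, which is the computational burden the abstract refers to.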