We study regression in which the predictor is allowed to abstain from predicting. We call this framework regression with reject option, as an extension of classification with reject option. Within this framework, we focus on the case where the rejection rate is fixed and derive the optimal rule, which relies on thresholding the conditional variance function. We provide a semi-supervised procedure for estimating the optimal rule from two datasets: a first, labeled dataset is used to estimate both the regression function and the conditional variance function, while a second, unlabeled dataset is used to calibrate the desired rejection rate. The resulting predictor with reject option is shown to be almost as good as the optimal predictor with reject option, both in terms of risk and rejection rate. We additionally apply our methodology with the kNN algorithm and establish rates of convergence for the resulting kNN predictor under mild conditions. Finally, a numerical study illustrates the benefit of the proposed procedure.
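To make the two-dataset procedure concrete, the following is a minimal sketch of a plug-in rule of this kind, not the authors' implementation. It rests on several assumptions that the abstract does not state: the conditional variance is estimated by kNN-averaging squared in-sample residuals, the threshold is taken as the empirical quantile of the estimated variances on the unlabeled sample, and ties among estimated variances are not handled. The names `knn_mean` and `fit_reject_rule` and the default `k=10` are hypothetical.

```python
import numpy as np


def knn_mean(X_train, values, X_query, k):
    """Average `values` over the k nearest training points of each query point."""
    # Pairwise squared Euclidean distances, shape (n_query, n_train).
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k smallest distances in each row (order within the k is arbitrary).
    idx = np.argpartition(d2, k - 1, axis=1)[:, :k]
    return values[idx].mean(axis=1)


def fit_reject_rule(X_lab, y_lab, X_unlab, reject_rate, k=10):
    """Return a function mapping X to (prediction, reject flag).

    Assumption: variance is estimated from squared in-sample residuals.
    """
    # Step 1 (labeled data): kNN estimate of the regression function, then
    # squared residuals as noisy observations of the conditional variance.
    resid2 = (y_lab - knn_mean(X_lab, y_lab, X_lab, k)) ** 2

    def predict_var(X):
        return knn_mean(X_lab, resid2, X, k)

    # Step 2 (unlabeled data): pick the threshold so that roughly a fraction
    # `reject_rate` of the unlabeled points has estimated variance above it.
    threshold = np.quantile(predict_var(X_unlab), 1.0 - reject_rate)

    def predict(X):
        mean = knn_mean(X_lab, y_lab, X, k)
        reject = predict_var(X) > threshold  # abstain where variance is high
        return mean, reject

    return predict
```

Calling `fit_reject_rule(X_lab, y_lab, X_unlab, reject_rate=0.2)` returns a `predict` function whose second output flags the inputs on which the rule abstains. Because kNN variance estimates are piecewise constant, several unlabeled points can share the same estimated variance, so the realized rejection rate may fall slightly below the target; a fuller treatment would need an explicit tie-breaking step.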