Selective prediction, where a model has the option to abstain from making a decision, is crucial for machine learning applications in which mistakes are costly. In this work, we focus on distributional regression and introduce a framework that allows the model to abstain from estimation in situations of high uncertainty. We refer to this approach as distributional regression with reject option, inspired by similar concepts in classification and regression with reject option. We study the scenario where the rejection rate is fixed. We derive a closed-form expression for the optimal rule, which relies on thresholding the entropy function of the Continuous Ranked Probability Score (CRPS). We propose a semi-supervised estimation procedure for the optimal rule that uses two datasets: the first, labeled, serves to estimate both the conditional distribution function and the entropy function of the CRPS, while the second, unlabeled, serves to calibrate the desired rejection rate. Notably, the control of the rejection rate is distribution-free. Under mild conditions, we show that our procedure is asymptotically as effective as the optimal rule, both in terms of error rate and rejection rate. Additionally, we establish rates of convergence for our approach based on distributional k-nearest neighbors. A numerical analysis on real-world datasets demonstrates the strong performance of our procedure.
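The abstract describes a plug-in procedure: estimate the conditional distribution from labeled data, score each input by the entropy of the CRPS (for the CRPS, this entropy takes the standard form H(F) = ½ E|Y − Y′| with Y, Y′ i.i.d. from F), and calibrate the rejection threshold as an empirical quantile on unlabeled data so that the target rejection rate is met. Below is a minimal sketch of that idea, assuming a distributional k-NN estimate of the conditional law; all names (crps_entropy, fit_knn_entropy, k, reject_rate) and the toy data are illustrative, not the paper's code.

```python
# Hedged sketch of the plug-in rule: distributional k-NN + CRPS-entropy
# thresholding, with the threshold calibrated on unlabeled data.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def crps_entropy(samples):
    """Entropy of the CRPS for an empirical distribution:
    H(F) = 0.5 * E|Y - Y'|, with Y, Y' i.i.d. from F."""
    diffs = np.abs(samples[:, None] - samples[None, :])
    return 0.5 * diffs.mean()

def fit_knn_entropy(X_lab, y_lab, k=15):
    """Return a function estimating the conditional CRPS entropy at new
    points, using the k nearest labeled responses as the empirical
    conditional distribution (a distributional k-NN estimate)."""
    nn = NearestNeighbors(n_neighbors=k).fit(X_lab)
    def entropy_at(X):
        _, idx = nn.kneighbors(X)
        return np.array([crps_entropy(y_lab[i]) for i in idx])
    return entropy_at

# Toy heteroscedastic data (illustrative only).
rng = np.random.default_rng(0)
X_lab = rng.uniform(-2, 2, size=(500, 1))
y_lab = np.sin(3 * X_lab[:, 0]) + rng.normal(0, 0.1 + np.abs(X_lab[:, 0]), 500)
X_unlab = rng.uniform(-2, 2, size=(300, 1))

entropy_at = fit_knn_entropy(X_lab, y_lab, k=15)

# Calibrate the threshold on the unlabeled set: reject the fraction
# `reject_rate` of points with the highest estimated CRPS entropy.
reject_rate = 0.2
tau = np.quantile(entropy_at(X_unlab), 1 - reject_rate)

# Accept a new point only if its estimated uncertainty is below tau.
X_test = rng.uniform(-2, 2, size=(5, 1))
accepted = entropy_at(X_test) <= tau
print(accepted)
```

Because the threshold is an empirical quantile of the entropy scores on unlabeled covariates, the rejection rate is controlled without any assumption on the data distribution, consistent with the distribution-free claim in the abstract.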