Targeted Cross-Validation

In many applications, we have access to the complete dataset but are interested only in prediction over a particular region of the predictor variables. A standard approach is to find the globally best modeling method from a set of candidate methods. However, it is perhaps rare in reality that one candidate method is uniformly better than the others. A natural approach for this scenario is to apply a weighted $L_2$ loss in performance assessment to reflect the region-specific interest. We propose targeted cross-validation (TCV) to select models or procedures based on a general weighted $L_2$ loss. We show that TCV is consistent in selecting the best-performing candidate under the weighted $L_2$ loss. Experimental studies demonstrate the use of TCV and its potential advantage over global CV and over the approach of using only local data to model a local region. Previous investigations of CV have relied on the condition that, when the sample size is large enough, the ranking of two candidates stays the same. However, in many applications with changing data-generating processes or highly adaptive modeling methods, the relative performance of the methods is not static as the sample size varies. Even with a fixed data-generating process, it is possible that the ranking of two methods switches infinitely many times. In this work, we broaden the concept of selection consistency by allowing the best candidate to switch as the sample size varies, and then establish the consistency of TCV. This flexible framework can be applied to high-dimensional and complex machine learning scenarios where the relative performance of modeling procedures is dynamic.
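
To make the idea of selecting under a weighted $L_2$ loss concrete, the following is a minimal sketch rather than the authors' procedure: it uses plain repeated random train/validation splits and scores each candidate by a weighted squared error on the held-out part. All names (`weighted_l2`, `fit_linear`, `fit_knn`, `targeted_cv`), the indicator weight on the interval $[1, 2]$, the split ratio, and the simulated data are assumptions made for illustration only.

```python
import numpy as np

def weighted_l2(y_true, y_pred, w):
    """Weighted mean squared error; w holds nonnegative weights."""
    return float(np.sum(w * (y_true - y_pred) ** 2) / np.sum(w))

def fit_linear(x_train, y_train):
    """Global straight-line fit by least squares; returns a predictor."""
    design = np.column_stack([np.ones_like(x_train), x_train])
    coef, *_ = np.linalg.lstsq(design, y_train, rcond=None)
    return lambda x: coef[0] + coef[1] * x

def fit_knn(x_train, y_train, k=10):
    """k-nearest-neighbour average; returns a predictor."""
    def predict(x):
        dist = np.abs(x[:, None] - x_train[None, :])
        nearest = np.argsort(dist, axis=1)[:, :k]
        return y_train[nearest].mean(axis=1)
    return predict

def targeted_cv(x, y, candidates, weight_fn, n_splits=20, train_frac=0.5, seed=0):
    """Pick the candidate with the smallest average weighted validation loss.

    Hypothetical helper: repeated random splits, all data used for fitting,
    weights applied only when scoring the held-out points.
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    n_train = int(train_frac * n)
    losses = {name: [] for name in candidates}
    for _ in range(n_splits):
        perm = rng.permutation(n)
        train, valid = perm[:n_train], perm[n_train:]
        w = weight_fn(x[valid])
        if w.sum() == 0:
            continue  # no held-out point falls in the target region
        for name, fit in candidates.items():
            predict = fit(x[train], y[train])
            losses[name].append(weighted_l2(y[valid], predict(x[valid]), w))
    mean_loss = {name: float(np.mean(v)) for name, v in losses.items()}
    best = min(mean_loss, key=mean_loss.get)
    return best, mean_loss

# Simulated data (assumption): a strongly nonlinear signal with small noise.
rng = np.random.default_rng(1)
x = rng.uniform(-3.0, 3.0, size=400)
y = np.sin(3.0 * x) + 0.1 * rng.normal(size=400)

# Target region (assumption): only predictions for x in [1, 2] matter.
region_weight = lambda v: ((v >= 1.0) & (v <= 2.0)).astype(float)

candidates = {"linear": fit_linear, "knn": fit_knn}
best, mean_loss = targeted_cv(x, y, candidates, region_weight)
print(best, mean_loss)
```

Replacing `region_weight` with a function that returns all ones recovers ordinary (global) cross-validation under squared error, which makes it easy to compare the two selections on the same splits.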
