We propose to study the generalization error of a learned predictor $\hat h$ in terms of that of a surrogate (potentially randomized) predictor that is coupled to $\hat h$ and designed to trade empirical risk for control of generalization error. When $\hat h$ interpolates the data, it is interesting to consider theoretical surrogate classifiers that are partially derandomized or rerandomized, e.g., fit to the training data but with modified label noise. We also show that replacing $\hat h$ by its conditional distribution with respect to an arbitrary $\sigma$-field is a convenient way to derandomize. We study two examples, inspired by the work of Nagarajan and Kolter (2019) and Bartlett et al. (2019), in which the learned classifier $\hat h$ interpolates the training data with high probability and has small risk, yet does not belong to a nonrandom class with a tight uniform bound on two-sided generalization error. At the same time, we bound the risk of $\hat h$ in terms of surrogates that are constructed by conditioning and denoising, respectively, and that we show belong to nonrandom classes with uniformly small generalization error.
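As a rough illustration of what bounding the risk of $\hat h$ "in terms of" a surrogate can look like, consider the following identity. The notation is an assumption introduced here, not taken from the abstract: $\tilde h$ denotes a surrogate predictor, $R(\cdot)$ the population risk, and $\hat R_S(\cdot)$ the empirical risk on the training sample $S$. For any such $\tilde h$,

$$
R(\hat h) \;=\; \underbrace{\bigl[R(\hat h) - R(\tilde h)\bigr]}_{\text{risk gap between predictor and surrogate}} \;+\; \underbrace{\bigl[R(\tilde h) - \hat R_S(\tilde h)\bigr]}_{\text{generalization error of the surrogate}} \;+\; \underbrace{\hat R_S(\tilde h)}_{\text{empirical risk of the surrogate}} .
$$

Read this way, a surrogate that lies in a nonrandom class with uniformly small generalization error controls the middle term, at the price of a possibly nonzero empirical risk in the last term. This is only a sketch of the general idea; the precise statements and conditions are those of the paper itself.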