We consider learning to optimize a classification metric defined by a black-box function of the confusion matrix. Such black-box learning settings are ubiquitous: for example, when the learner only has query access to the metric of interest, or in noisy-label and domain adaptation applications where the learner must estimate the metric from a small validation sample. Our approach is to adaptively learn example weights on the training dataset such that the resulting weighted objective best approximates the metric on the validation sample. We show how to model and estimate the example weights and use them to iteratively post-shift a pre-trained class probability estimator to construct a classifier. We also analyze the statistical properties of the resulting procedure. Experiments on various label noise, domain shift, and fair classification setups confirm that our proposal compares favorably to the state-of-the-art baselines for each application.
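To make the setting concrete, the following is a minimal sketch, not the procedure proposed in this work. It assumes a binary problem, uses F1 as a stand-in for the black-box metric, and replaces the adaptive example-weight estimation with a naive grid search over a single per-class weight. All function names (`confusion_matrix`, `f1_from_confusion`, `post_shift_predict`, `tune_post_shift`) and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """C[i, j] = fraction of examples with true class i predicted as class j."""
    y_true = np.asarray(y_true, dtype=int)
    y_pred = np.asarray(y_pred, dtype=int)
    C = np.zeros((num_classes, num_classes))
    np.add.at(C, (y_true, y_pred), 1.0)
    return C / len(y_true)


def f1_from_confusion(C):
    """Illustrative 'black-box' metric: binary F1 with class 1 as positive."""
    tp, fp, fn = C[1, 1], C[0, 1], C[1, 0]
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom > 0 else 0.0


def post_shift_predict(probs, weights):
    """Post-shift a probability estimator: predict argmax_j weights[j] * probs[:, j]."""
    return np.argmax(probs * weights, axis=1)


def tune_post_shift(probs_val, y_val, metric, grid):
    """Naive stand-in for weight learning: grid-search the class-1 weight
    that maximizes the metric on the validation sample (binary case only)."""
    best_w, best_score = None, -np.inf
    for w1 in grid:
        w = np.array([1.0, w1])
        preds = post_shift_predict(probs_val, w)
        score = metric(confusion_matrix(y_val, preds, num_classes=2))
        if score > best_score:  # strict '>' keeps the first best weight
            best_w, best_score = w, score
    return best_w, best_score


# Toy validation sample: outputs of a hypothetical pre-trained estimator.
probs_val = np.array([[0.9, 0.1], [0.6, 0.4], [0.55, 0.45], [0.2, 0.8]])
y_val = np.array([0, 1, 1, 1])

weights, score = tune_post_shift(
    probs_val, y_val, f1_from_confusion, grid=[0.5, 1.0, 2.0, 4.0]
)
```

The metric is only ever queried through `metric(C)`, which mirrors the query-access assumption: any other function of the confusion matrix could be substituted without changing the rest of the sketch.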