We consider the problem of testing whether the conditional distribution of a response variable, given a vector of covariates, is the same in two populations. This hypothesis testing problem arises in various machine learning and statistical inference settings, including transfer learning and causal predictive inference. We develop a nonparametric test procedure inspired by the conformal prediction framework. The construction of our test statistic combines recent developments in conformal prediction with a novel choice of conformity score, resulting in a weighted rank-sum test statistic that is valid and powerful under general settings. To our knowledge, this is the first successful attempt to use conformal prediction to test statistical hypotheses beyond exchangeability. Our method is suited to modern machine learning settings in which the data are high-dimensional and the sample sizes are large, and it can be effectively combined with existing classification algorithms to find good conformity score functions. The performance of the proposed method is demonstrated in various numerical examples.
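To make the idea of a weighted rank-sum statistic concrete, the following is a minimal sketch of a generic one. It is not the exact statistic of the paper, whose form the abstract does not specify. The function name `weighted_rank_sum` and its arguments are hypothetical: `scores_1` and `scores_2` stand for conformity scores computed on the two samples, and `weights_1` stands for covariate density-ratio weights on the first sample (for example, estimated with a classifier). Under these assumptions, if the conditional distributions are equal, the weights are the exact density ratio, and the scores are continuous, the statistic has expectation 1/2; large deviations from 1/2 would then suggest a difference.

```python
def weighted_rank_sum(scores_1, weights_1, scores_2):
    """Generic weighted rank-sum statistic (illustrative sketch only).

    scores_1  : conformity scores for sample 1 (hypothetical input)
    weights_1 : nonnegative weights for sample 1, assumed here to be
                covariate density-ratio estimates (an assumption)
    scores_2  : conformity scores for sample 2 (hypothetical input)

    Returns the average over all pairs (i, j) of
    weights_1[i] * 1{scores_1[i] < scores_2[j]}, counting ties as 1/2.
    """
    if len(scores_1) != len(weights_1):
        raise ValueError("scores_1 and weights_1 must have the same length")
    if not scores_1 or not scores_2:
        raise ValueError("both samples must be non-empty")

    total = 0.0
    for s1, w in zip(scores_1, weights_1):
        for s2 in scores_2:
            if s1 < s2:
                total += w          # sample-1 score ranks below sample-2 score
            elif s1 == s2:
                total += 0.5 * w    # ties contribute half weight
    return total / (len(scores_1) * len(scores_2))


if __name__ == "__main__":
    # With unit weights this reduces to the normalized Mann-Whitney U statistic.
    print(weighted_rank_sum([0.1, 0.5], [1.0, 1.0], [0.3, 0.7]))
```

With all weights equal to one, the sketch reduces to the classical (normalized) Mann-Whitney rank-sum statistic; the weights are what would allow the covariate distributions of the two populations to differ.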