Testing the equality of two conditional distributions is at the core of many modern applications such as domain adaptation, transfer learning, and algorithmic fairness. However, to our surprise, little effort has been devoted to studying this fundamental problem. In this paper, we contribute to the scarce literature by developing a new two-sample measure, named conditional energy distance (CED), that quantifies the discrepancy between two conditional distributions without imposing restrictive parametric assumptions. We study the fundamental properties of CED and use it to construct a two-sample test for the equality of two conditional distributions. A local bootstrap is developed to approximate the finite-sample distribution of the test statistic. The reliable performance of the proposed two-sample conditional distribution test is demonstrated through simulations and a real data analysis.
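For orientation, the sketch below computes the classical (unconditional) sample energy distance between two samples, the quantity whose name CED borrows. It is not the CED proposed in the paper: the abstract does not give the CED definition, its conditioning mechanism, or the local bootstrap, so none of those are reproduced here. The function name `energy_distance` and the V-statistic form are assumptions chosen for illustration.

```python
import numpy as np


def energy_distance(x, y):
    """Classical sample energy distance (V-statistic form), NOT the paper's CED.

    Estimates 2*E|X - Y| - E|X - X'| - E|Y - Y'| with Euclidean norms,
    where x has shape (n, d) and y has shape (m, d); 1-D inputs are
    treated as n scalar observations.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = x.reshape(len(x), -1)
    y = y.reshape(len(y), -1)

    def mean_pairwise_distance(a, b):
        # All pairwise Euclidean distances between rows of a and rows of b.
        diff = a[:, None, :] - b[None, :, :]
        return np.sqrt((diff ** 2).sum(axis=-1)).mean()

    return (
        2.0 * mean_pairwise_distance(x, y)
        - mean_pairwise_distance(x, x)
        - mean_pairwise_distance(y, y)
    )
```

In this form the statistic is zero when both samples are identical and grows as the two empirical distributions separate. A conditional version would additionally need to localize the comparison in the covariate, which is the part the paper contributes.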