Assessing equity in the treatment of a subpopulation often involves assigning numerical "scores" to all individuals in the full population such that similar individuals get similar scores; matching via propensity scores or appropriate covariates is common, for example. Given such scores, individuals with similar scores may or may not attain similar outcomes irrespective of their membership in the subpopulation. The traditional graphical methods for visualizing inequities are known as "reliability diagrams" or "calibration plots." These bin the scores into a partition of all possible values and, for each bin, plot both the average outcome for individuals in the subpopulation and the average outcome for all individuals; comparing the graph for the subpopulation with that for the full population gives some sense of how the averages for the subpopulation deviate from the averages for the full population. Unfortunately, real data sets contain only finitely many observations, which limits the usable resolution of the bins, so the conventional methods can obscure important variations due to the binning. Fortunately, plotting the cumulative deviation of the subpopulation from the full population, as proposed in this paper, sidesteps the problematic coarse binning. The cumulative plots encode subpopulation deviation directly as the slopes of secant lines for the graphs. Slope is easy to perceive even when the constant offsets of the secant lines are irrelevant. The cumulative approach avoids binning that smooths over deviations of the subpopulation from the full population. Such cumulative aggregation furnishes both high-resolution graphical methods and simple scalar summary statistics (analogous to those of Kuiper and of Kolmogorov and Smirnov used in statistical significance testing for comparing probability distributions).
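As a rough illustration of the cumulative approach, the following Python sketch accumulates the deviations of subpopulation outcomes from a local full-population reference, then computes Kuiper-like and Kolmogorov-Smirnov-like summaries. It is a minimal sketch under stated assumptions, not the paper's exact construction: the function names, the parameter `k`, and the choice of reference (the mean outcome over a window of `k` neighbors on each side in score order) are all hypothetical choices made for illustration.

```python
import numpy as np


def cumulative_deviation(scores, outcomes, in_subpop, k=5):
    """Cumulative deviation of a subpopulation from the full population.

    Assumption (illustrative only): the full-population reference for each
    subpopulation member is the mean outcome over the window of up to k
    neighbors on each side in score order, including the member itself.
    """
    scores = np.asarray(scores, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    in_subpop = np.asarray(in_subpop, dtype=bool)

    # Sort everyone by score so that neighbors have similar scores.
    order = np.argsort(scores, kind="stable")
    s, r, m = scores[order], outcomes[order], in_subpop[order]

    idx = np.flatnonzero(m)  # sorted positions of subpopulation members
    n = idx.size

    # Local full-population average around each subpopulation member.
    ref = np.empty(n)
    for j, i in enumerate(idx):
        lo, hi = max(0, i - k), min(s.size, i + k + 1)
        ref[j] = r[lo:hi].mean()

    # Running sum of deviations, normalized by the subpopulation size,
    # with a leading zero so that the curve starts at the origin.
    cum = np.concatenate(([0.0], np.cumsum(r[idx] - ref) / n))
    return s[idx], cum


def secant_slope(cum, a, b):
    """Average deviation over subpopulation members a..b-1 (sorted order).

    When cum[j] is plotted against j / n, this is the slope of the secant
    line joining the points at indices a and b.
    """
    n = len(cum) - 1
    return (cum[b] - cum[a]) / ((b - a) / n)


def kuiper(cum):
    """Kuiper-like summary: range of the cumulative deviation."""
    return cum.max() - cum.min()


def kolmogorov_smirnov(cum):
    """Kolmogorov-Smirnov-like summary: maximum absolute excursion."""
    return np.abs(cum).max()


if __name__ == "__main__":
    # Synthetic data: the subpopulation does better at high scores only.
    rng = np.random.default_rng(0)
    scores = rng.uniform(size=2000)
    member = rng.uniform(size=2000) < 0.3
    prob = 0.5 + 0.3 * member * (scores > 0.6)
    outcomes = (rng.uniform(size=2000) < prob).astype(float)

    sub_scores, cum = cumulative_deviation(scores, outcomes, member, k=25)
    print("Kuiper-like statistic:", kuiper(cum))
    print("Kolmogorov-Smirnov-like statistic:", kolmogorov_smirnov(cum))
```

Plotting `cum` against `j / n` (or against `sub_scores`) gives a curve whose secant slopes can be read as average deviations over the corresponding range of scores, with no bin widths to choose. Note that the window parameter `k` in this sketch is itself a smoothing choice for the reference only; the subpopulation outcomes are never binned.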