Parameters of sub-populations can be more relevant than super-population ones. For example, a healthcare provider might be interested in the effect of a treatment plan on a subset of their patients; a video subscription service might be interested in the satisfaction of its current customers; or policymakers might care about the impact of a policy in a state with a given population. In these cases, one is interested in a finite population, as opposed to an infinite super-population. Such a finite population can be characterized by fixing some attributes that are intrinsic to it, leaving unexplained variation such as measurement error as random. More generally, inference for a population with fixed attributes can be modeled as inferring parameters of a conditional distribution given these attributes. It is desirable that confidence intervals be conditionally valid for the realized population, rather than valid only after marginalizing over many draws of such populations. We provide a statistical inference framework for parameters of populations with fixed attributes. By leveraging the attribute information, we derive estimators and confidence intervals that are closely related to a specific finite population. When the data come from the population of interest, our confidence intervals attain asymptotic conditional validity given the attributes, and are typically shorter than those for super-population inference. In addition, we develop procedures to infer parameters of new populations with differing covariate distributions; the resulting confidence intervals are also conditionally valid under mild conditions. Our methods extend to situations where the fixed information has weaker structure or is only partially observed. We demonstrate the validity and applicability of our methods on simulated and real-world data.