Matching on a low dimensional vector of scalar covariates consists of constructing groups of individuals in which each individual in a group is within a pre-specified distance from an individual in another group. However, matching in high dimensional spaces is more challenging because the distance can be sensitive to implementation details, caliper width, and measurement error of observations. To partially address these problems, we propose to use extensive sensitivity analyses and identify the main sources of variation and bias. We illustrate these concepts by examining the racial disparity in all-cause mortality in the US using the National Health and Nutrition Examination Survey (NHANES 2003-2006). In particular, we match African Americans to Caucasian Americans on age, gender, BMI and objectively measured physical activity (PA). PA is measured every minute using accelerometers for up to seven days and then transformed into an empirical distribution of all of the minute-level observations. The Wasserstein metric is used as the measure of distance between these participant-specific distributions.
翻译:暂无翻译