Multiple Imputation Approaches for Epoch-level Accelerometer Data in Trials

Clinical trials that investigate physical activity interventions often use accelerometers to measure step count at a very granular level, typically in 5-second epochs. Participants usually wear the accelerometer for a week at baseline and for one or more week-long follow-up periods after the intervention. The data are usually aggregated into daily or weekly step counts for the primary analysis. Missing data are common because participants may not wear the device as specified in the protocol. Approaches to handling missing data in the literature have largely defined missingness at the day level using a threshold on daily wear time, which loses information about the time of day at which data are missing. We propose an approach to identifying and classifying missingness at the finer epoch level, and then present two approaches to handling it. First, we present a parametric approach that takes into account the number of missing epochs per day. Second, we describe a non-parametric approach to Multiple Imputation (MI) in which missing periods during the day are replaced by donor data from the same person where possible, or otherwise by data from a different person who is matched on demographic and physical activity-related variables. Our simulation studies, which compare these approaches in a number of settings, show that the non-parametric approach yields the least biased estimates of the treatment effect while maintaining small standard errors. We illustrate the application of these MI strategies to the analysis of the 2017 PACE-UP Trial. The proposed framework of classifying missingness and applying MI at the epoch level is likely to be applicable to a number of different outcomes and to data from other wearable devices.
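The donor-based idea can be illustrated with a minimal sketch. It is not the authors' algorithm, only one plausible reading of "replace missing periods with donor data from the same person where possible, otherwise from a matched person". All names here (`missing_runs`, `impute_day`, `multiply_impute`, `own_days`, `matched_days`) are hypothetical. A day is assumed to be a list of per-epoch step counts with `None` marking a missing epoch, and `matched_days` is assumed to have been selected beforehand by some matching procedure that is not shown.

```python
import random


def missing_runs(day):
    """Return (start, end) pairs, end exclusive, for each run of missing epochs."""
    runs, start = [], None
    for i, value in enumerate(day):
        if value is None and start is None:
            start = i
        elif value is not None and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(day)))
    return runs


def impute_day(day, own_days, matched_days, rng):
    """Fill each missing run from a randomly chosen donor day.

    Assumption: donors from the same person (own_days) are preferred;
    matched_days (another person's days) are used only as a fallback.
    A run stays missing if no donor fully observes it.
    """
    imputed = list(day)
    for start, end in missing_runs(day):
        for pool in (own_days, matched_days):
            donors = [d for d in pool
                      if all(v is not None for v in d[start:end])]
            if donors:
                imputed[start:end] = rng.choice(donors)[start:end]
                break
    return imputed


def multiply_impute(day, own_days, matched_days, m=5, seed=0):
    """Create m imputed copies of the day by repeating the random donor draw."""
    rng = random.Random(seed)
    return [impute_day(day, own_days, matched_days, rng) for _ in range(m)]


# Toy data: six epochs per day instead of a full day of 5-second epochs.
day = [3, None, None, 5, None, 2]
own_days = [[1, 4, 6, 2, None, 0], [2, 0, 1, 3, None, 1]]
matched_days = [[0, 2, 2, 1, 7, 3]]

imputations = multiply_impute(day, own_days, matched_days, m=5, seed=1)
daily_totals = [sum(d) for d in imputations]
```

In this toy example the first missing run is filled from one of the participant's own days, while the last missing epoch is observed only in the matched person's day, so the fallback is used. In a full MI analysis, the treatment effect would be estimated on each imputed dataset and the estimates combined, for example with Rubin's rules; that step is omitted here.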
