GPS-equipped mobile devices are now ubiquitous, and the Location-Based Service (LBS) data they generate has become a critical resource for understanding human mobility. However, inherent limitations of LBS datasets, chiefly discontinuity and sparsity, may introduce significant biases in how individual movement patterns are represented. This study develops data quality metrics for LBS data, examines how they differ across populations, and quantifies their effects on inferred individual movement, particularly stays, in the Boston Metropolitan Area. We find that data from higher-income, more educated, and predominantly white census block groups (CBGs) show higher sampling rates but, paradoxically, lower data quality. This contradiction may stem from greater privacy awareness in these communities. Additionally, we propose a new framework that resamples LBS data and quantitatively evaluates the inferential biases associated with data of varying quality. The framework is versatile and can also be used to analyze the impacts of different LBS data processing workflows. Using linear regression models with clustered standard errors, we assess the impact of data quality metrics on the inferred number of stay points. The results show that better data quality, characterized by the number of observations and temporal occupancy, significantly reduces the bias in the calculated stay points of an individual. Introducing additional data quality metrics into the regression model explains more of the bias. Overall, this study provides insights into how data quality can influence our understanding of human mobility patterns, highlighting the importance of handling LBS data carefully in research.
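As an illustration of the two quality metrics named above, the following is a minimal sketch, not the study's implementation. It assumes that "temporal occupancy" means the fraction of fixed-width time bins in an observation window that contain at least one record; that definition, the function name `quality_metrics`, and the default 60-minute bin width are assumptions made for this example only.

```python
import math
from datetime import datetime, timedelta


def quality_metrics(timestamps, window_start, window_end, bin_minutes=60):
    """Return (n_observations, temporal_occupancy) for one device.

    Assumed definition (not taken from the study): temporal occupancy is
    the share of time bins in [window_start, window_end) that hold at
    least one observation.
    """
    bin_size = timedelta(minutes=bin_minutes)
    # Number of bins needed to cover the window, rounding up.
    total_bins = math.ceil((window_end - window_start) / bin_size)

    # Keep only records that fall inside the observation window.
    in_window = [t for t in timestamps if window_start <= t < window_end]

    # Index of the bin each record falls into; a set drops repeats.
    occupied = {(t - window_start) // bin_size for t in in_window}

    occupancy = len(occupied) / total_bins if total_bins else 0.0
    return len(in_window), occupancy


if __name__ == "__main__":
    start = datetime(2020, 1, 1)
    end = start + timedelta(hours=24)
    pings = [
        start + timedelta(minutes=5),
        start + timedelta(minutes=40),           # same hourly bin as above
        start + timedelta(hours=3, minutes=10),
        start + timedelta(hours=23, minutes=59),
    ]
    n_obs, occ = quality_metrics(pings, start, end)
    print(n_obs, occ)
```

Under this assumed definition, a device can have many observations but low occupancy if its records cluster in a few bins, which is one way a high sampling rate and low data quality could coexist.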