In the era of big data, the ubiquity of location-aware portable devices provides an unprecedented opportunity to understand inhabitants' behavior and their interactions with the built environments. Among the widely used data resources, mobile phone data is the one passively collected and has the largest coverage in the population. However, mobile operators cannot pinpoint one user within meters, leading to the difficulties in activity inference. To that end, we propose a data analysis framework to identify user's activity via coupling the mobile phone data with location-based social networks (LBSN) data. The two datasets are integrated into a Bayesian inference module, considering people's circadian rhythms in both time and space. Specifically, the framework considers the pattern of arrival time to each type of facility and the spatial distribution of facilities. The former can be observed from the LBSN Data and the latter is provided by the points of interest (POIs) dataset. Taking Shanghai as an example, we reconstruct the activity chains of 1,000,000 active mobile phone users and analyze the temporal and spatial characteristics of each activity type. We assess the results with some official surveys and a real-world check-in dataset collected in Shanghai, indicating that the proposed method can capture and analyze human activities effectively. Next, we cluster users' inferred activity chains with a topic model to understand the behavior of different groups of users. This data analysis framework provides an example of reconstructing and understanding the activity of the population at an urban scale with big data fusion.
翻译:暂无翻译