Socio-economic indicators provide context for assessing a country's overall condition. These indicators capture information about education, gender, poverty, employment, and other factors. Reliable and accurate information is therefore critical for social research and government policy-making. Most data sources available today, such as censuses, have sparse population coverage or are updated infrequently. However, alternative data sources, such as call data records (CDR) and mobile app usage, can serve as cost-effective and up-to-date sources for identifying socio-economic indicators. This work investigates the use of mobile app data to predict socio-economic features. We present a large-scale study using data that captures the traffic of thousands of mobile applications used by approximately 30 million users distributed over 550,000 km² and served by over 25,000 base stations. The dataset covers the entire territory of France and spans more than 2.5 months, from 16 March 2019 to 6 June 2019. Using app usage patterns, our best model estimates socio-economic indicators with an R-squared score of up to 0.66. Furthermore, through model explainability, we find that mobile app usage patterns have the potential to reveal socio-economic disparities across IRIS statistical units. The insights of this study suggest several avenues for future interventions, including temporal analysis of user networks to understand evolving network patterns and exploration of alternative data sources.