Mood inference with mobile sensing data has been studied in the ubicomp literature over the last decade. This inference enables context-aware and personalized user experiences in general mobile apps, and valuable feedback and interventions in mobile health apps. However, even though model generalization issues have been highlighted in many studies, the focus has always been on improving the accuracy of models using different sensing modalities and machine learning techniques, with datasets collected in homogeneous populations. In contrast, less attention has been given to assessing whether mood inference models generalize to new countries. In this study, we collected a mobile sensing dataset with 329K self-reports from 678 participants in eight countries (China, Denmark, India, Italy, Mexico, Mongolia, Paraguay, UK) to assess the effect of geographical diversity on mood inference models. We define and evaluate country-specific (trained and tested within a country), continent-specific (trained and tested within a continent), country-agnostic (tested on a country not seen in the training data), and multi-country (trained and tested with multiple countries) approaches trained on sensor data for two mood inference tasks with population-level (non-personalized) and hybrid (partially personalized) models. We show that partially personalized country-specific models perform best, yielding area under the receiver operating characteristic curve (AUROC) scores in the range 0.78-0.98 for two-class (negative vs. positive valence) and 0.76-0.94 for three-class (negative vs. neutral vs. positive valence) inference. Overall, we uncover issues in generalizing mood inference models to new countries and show how the geographical similarity of countries might impact mood inference.
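As an illustration of how the country-specific and country-agnostic evaluation setups differ, the following is a minimal sketch of the data splits only; it is not the study's actual pipeline. The record fields (`country`, `participant`, `valence`), the function names, the 25% hold-out fraction, and the toy records are all assumptions made for this example. The country-specific split here holds out whole participants, which corresponds to one plausible reading of the population-level (non-personalized) setting.

```python
from collections import defaultdict


def group_by(records, key):
    """Group a list of dict records by the value stored under `key`."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec)
    return dict(groups)


def country_agnostic_splits(records):
    """Yield (held_out_country, train, test).

    The test country never appears in the training data
    (a leave-one-country-out scheme, assumed here for illustration).
    """
    by_country = group_by(records, "country")
    for held_out in sorted(by_country):
        train = [r for r in records if r["country"] != held_out]
        yield held_out, train, by_country[held_out]


def country_specific_splits(records, test_fraction=0.25):
    """Yield (country, train, test), both drawn from the same country.

    Whole participants are held out, so train and test share no
    participant (hypothetical population-level setting).
    """
    for country, rows in sorted(group_by(records, "country").items()):
        participants = sorted({r["participant"] for r in rows})
        n_test = max(1, round(len(participants) * test_fraction))
        test_ids = set(participants[-n_test:])
        train = [r for r in rows if r["participant"] not in test_ids]
        test = [r for r in rows if r["participant"] in test_ids]
        yield country, train, test


# Toy, made-up records purely for illustration (not data from the study).
records = [
    {"country": "Italy", "participant": "it1", "valence": "positive"},
    {"country": "Italy", "participant": "it2", "valence": "negative"},
    {"country": "India", "participant": "in1", "valence": "positive"},
    {"country": "India", "participant": "in2", "valence": "negative"},
    {"country": "Mexico", "participant": "mx1", "valence": "positive"},
    {"country": "Mexico", "participant": "mx2", "valence": "negative"},
]

for held_out, train, test in country_agnostic_splits(records):
    print(held_out, len(train), len(test))
```

A multi-country split would pool all countries before holding out participants, and a hybrid (partially personalized) variant would move part of each test participant's data into the training set; both are left out of this sketch to keep it short.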