Various institutes produce large semantic datasets containing information on daily activities and human mobility. Analyzing and understanding such data is crucial for urban planning, socio-psychology, political science, and epidemiology. However, none of the typical data mining processes has been tailored to the thorough analysis of semantic mobility sequences, that is, to translating such data into understandable behaviors. Based on an extensive literature review, we propose a novel methodological pipeline called simba (Semantic Indicators for Mobility and Behavior Analysis) for mining and analyzing semantic mobility sequences to identify coherent information and human behaviors. We implement a framework for semantic mobility sequence analysis and clustering explainability that integrates complementary statistical indicators and visual tools. To validate this methodology, we used a large set of real daily mobility sequences obtained from a household travel survey. The proposed method automatically discovers complementary knowledge.