Crimes emerge from complex interactions between human behaviors and situations, and the linkages between crime incidents are correspondingly complex. Detecting crime linkage within a set of incidents is highly challenging because only limited information is available, namely free-text descriptions, incident times, and locations, and in practice very few labels exist. We propose a new statistical modeling framework for {\it spatio-temporal-textual} data and demonstrate its use in crime linkage detection. Inspired by the notion of {\it modus operandi} (M.O.) in crime analysis, we capture linkages between crime incidents via multivariate marked spatio-temporal Hawkes processes and treat embedding vectors of the free text as {\it marks} of the incidents. Numerical results on real data demonstrate the good performance of our method and reveal interesting patterns in the crime data: jointly modeling space, time, and text information improves crime linkage detection compared with the state of the art, and the spatial dependence learned from data can be useful for police operations.
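To make the idea of a marked spatio-temporal Hawkes process concrete, the following is a minimal sketch, not the model of the paper. It assumes an exponential temporal kernel, an isotropic Gaussian spatial kernel, and a clipped cosine similarity between text embeddings as the mark term; the parameter names `mu`, `alpha`, `beta`, `sigma` and the functions `trigger`, `intensity`, and `linkage_probs` are hypothetical and chosen only for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def trigger(later, earlier, alpha=0.5, beta=1.0, sigma=1.0):
    """Assumed triggering kernel: how strongly `earlier` excites `later`.

    Product of an exponential decay in time, a Gaussian in space, and a
    non-negative text similarity between the two marks. These functional
    forms are illustrative assumptions, not taken from the abstract.
    """
    dt = later["t"] - earlier["t"]
    if dt <= 0:  # only strictly earlier events can trigger later ones
        return 0.0
    dx = later["x"] - earlier["x"]
    dy = later["y"] - earlier["y"]
    temporal = beta * math.exp(-beta * dt)
    spatial = math.exp(-(dx * dx + dy * dy) / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)
    textual = max(0.0, cosine(later["mark"], earlier["mark"]))
    return alpha * temporal * spatial * textual

def intensity(event, history, mu=0.1, **kernel_params):
    """Conditional intensity at `event`: background rate plus triggered terms."""
    return mu + sum(trigger(event, h, **kernel_params) for h in history)

def linkage_probs(events, mu=0.1, **kernel_params):
    """Score each ordered pair (i, j) by the share of event i's intensity
    attributable to the earlier event j."""
    probs = {}
    for i, ev in enumerate(events):
        lam = intensity(ev, events, mu=mu, **kernel_params)
        for j, other in enumerate(events):
            g = trigger(ev, other, **kernel_params)
            if g > 0:
                probs[(i, j)] = g / lam
    return probs

# Toy incidents: two nearby events with similar text, one distant and dissimilar.
events = [
    {"t": 0.0, "x": 0.0, "y": 0.0, "mark": [1.0, 0.0]},
    {"t": 0.5, "x": 0.1, "y": 0.0, "mark": [0.9, 0.1]},
    {"t": 0.6, "x": 5.0, "y": 5.0, "mark": [0.0, 1.0]},
]
scores = linkage_probs(events)
```

In this toy setup, a pair of incidents receives a high linkage score only when the two are close in time, close in space, and described by similar text, which mirrors the joint use of the three information sources described above. In a real application the parameters would be estimated from data rather than fixed by hand.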