Capacitated spatial clustering, a type of unsupervised machine learning method, is often used to tackle problems in compression, classification, logistics optimization, and infrastructure optimization. Depending on the application at hand, a wide range of extensions to the clustering method may be necessary. In this article we propose a number of novel extensions to PACK, a novel capacitated spatial clustering method. These extensions are relocation and location preference of cluster centers, outliers, and non-spatial attributes. The strength of PACK is that it can consider all of these extensions jointly. We demonstrate the usefulness of PACK with a real-world example of edge computing server placement for a city region under various setups, in which we take into account outliers, center placement, and non-spatial attributes. The setups are evaluated with summary statistics on spatial proximity and attribute similarity. In the best case, the similarity of the clusters improved by 53%, while the proximity degraded by only 18%. In alternative scenarios, both proximity and similarity improved. The extensions proved to be a valuable way to include non-spatial information in the cluster analysis and to attain better overall proximity and similarity. Furthermore, we provide easy-to-use software tools (rpack) for conducting clustering analyses.