Algorithms implementing populations of agents that interact with one another and sense their environment may exhibit emergent behavior such as self-organization and swarm intelligence. Here, a swarm system called Databionic swarm (DBS) is introduced, which is able to adapt itself to structures of high-dimensional data characterized by distance- and/or density-based structures in the data space. By exploiting the interrelations of swarm intelligence, self-organization, and emergence, DBS serves as an alternative approach to the optimization of a global objective function in the task of clustering. The swarm omits the usage of a global objective function and is parameter-free because it searches for the Nash equilibrium during its annealing process. To our knowledge, DBS is the first swarm combining these approaches. Its clustering can outperform common clustering methods such as K-means, PAM, single linkage, spectral clustering, model-based clustering, and Ward if no prior knowledge about the data is available. A central problem in clustering is the correct estimation of the number of clusters. This is addressed by a DBS visualization called the topographic map, which allows assessing the number of clusters. It is known that all clustering algorithms construct clusters, irrespective of whether the data set contains clusters or not. In contrast to most other clustering algorithms, the topographic map identifies that clustering of the data is meaningless if the data contains no (natural) clusters. The performance of DBS is demonstrated on a set of benchmark data, which are constructed to pose difficult clustering problems, and in two real-world applications.