A Hybrid Chimp Optimization Algorithm and Generalized Normal Distribution Algorithm with Opposition-Based Learning Strategy for Solving Data Clustering Problems

16 February 2023

Sayed Pedram Haeri Boroujeni, Elnaz Pashaei

From arXiv; 48 pages, 14 tables, 12 figures

This paper addresses data clustering, which separates clusters according to the connectivity principle so that similar and dissimilar data fall into different groups. Although classical clustering algorithms such as K-means are efficient, they often become trapped in local optima and converge slowly on high-dimensional problems. To address these issues, many meta-heuristic optimization algorithms and intelligence-based methods have been introduced to reach the optimal solution in a reasonable time. They are designed to escape local optima by allowing flexible movements or random behaviors. In this study, we propose an approach built from three main components: the Chimp Optimization Algorithm (ChOA), the Generalized Normal Distribution Algorithm (GNDA), and the Opposition-Based Learning (OBL) method. First, two versions of ChOA with two different independent-group strategies and seven chaotic maps, called ChOA(I) and ChOA(II), are presented to achieve the best possible result for data clustering. Second, a novel combination of ChOA and GNDA with the OBL strategy is devised to overcome the major shortcomings of the original algorithms. Lastly, the proposed ChOAGNDA method is a Selective Opposition (SO) algorithm based on ChOA and GNDA, which can be used to tackle large and complex real-world optimization problems, particularly data clustering applications. The results are evaluated against seven popular meta-heuristic optimization algorithms and eight recent state-of-the-art clustering techniques. Experimental results show that the proposed method significantly outperforms existing methods in minimizing the Sum of Intra-Cluster Distances (SICD), obtaining the lowest Error Rate (ER), accelerating convergence, and finding the optimal cluster centers.
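
The abstract names two ingredients that are easy to illustrate in isolation: the SICD objective and opposition-based learning. The Python sketch below is not the authors' ChOAGNDA implementation. It assumes the commonly used definition of SICD (each point contributes its Euclidean distance to the nearest cluster center) and the basic OBL reflection `lower + upper - x`, not the Selective Opposition variant the paper describes. The function names `sicd`, `decode`, `opposite`, and `obl_step`, and the flat encoding of cluster centers, are hypothetical choices made for this example.

```python
import math


def sicd(points, centers):
    """Sum of intra-cluster distances (assumed definition): every point
    adds its Euclidean distance to the nearest center."""
    return sum(min(math.dist(p, c) for c in centers) for p in points)


def decode(vector, dim):
    """Split a flat candidate vector into cluster centers of length `dim`
    (hypothetical encoding for this sketch)."""
    return [tuple(vector[i:i + dim]) for i in range(0, len(vector), dim)]


def opposite(solution, lower, upper):
    """Basic opposition-based learning: reflect each coordinate inside
    its bounds, x_opp = lower + upper - x."""
    return [lo + hi - x for x, lo, hi in zip(solution, lower, upper)]


def obl_step(candidate, lower, upper, fitness):
    """Return whichever of the candidate and its opposite has the lower
    fitness (ties keep the candidate)."""
    opp = opposite(candidate, lower, upper)
    return opp if fitness(opp) < fitness(candidate) else candidate


if __name__ == "__main__":
    # Toy 2-D data: three points near the origin, one far away.
    points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0)]
    lower = [0.0, 0.0, 0.0, 0.0]      # per-coordinate lower bounds
    upper = [10.0, 10.0, 10.0, 10.0]  # per-coordinate upper bounds

    def fitness(vector):
        return sicd(points, decode(vector, dim=2))

    # Two centers, both placed near the far point: a poor starting guess.
    candidate = [9.0, 9.0, 10.0, 10.0]
    best = obl_step(candidate, lower, upper, fitness)
    print("kept:", best, "SICD:", round(fitness(best), 3))
```

In this toy run the opposite candidate places both centers near the dense group of points and so has the lower SICD, which is the kind of cheap extra evaluation OBL adds to a population-based search. A metaheuristic such as ChOA or GNDA would apply a step like this to each member of the population, which this sketch does not attempt.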