The Artificial Bee Colony (ABC) algorithm, one of the most recently introduced optimization algorithms, simulates the intelligent foraging behavior of a honey bee swarm. Cluster analysis, used in many disciplines and applications, is an important descriptive tool that seeks to identify homogeneous groups of objects based on the values of their attributes. In this work, ABC is used for data clustering on benchmark problems, and its performance is compared with that of the Particle Swarm Optimization (PSO) algorithm and nine other classification techniques from the literature. Thirteen typical test data sets from the UCI Machine Learning Repository are used to demonstrate the results of the techniques. The simulation results indicate that the ABC algorithm can be used efficiently for multivariate data clustering.
Clustering is an important tool for a variety of applications in data mining, statistical data analysis, data compression, and vector quantization. It aims to gather data into clusters (or groups) such that the data in each cluster share a high degree of similarity while being very dissimilar to data from other clusters [1], [2] and [3]. In other words, the goal is to maximize the similarity among members of the same cluster while minimizing the similarity among members of different clusters.
Clustering algorithms are generally classified as hierarchical or partitional [3], [4] and [5]. Hierarchical clustering groups data objects through a sequence of partitions, either from singleton clusters to a single cluster containing all individuals or vice versa. Hierarchical procedures can be either agglomerative or divisive: agglomerative algorithms begin with each element as a separate cluster and merge them into successively larger clusters; divisive algorithms begin with the whole set and divide it into successively smaller clusters [6] and [7]. Partitional procedures, with which we are concerned in this paper, attempt to divide the data set into a set of disjoint clusters without a hierarchical structure. The most popular partitional clustering algorithms are the prototype-based clustering algorithms, in which each cluster is represented by its center and the objective function (a square-error function) is the sum of the distances from the patterns to their cluster centers [8].
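The square-error objective described above can be illustrated with a short sketch. It assumes squared Euclidean distance and assigns each pattern to its nearest center; the function names and the four-point data set are hypothetical and are not taken from the experiments in this paper.

```python
from math import dist


def nearest_center(point, centers):
    """Return the index of the center closest to `point` (Euclidean distance)."""
    return min(range(len(centers)), key=lambda j: dist(point, centers[j]))


def sum_squared_error(points, centers):
    """Sum of squared Euclidean distances from each point to its nearest center."""
    return sum(dist(p, centers[nearest_center(p, centers)]) ** 2 for p in points)


# Hypothetical toy data: two tight pairs of points that lie far apart.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good_centers = [(0.0, 1.0), (10.0, 1.0)]  # one center per pair
poor_centers = [(5.0, 0.0), (5.0, 2.0)]   # centers placed between the pairs

print(sum_squared_error(points, good_centers))
print(sum_squared_error(points, poor_centers))
```

A lower value indicates more compact clusters, so a partitional method searches for the set of centers that minimizes this quantity.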
The most popular clustering algorithm is the K-means algorithm, which is a simple and fast center-based algorithm [9]. However, the K-means algorithm depends heavily on its initial state and always converges to the local optimum nearest to the starting position of the search. To overcome the local optima problem, researchers from diverse fields have applied hierarchical, partition-based, density-based, and artificial intelligence based clustering methods that draw on statistics [10], graph theory [11], expectation-maximization algorithms [12], artificial neural networks [13], [14], [15] and [16], evolutionary algorithms [17] and [18], swarm intelligence algorithms [19], [20], [21], [22], [23] and [24], and so on.
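The dependence of K-means on its starting position can be seen in a small example. The following is a minimal sketch of the standard (Lloyd-style) K-means iteration, not the implementation used in the cited works; the function name `kmeans`, the toy data and both initializations are assumptions made for illustration.

```python
from math import dist


def kmeans(points, centers, max_iter=100):
    """Plain K-means: alternate nearest-center assignment and mean update."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            clusters[j].append(p)
        # Update step: move each center to the mean of its members
        # (an empty cluster keeps its previous center).
        new_centers = [
            tuple(sum(coord) / len(members) for coord in zip(*members)) if members else old
            for members, old in zip(clusters, centers)
        ]
        if new_centers == centers:  # no center moved, so the search has converged
            break
        centers = new_centers
    sse = sum(min(dist(p, c) for c in centers) ** 2 for p in points)
    return centers, sse


# Hypothetical toy data: two tight pairs of points that lie far apart.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

centers_a, sse_a = kmeans(points, [(0.0, 0.0), (10.0, 0.0)])  # one start per pair
centers_b, sse_b = kmeans(points, [(5.0, 0.0), (5.0, 2.0)])   # starts between the pairs

print(centers_a, sse_a)
print(centers_b, sse_b)
```

With the first initialization the centers settle on the two pairs, while with the second they stop immediately at a much worse partition, because no single update step can improve it. This is the behavior that motivates population-based search methods for the clustering problem.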
In this paper, the Artificial Bee Colony (ABC) optimization algorithm, which was introduced by Karaboga for numerical optimization problems and is based on the foraging behavior of honey bees [25], is applied to classification benchmark problems (13 typical test databases). The performance of the ABC algorithm on clustering is compared with the results of the Particle Swarm Optimization (PSO) algorithm on the same data sets, as presented in [26]. ABC and PSO fall into the same class of artificial intelligence optimization algorithms, the population-based algorithms, and both are inspired by swarm intelligence. Besides the comparison with PSO, the performance of the ABC algorithm is also compared with a wide set of classification techniques that are also given in [26]. The rest of the paper is organized as follows: the clustering problem is described in Section 2, the implementation of the ABC algorithm is introduced in Section 3, and the experiments and results are presented and discussed in Section 4. We conclude the paper in Section 5 by summarizing the observations and remarking on future work.
In this work, the Artificial Bee Colony algorithm, a new, simple and robust optimization technique, is used to cluster benchmark classification problems for the purpose of classification. Clustering is an important classification technique that gathers data into classes (or clusters) such that the data in each cluster share a high degree of similarity while being very dissimilar to data from other clusters. The performance of the ABC algorithm is compared with that of the Particle Swarm Optimization algorithm and nine other techniques that are widely used by researchers. The results of the experiments show that the Artificial Bee Colony algorithm can be applied successfully to clustering for the purpose of classification. Several issues remain for future study, such as applying other algorithms to clustering and comparing their results with those of the ABC algorithm.