AN EFFECTIVE BRKGA ALGORITHM FOR THE 𝑘-MEDOIDS CLUSTERING PROBLEM

This paper presents a biased random-key genetic algorithm (BRKGA) for the k-medoids clustering problem. A novel heuristic crossover operator was implemented and combined with a parallelized local search procedure. Experiments carried out with fifty literature data sets of small, medium, and large sizes, considering several numbers of clusters, showed that the proposed algorithm outperformed eight other algorithms, among them the classic PAM and CLARA algorithms. Furthermore, by comparing with the results of an integer linear programming formulation, we found that our algorithm obtained the global optimal solution in most cases and, despite its stochastic nature, was stable both in the quality of the solutions obtained and in the number of generations required to produce them. In addition, a relative validation index (average silhouette) was applied to the clusterings produced by the algorithms, and again our method performed well, producing clusters with good structure.


Introduction
Clustering Analysis (CA) is a tool commonly used in a wide range of applications [12,13], including Data Reduction, Hypothesis Generation, Business Applications and Market Research, Biology and Bioinformatics, and web mining. According to [13], CA is a multivariate analysis technique comprising a set of algorithms applied to form clusters from a data set of n objects with p variables, aiming to produce clusters with a high degree of similarity between objects in the same cluster (cohesion) and a low degree of similarity among objects in different clusters (separation) [16].
According to [17], to define the clusters and evaluate the quality of the solutions obtained, an objective function is used as a criterion, based on a distance metric such as the Euclidean distance. The classic clustering problem (CP) is NP-Hard, and obtaining the global optimal solution is a highly complex computational task [10,17].
Due to the complexity of the CP and its varied applications, several algorithms have been developed in recent decades [1,26,30,37,44,51,52,56]. In particular, there is a set of general-purpose heuristic algorithms in the literature, divided into two main categories: non-hierarchical and hierarchical [13,19]. Additionally, several mathematical programming formulations for clustering problems are presented in [17,28,29,34].
Among non-hierarchical approaches, there are two classical families of algorithms based on the prototype model: k-means/k-medoids [13] and PAM (Partitioning Around Medoids) [20]. According to [15,20], medoids correspond to the k most representative objects of the given set. Besides, medoid-based algorithms tend to produce higher-quality clusters, are more robust to the presence of outliers or noise, and can be used on databases whose objects have quantitative and qualitative attributes.
The goal of this article is to tackle the k-medoids clustering problem. To this end, we propose a heuristic algorithm that combines concepts of the BRKGA metaheuristic [11,24] with a new crossover operator and a local search procedure. The computational experiments were carried out by applying classic algorithms and the proposed BRKGA-based algorithm to fifty literature data sets, considering six scenarios regarding the number of clusters, k ∈ {2, 3, 4, 5, 6, 7}. In these experiments, the proposed algorithm produced better solutions than various algorithms from the literature, such as the classic PAM and CLARA algorithms. In particular, it produced a high percentage of global optimal solutions. The main contributions of this paper are as follows:
- A new algorithm based on biased random-key genetic algorithms (BRKGA) is proposed to solve the k-medoids clustering problem, providing a new option for solving this hard clustering problem using BRKGA concepts.
- A novel heuristic crossover operator was implemented and combined with a parallelized local search procedure.
- The scalability and stability of the proposed algorithm were observed in experiments carried out with fifty literature data sets, considering different numbers of clusters.
- Solutions were analyzed using statistical measures and a relative validation index (silhouette).
The outline of this paper is as follows: Section 2 describes the k-medoids clustering problem. Section 3 reviews the most relevant papers associated with this problem in the literature. Section 4 presents a description of the BRKGA metaheuristic and the details of the proposed algorithm, the BRKGA k-medoids clustering algorithm (BRKGAMED). Section 5 describes the data sets used in the computational experiments reported in this work, in addition to a discussion on the calibration of the parameters used in BRKGAMED. Section 6 presents results and analyses from applying the proposed algorithm, showing its effectiveness in comparison with eight related k-medoids clustering algorithms, in particular the PAM and CLARA heuristics, and an integer programming formulation for the k-medoids clustering problem. Finally, Section 7 contains our summary and discussion.

𝑘-medoids clustering problem
Consider a set X formed by n objects, X = {x_1, ..., x_i, ..., x_n}, such that each x_i is defined by a vector x_i = (x_i1, x_i2, ..., x_ip) with p variables. From X, k objects are selected as medoids and used to form k clusters, denoted C_1, C_2, ..., C_k. The medoids are represented by a set M = {m_1, ..., m_j, ..., m_k} (M ⊂ {1, ..., n}), each element m_j corresponding to the index of the object selected as the medoid of the respective cluster C_j. Additionally, set M is defined so that the sum of the distances of each of the remaining (n − k) objects of X to its nearest medoid is minimum, which is equivalent to minimizing the following objective function:

f(M) = Σ_{i=1}^{n} min_{m_j ∈ M} d(x_i, x_{m_j}),    (2.1)

where d(·,·) is the adopted distance (e.g., Euclidean). According to [25], the k-medoids clustering problem is NP-Hard. This characteristic motivates adopting heuristic algorithms that, although they do not guarantee the global optimum, tend to produce solutions corresponding to local optima of reasonable quality, demanding low computational time [38]. The application of a brute-force algorithm, which ensures the global optimum, is infeasible due to the size of the solution space S of this problem, that is, the total number of subsets of k medoids, given by:

|S| = C(n, k) = n! / (k!(n − k)!).

For k ≪ n this expression grows explosively with n, in general even faster than a higher-order polynomial. For example, for n = 200 and k = 5, the number of solutions is on the order of 10^9.
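The objective (2.1) and the size of the solution space can be illustrated with a short sketch. This is Python for illustration only (the paper's implementation is in R); the Euclidean distance, the toy data, and the function names are our own assumptions:

```python
import math
from itertools import combinations

def euclidean(a, b):
    # Euclidean distance between two objects given as coordinate tuples
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def kmedoids_objective(X, medoids):
    # Eq. (2.1): sum of distances from every object to its nearest medoid;
    # `medoids` holds indices into X
    return sum(min(euclidean(x, X[m]) for m in medoids) for x in X)

def brute_force(X, k):
    # Exhaustive search over all C(n, k) medoid sets -- viable only for tiny n
    return min(combinations(range(len(X)), k),
               key=lambda M: kmedoids_objective(X, M))

# The solution space grows explosively: for n = 200 and k = 5
print(math.comb(200, 5))  # 2535650040, on the order of 10^9
```

The brute-force search enumerates C(n, k) candidate sets, which is exactly why the paper resorts to heuristics for realistic instances.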
Additionally, this problem can be formulated as an integer linear programming problem and solved by applying exact algorithms such as branch and cut or branch and bound [50]. However, due to the number of variables (n² + n) and constraints (n² + n + 1), the resolution might require significant computational time, producing, in many cases, only a local optimum or a feasible solution within this time range.

min Σ_{i=1}^{n} Σ_{j=1}^{n} d(x_i, x_j) z_ij    (2.3)
s.t. Σ_{j=1}^{n} z_ij = 1,  i = 1, ..., n,    (2.4)
     z_ij ≤ y_j,  i, j = 1, ..., n,    (2.5)
     Σ_{j=1}^{n} y_j = k,    (2.6)
     y_j, z_ij ∈ {0, 1},  i, j = 1, ..., n.    (2.7)

In the formulation above, first proposed in [48], y_j is a binary variable that assumes the value 1 if object j (j = 1, ..., n) is defined as a medoid, and zero otherwise; z_ij is also a 0-1 variable that assumes the value 1 if object i is allocated to the cluster defined by medoid j. The objective function (2.3) minimizes the total distance of the objects to their medoids. Constraint (2.4) ensures that each object i is associated with a single medoid. Constraint (2.5) ensures that an object i can only be associated with object j if the latter is defined as a medoid. Constraint (2.6) ensures that the number of medoids of the partition is k. In (2.7) we have the integrality constraints.
Additionally, it is important to emphasize that the k-medoids problem is similar to the k-median problem [5,22], an important optimization problem also classified as NP-Hard. Initially addressed by Hakimi [14], it corresponds to a classical location problem with several real applications. In the k-median problem, one must determine k facilities (usually called medians) among a set of n candidates to satisfy a specific demand associated with a set of clients while minimizing transport costs and other logistical restrictions. When geographic data are grouped by applying an algorithm to solve the k-medoids problem, this corresponds to solving a basic k-median problem, where the only objective is to minimize the distances between objects, without considering other logistic constraints such as, for example, capacity.

Related works
A well-known and commonly used algorithm for this problem is PAM (Partitioning Around Medoids), proposed in [20]. It determines the k medoids by applying two procedures called Build and Swap. The same authors also proposed another algorithm, called CLARA (Clustering Large Applications), which combines a simple random sampling procedure with the PAM algorithm. In addition to these two algorithms, the following references cover the most relevant works in the literature that propose more sophisticated algorithms, some focused on efficiency (speed) and others on effectiveness (quality of solutions). Examples include the grouping genetic algorithm (GGA) proposed in [8] and the article by Han and Ng [15], which proposes a modified version of the CLARA algorithm called CLARANS.
In [47], medoids are defined according to silhouette maximization. To reduce the computational time required by the PAM algorithm while producing good quality solutions, an algorithm called CLATIN (Clustering Large Applications with Triangular Irregular Network) was proposed in [55]; it uses the concept of a triangular irregular network in the swap procedure of PAM. In [4], the PAM algorithm is revisited and improvements in its swap procedure are proposed. In [32], a fast algorithm that uses the k-means algorithm to define the initial medoids is presented. In [27], Nascimento et al. present a Lagrangian heuristic for the k-medoids problem.
In [54], the similarity between objects, given by the Euclidean distance, is not used directly but rather to order them: each pair of objects is assigned an integer rank representing the order of similarity between them. In each iteration, the medoids are updated to the most dissimilar object in relation to the other objects in the cluster and, once the maximum number of iterations has been reached, each object is allocated to the cluster with the most similar medoid. According to the authors, this strategy can find all Gaussian-shaped clusters.
In a more recent study, Yu et al. [53] proposed an algorithm that uses a variance measure to determine medoids and focuses on efficiency. In [41], a novel parallel k-medoids algorithm, called PAMAE, was proposed; it can be applied to large data sets and achieves both good accuracy and efficiency. In [35,36], faster versions of the PAM, CLARA, and CLARANS algorithms are proposed, based on an improvement of the swap procedure used in PAM.
In [45,46], a parallel heuristic was proposed for a k-medoids clustering problem with a variable number of clusters, providing a dual bound for the objective value and thus allowing one to ascertain the optimality of a solution found. In [43], a novel fuzzy kernel k-medoids clustering algorithm was proposed for uncertain objects, which works well on data sets with arbitrarily shaped clusters. In [49], the authors use an efficient method that combines the PAM and CLARA algorithms for image segmentation. In [33], Punhani et al. consider a k-prototype algorithm to generate results such as which product is popular among customers and generates more revenue in a particular region. In [6], the authors proposed an algorithm to minimize the number of iterations in k-medoids clustering, where the medoid values were determined by the purity value and cluster validity was measured with the Davies-Bouldin index.
In addition to these approaches, there are works based on the application of metaheuristics, such as the genetic algorithm proposed by Lucasius et al. [23] for large data sets. In [39], a hybrid genetic algorithm called HKA was proposed, which combines a crossover operator with a local search procedure based on the k-means algorithm. In a correlated study, [40] proposed a variant of this genetic algorithm that solves the k-medoids problem without considering a fixed k value, using a combination of a crossover operator with the Davies-Bouldin index. In [18], a hybrid algorithm is proposed that uses the CRO (Chemical Reaction Optimization) algorithm, applied to expand the search for the optimal medoids.

BRKGA metaheuristic and proposed algorithm
The biased random-key genetic algorithm (BRKGA) [11,24] is a metaheuristic that has been applied to several optimization problems [2,7,9,21,30]. In a BRKGA, the population is composed of p chromosomes, which correspond to random-key vectors of real values generated according to the uniform distribution on [0, 1]. In each generation, a procedure called the decoder, a selection procedure corresponding to an elitism strategy, and crossover and mutation operators are applied to the current population. The decoder is responsible for transforming each random-key vector into a feasible solution to the optimization problem. After applying the decoder, the value of the objective function is calculated for the p feasible solutions, which are then ordered according to this value (in ascending order, in the case of a minimization problem). In order to apply the selection and crossover operators, the random-key vectors of the current population are divided into two sets: an elite set P_e containing the p_e vectors corresponding to the best feasible solutions (according to the objective function values and the corresponding ordering of the solutions), and a non-elite set P_NE containing the p − p_e remaining vectors. The elitism strategy consists of copying the vectors of P_e to the population of the next generation; the other vectors of the next population are then obtained by applying the crossover and mutation operators.
With regard to mutation, p_m random-key vectors are generated analogously to the first generation and inserted into the next population. Finally, to complete the population of the next generation, p − p_e − p_m random-key vectors are produced by applying the parameterized uniform crossover proposed by Spears and De Jong [42]. Each crossover execution uses a vector of P_e, a vector of P_NE, and a crossover probability. Figure 1 illustrates the transition between two consecutive generations of the BRKGA.
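The generational transition just described can be sketched as follows. This is a minimal Python illustration, not the authors' implementation; the fitness function, parameter values, and function names are our own assumptions:

```python
import random

def evolve(population, fitness, p_e, p_m, rho):
    """One BRKGA generation: elitism, mutants, and biased uniform crossover."""
    ranked = sorted(population, key=fitness)       # ascending: minimization
    elite, non_elite = ranked[:p_e], ranked[p_e:]
    n = len(population[0])
    # mutants: fresh random-key vectors, generated like the first generation
    mutants = [[random.random() for _ in range(n)] for _ in range(p_m)]
    offspring = []
    for _ in range(len(population) - p_e - p_m):
        e, ne = random.choice(elite), random.choice(non_elite)
        # each key is inherited from the elite parent with probability rho
        offspring.append([e[i] if random.random() < rho else ne[i]
                          for i in range(n)])
    return elite + mutants + offspring             # next-generation population
```

Note that the elite vectors are copied unchanged, so the best solution found so far always survives to the next generation.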

BRKGA algorithm for the 𝑘-medoids clustering problem
The proposed algorithm, called BRKGAMED, uses BRKGA metaheuristic concepts but differs in the representation and generation of the population's chromosomes and in the use of a new crossover operator. Regarding the representation of chromosomes, p vectors v are generated to compose the first generation (initial population), each vector v defined by k values randomly selected between 1 and n (the number of objects in the data set). These values correspond to the medoids of set M, defined in Section 2, and are used to allocate the other (n − k) objects of data set X to their nearest medoid.
Once the allocation has been made, the objective function of equation (2.1) is calculated; the elite set P_e is then defined as the p_e vectors associated with the medoids that produce the lowest values of the objective function, and the remaining (p − p_e) vectors form set P_NE.
As in a standard BRKGA, the vectors of set P_e are copied to the population of the next generation, and the (p − p_e) remaining vectors are obtained from the application of the crossover and mutation operators. With respect to mutation, p_m vectors are produced analogously to the generation of the initial population. After crossover, the new population consists of the p_e vectors of set P_e and the (p − p_e) vectors produced by applying the mutation and crossover operators described above. Algorithm 1 shows the pseudo-code of the BRKGAMED algorithm. Algorithm 2 and Table 1 present, respectively, the pseudo-code of the crossover operator, which also plays the role of a local search with a best-improvement strategy, and an example of its application. In Table 1, the vector v_E corresponds to a set of k medoids derived from P_e, and vector v_N corresponds to a set of k medoids derived from P_NE. Table 1 shows an example of the application (lines 2-6) of the crossover described in Algorithm 2, considering k = 3 and the vectors v_E = (2, 5, 10) and v_N = (1, 3, 8).
Algorithm 1: Pseudo-code of the BRKGAMED algorithm.

Upon analysis of the first loop (line 2 - combination of medoids) of Algorithm 2, each of the k elements of v_E is combined with all subsets of v_N formed by (k − 1) elements (there are C(k, k−1) = k such subsets), and vice versa. Therefore, each execution of this procedure produces 2k² (i.e., 2 × k × C(k, k−1)) new chromosomes, of which the one (vector v*) with the lowest associated value of the objective function (lines 13-15) is selected.
In order to determine the best set of medoids among the 2k² candidate sets, that is, the vector v* corresponding to the lowest value of the objective function, the medoids of each candidate are first assigned to vector v_f; then the remaining (n − k) objects are allocated to the nearest medoid of v_f, thus defining the k clusters C_1, ..., C_j, ..., C_k (lines 6-8).
Then, for each cluster C_j (j = 1, ..., k), the object x_r ∈ C_j with the lowest sum of distances to the other objects of C_j is determined. If r differs from the current medoid v_f[j], then r becomes the new medoid of cluster C_j and the j-th position of v_f is updated (lines 9-11). This implies testing |C_j| − 1 objects per cluster as possible medoids.
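A compact sketch of this crossover-with-local-search idea follows. It is written in Python for illustration (the paper's implementation is in R), and all function and variable names here are our own; duplicate-medoid candidates are simply skipped, an assumption about how invalid combinations are handled:

```python
def crossover_local_search(v_e, v_n, X, dist):
    """Combine each medoid of one parent with the (k-1)-element subsets of the
    other (up to 2*k*k candidate sets), apply one best-improvement pass per
    candidate, and return the candidate with the lowest objective value."""
    k = len(v_e)
    candidates = []
    for a, b in ((v_e, v_n), (v_n, v_e)):
        for m in a:                      # one medoid taken from the first parent
            for j in range(k):           # position dropped from the other parent
                cand = [m] + [b[t] for t in range(k) if t != j]
                if len(set(cand)) == k:  # keep only valid medoid sets
                    candidates.append(cand)
    best, f_best = None, float("inf")
    for cand in candidates:
        # allocate every object to its nearest medoid
        clusters = {m: [] for m in cand}
        for i in range(len(X)):
            clusters[min(cand, key=lambda m: dist(X[i], X[m]))].append(i)
        # local search: re-centre each cluster on its best medoid
        new_meds = [min(members,
                        key=lambda r: sum(dist(X[r], X[s]) for s in members))
                    for members in clusters.values()]
        f = sum(min(dist(X[i], X[m]) for m in new_meds) for i in range(len(X)))
        if f < f_best:
            best, f_best = new_meds, f
    return best, f_best
```

In BRKGAMED this evaluation of the 2k² candidates is the step that was parallelized, since each candidate set can be decoded and improved independently.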

Data sets and parameters calibration
Experiments were carried out with 50 literature data sets to evaluate the performance of BRKGAMED against algorithms from the literature and the formulation described in Section 2. In these data sets, the number of objects (n) ranges between 49 and 5000, and the number of variables (p) ranges between 1 and 1213, as shown in Table 2. For the purposes of comparability and reproducibility of the experiments, the R function that implements BRKGAMED and all data sets are available at github.com/jambrito/BRKGAMED. A fundamental issue for any metaheuristic to perform reasonably concerns determining the values of its parameters. In the case of BRKGAMED, the parameter values were defined using as reference the recommendations made in [11] and a preliminary calibration experiment using the parameters and values in Table 3. More specifically, five data sets were selected (marked with an asterisk in Table 2) out of the 50 data sets available; BRKGAMED was then applied 10 times to each data set for k ∈ {3, 4, 5}, considering 972 combinations of the parameters p, g, p_e, p_m, ρ, and g_ni, accounting for 145 800 executions (number of data sets × k values × combinations of parameters × 10).
Considering each combination of the six parameters above, data sets, and k values, the average of the objective function values (Eq. (2.1)) obtained in the 10 executions was calculated. Then, taking as the best combination the one corresponding to the largest number of solutions with the lowest average values (over all data sets and k values), we have: p = 50, g = 50, p_e = 0.2p = 10, p_m = 0.7p = 35, ρ = 0.85, and g_ni = 0.35.

Experiments
This section presents results related to the application of BRKGAMED, the formulation described in Section 2, and eight algorithms from the literature: PAM and CLARA, proposed in [20]; FASTPAM, FASTCLARA, and FASTCLARANS, proposed in [35]; the HKA algorithm, proposed in [39]; the PARK algorithm, proposed in [32]; and the RANK algorithm, proposed in [54]. The authors implemented the BRKGAMED and HKA algorithms in the R programming language; the other algorithms are available as functions in three R packages, as shown in Table 4. The formulation was implemented using the GUROBI solver (version 9.5.1), available in the gurobi package in R. All experiments involving the nine algorithms and the formulation were carried out on a computer with 16 GB of RAM and an AMD FX-6300 six-core 3.50 GHz processor. Taking advantage of the multicore architecture of this computer, combined with the parallel package available in R, which provides functions for implementing parallelism, the crossover operator of BRKGAMED was parallelized. Two experiments were carried out to present the results of the algorithm. The first, presented in Section 6.1, evaluates the effectiveness of the algorithm in achieving reasonable quality solutions for the 50 data sets and different k values. The second, presented in Section 6.2, evaluates the stability of BRKGAMED over repeated executions of the algorithm on a subset of the data sets.

Experiment I -Analysis of the performance of algorithms
In this experiment, BRKGAMED and the eight algorithms from the literature were applied to the fifty literature data sets, considering k ranging from 2 to 7 (300 solutions for each algorithm). The formulation (GUROBI solver) was applied with a maximum running time of 3 h, except for the WAVEFORM21 data set (5000 objects), for which the solver presented an error.
As for the choice of k, given the number of data sets involved in the experiment, a detailed analysis for each case would not be feasible. An alternative would be to adopt a common practice in the clustering literature, in which k ∈ {2, ..., ⌈√n⌉} (see [3,31]), where n is the number of objects in the data set. However, the experiment sought to compare the performance of the methods, regarding the stability and quality of the solutions, in terms of the silhouette index. Since data sets with between 49 and 5000 objects were considered, adopting an upper bound for k as a function of n would not favor this objective, as there would be no comparability between all the solutions produced. The upper bound adopted in the experiment therefore considered the smallest value of n among the data sets used, that is, k ∈ {2, ..., ⌈√49⌉}: for two distinct data sets A and B with numbers of objects n_A < n_B, k ∈ {2, ..., ⌈√n_A⌉} meets the upper bound suggested in the literature for both data sets.
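The adopted upper bound for k can be checked directly (a trivial illustration; the helper name is our own):

```python
import math

def k_upper_bound(n):
    """Common literature rule: k ranges in {2, ..., ceil(sqrt(n))}."""
    return math.ceil(math.sqrt(n))

print(k_upper_bound(49))    # 7, the bound used for every data set here
print(k_upper_bound(5000))  # 71, what the rule would give for WAVEFORM21
```

Using the smallest data set (n = 49) yields the common range k ∈ {2, ..., 7} applied to all fifty data sets.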
The same k range and the maximum running time of 3 h were used for the GUROBI solver. The formulation was applied to 49 data sets; for the WAVEFORM21 data set, consisting of 5000 objects, the solver presented an out-of-memory error when running the model. The parameters used in BRKGAMED were defined in the calibration experiment described in Section 5. For the HKA algorithm, the parameter values defined in [39] were adopted. For the algorithms from the literature, the default parameter values of the functions presented in Table 4 were considered.
This experiment made it possible to obtain the objective function values (Eq. (2.1)), the processing times, and the allocation of the objects to their respective clusters. The objective function values were used to determine, by number of clusters, the following results: (i) the percentage of global optimal solutions produced by the algorithms, based on the total of global optimal solutions produced by the formulation within the maximum time of 3 h; (ii) the percentage of best solutions, i.e., the best solution produced considering the nine algorithms and the formulation, not necessarily corresponding to a global optimum; and (iii) summary statistics calculated from the relative gaps (Eq. (6.1)), obtained from the difference between the best solution (f_best) and the solution produced by each algorithm and the formulation (f_A), for the fifty data sets (for data set WAVEFORM21, the best solution was considered to be that associated with at least one of the nine algorithms) and k ∈ {2, 3, 4, 5, 6, 7}:

gap = 100 × (f_A − f_best)/f_best.    (6.1)

Upon analysis of Table 5, it is possible to verify the efficacy of BRKGAMED against the other algorithms, considering the percentage of global optimal solutions produced. For k = 3, BRKGAMED achieved the optimum in 100% of the cases. Moreover, the lowest percentage for this algorithm, approximately 88%, occurred for k = 6. Additionally, the PAM, PAMF, and HKA algorithms presented the percentages of global optima closest to BRKGAMED's. The most favorable scenario for these three algorithms occurred for k = 2, when the differences were, in percentage points, respectively, 2.1% (HKA) and 14.6% (PAM and PAMF). For k ∈ {3, 4, 5, 6, 7}, such differences varied between 29.2% (HKA, k = 3) and 72.9% (HKA, k = 7).
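Equation (6.1) translates into a one-line computation (illustrative values only, not the paper's data):

```python
def relative_gap(f_a, f_best):
    """Relative gap (Eq. 6.1), in percent, of a solution against the best one."""
    return 100 * (f_a - f_best) / f_best

# e.g. an algorithm whose objective value is 105 against a best of 100
print(relative_gap(105.0, 100.0))  # 5.0
```

A gap of 0% therefore means the algorithm matched the best known solution for that data set and k.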
The worst results are associated with the CLARAF and CLARANSF algorithms, with percentages of global optima below 7%, reaching 0% for k = 6 and k = 7. The RANK algorithm failed to produce the global optimum for any data set or number of clusters.
Upon analysis of Table 6, associated with the gaps between the solutions (except for the maximum gap), the BRKGAMED algorithm generally produced gap values of 0%. In this table, cells highlighted with shades of gray (from lightest to darkest) and in italics/underline correspond, respectively, to gap values with mean, median (Q2), and 3rd quartile (Q3) within the following ranges: [0, 0.1%], (0.1%, 0.5%], (0.5%, 1.0%], and (1.0%, 5.0%]. In particular, regarding the mean gap, the worst BRKGAMED result was 0.5%, for k = 6. Based on such cells, the PAM and PAMF algorithms present the gaps closest to BRKGAMED's (less than 1.0%), followed by the HKA algorithm, with gaps of up to 1.0% for k between 2 and 5, and of up to 2.0% for k = 6 and 7. The CLARA, CLARANSF, PARK, and RANK algorithms have the largest gaps. In particular, the RANK algorithm presented the worst results regardless of the number of clusters, with gaps varying between 29% and 47% in the mean and between 20.6% and 26.9% in the median.
To conclude the analyses of the solutions, Figure 2 shows the percentages of best solutions produced by each algorithm versus the number of clusters, where, once again, the BRKGAMED algorithm significantly outperformed the others, with percentages between 88% and 100%, followed by the PAM, PAMF, and HKA algorithms. Complementing these results, a comparative analysis was performed of the processing times required by BRKGAMED, HKA, and the mathematical formulation. It is noteworthy that both algorithms are evolutionary [24]: they work with populations (sets of solutions) and with combination, mutation, and elitism operators. Such an approach is computationally intensive and requires more processing time in the search for good quality solutions.
The other algorithms considered, such as PAM and CLARA, are fast (on the order of up to 5 s per data set), although they produced a smaller number of global optimal and best solutions than BRKGAMED, as shown in Table 5 and Figure 2. Table 7 shows the mean and median processing times (in seconds) required by BRKGAMED, HKA, and the formulation. In general, BRKGAMED presented lower values than HKA and the formulation. Compared to HKA, BRKGAMED was up to 30 times faster (median, k = 4).
Another way to evaluate the quality of solutions produced by clustering algorithms is the application of an index associated with the relative validation criterion. In this work, the average silhouette was used, which, according to [20], allows one to evaluate how proper the allocation of each object to its cluster is, regarding the distance to all other objects in the data set. Figure 3 shows the proportions of best average silhouettes (over the 50 data sets), by number of clusters and by algorithm, associated with the highest values of average silhouette. In terms of these proportions, BRKGAMED performed reasonably well.
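The average silhouette used here can be computed as follows. This is a stdlib-only Python sketch for illustration (in practice an R implementation such as the one in the cluster package would be used); the toy data, Euclidean distance, and the s(i) = 0 convention for singleton clusters are assumptions:

```python
import math
from collections import defaultdict

def average_silhouette(X, labels):
    """Mean silhouette over all objects: s(i) = (b_i - a_i) / max(a_i, b_i),
    where a_i is the mean distance of object i to its own cluster and b_i is
    the smallest mean distance from i to any other cluster."""
    idx = defaultdict(list)
    for i, lab in enumerate(labels):
        idx[lab].append(i)
    scores = []
    for i in range(len(X)):
        own = [j for j in idx[labels[i]] if j != i]
        if not own:                       # singleton cluster: s(i) = 0
            scores.append(0.0)
            continue
        a = sum(math.dist(X[i], X[j]) for j in own) / len(own)
        b = min(sum(math.dist(X[i], X[j]) for j in members) / len(members)
                for lab, members in idx.items() if lab != labels[i])
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)
```

Values close to 1 indicate well-separated, cohesive clusters, which is what Figure 3 compares across algorithms.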

Experiment II -BRKGAMED stability
To evaluate the stability of BRKGAMED, the algorithm with the highest percentages of best solutions, a second experiment was carried out with a subset of the data sets in Table 2. More specifically, considering ten data sets and k ∈ {3, 4, 5, 6} (a total of 40 scenarios), the algorithm was applied 50 times to each of the data sets presented in Table 8 and, in each execution, both the objective function value associated with the best solution and the total number of generations required to produce that solution were stored.
In Table 8, f_best corresponds to the value of the objective function (according to equation (2.1)) obtained in Experiment I, and the other columns show the statistics of the objective function values obtained over the 50 executions of BRKGAMED, namely: minimum (f_min), mean (f_mean), and maximum (f_max), in addition to the coefficient of variation (CV) in percentage values. Upon analysis of this table, in 35 of the 40 scenarios evaluated (87.5%), the minimum value (f_min) obtained for the objective function was equal to the value of the solution produced by BRKGAMED in Experiment I (f_best). In addition, the CV was equal to zero in 50% of the scenarios and was less than 0.20% in the remaining cases. This means that, in most runs, the algorithm produced the same solution, which also corresponds to the best solution, since the mean, in most cases, was equal to the minimum.
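The stability statistics in Table 8 follow directly from the 50 stored objective values per scenario (the values below are hypothetical, not the paper's data):

```python
import statistics

def cv_percent(values):
    """Coefficient of variation: sample standard deviation over mean, in %."""
    return 100 * statistics.stdev(values) / statistics.mean(values)

# hypothetical 50 objective values: 48 identical runs and 2 slightly worse ones
runs = [120.0] * 48 + [120.3, 120.6]
print(min(runs), round(cv_percent(runs), 3))
```

A CV of 0% means every run returned exactly the same objective value, which is the behaviour reported for half of the scenarios.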
Complementing the analyses of this experiment, Table 9 presents, for the same scenarios, the minimum, mean, median, and maximum values of the total number of generations demanded by BRKGAMED to produce the best solution in each execution. Recall that the parameters associated with the maximum number of generations (g) and the number of generations without improvement (g_ni) were defined, respectively, as 50 and 18. It can be seen that, in most cases, few generations were needed for BRKGAMED to produce good quality solutions. Considering the mean and median values (gray cells: scenarios where the algorithm reached the best solution within 15 generations), in most scenarios BRKGAMED required at most 30% of the total number of generations to achieve good quality solutions. In addition, in about half of the cases, the maximum number of generations was around 25 (50% of the total).

Conclusions and future work
In this work, we presented an algorithm (BRKGAMED) to solve the k-medoids clustering problem, which is NP-Hard. This algorithm combines BRKGA metaheuristic concepts with a newly proposed crossover operator, which incorporates a local search procedure. To evaluate its performance, we performed several experiments with fifty data sets of varying sizes, comparing our algorithm with the well-known PAM and CLARA algorithms, their variants, and other clustering algorithms proposed for this same problem. Additionally, an integer programming formulation was applied to solve this problem, which allowed evaluating the percentage of global optimal solutions produced by the algorithm.
BRKGAMED, the algorithms from the literature, and the formulation were applied to these data sets to produce solutions with the number of clusters ranging from 2 to 7, evaluating percentages of global optimal solutions, percentages of best solutions, relative gaps, and average silhouettes.
Regarding global optimal solutions, BRKGAMED produced, in general, percentages above 90%. For k = 3 it obtained 100% global optimal solutions, and the lowest percentage observed was on the order of 88% (k = 6). Percentages between 88% and 100% were also observed when evaluating the best (or winning) solutions (Fig. 2). In this sense, from the global optimal and the best solutions, the results of these experiments showed that BRKGAMED consistently outperformed the other algorithms, including the PAM and HKA algorithms.
However, when evaluating the gaps in Table 6, it is observed that BRKGAMED had, on average, better performance than the other algorithms for k ≤ 5. For k = 6 and k = 7, the average gaps of BRKGAMED and PAM are very close, with a slight superiority of PAM for k = 6. Finally, BRKGAMED produced average silhouettes of reasonable quality when compared to the other algorithms, as shown in Figure 3.
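The relative gaps in Table 6 are assumed here to follow the usual definition: the percentage deviation of an algorithm's objective value from the best known (or optimal) value. A minimal sketch, with a function name of our own choosing:

```python
def relative_gap(f_alg: float, f_best: float) -> float:
    """Relative gap (%) of an algorithm's objective value f_alg
    with respect to the best known value f_best (assumed > 0)."""
    return 100.0 * (f_alg - f_best) / f_best


# An algorithm matching the best known value has a gap of 0%,
# while a 10% worse objective value yields a gap of 10%.
```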
In the second experiment, where BRKGAMED was run 50 times on a subset of 10 data sets, it was possible to observe the algorithm's stability regarding the quality of the solutions produced and the number of generations required to produce them. This statement is corroborated by the means and the low coefficients of variation of the objective function values (Table 8). Thus, based on the results presented in this article, covering experiments with data sets of varying sizes, it was possible to verify the efficacy of BRKGAMED against other algorithms found in the literature, which indicates that this algorithm is a relevant alternative to be considered when solving the k-medoids problem.
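The run statistics reported in Tables 8 and 9 (minimum, mean, median, maximum, and coefficient of variation) can be reproduced from any log of per-run values with a small helper. This sketch assumes the coefficient of variation is the sample standard deviation divided by the mean; the function name is ours, not from the paper:

```python
import statistics


def run_statistics(values):
    """Summary statistics over a list of per-run values, e.g. the
    objective function value or the generation count of each run."""
    mean = statistics.mean(values)
    return {
        "min": min(values),
        "mean": mean,
        "median": statistics.median(values),
        "max": max(values),
        # coefficient of variation: sample std. deviation / mean
        "cv": statistics.stdev(values) / mean,
    }
```

A coefficient of variation near zero, as observed for BRKGAMED, indicates that repeated runs produce essentially the same objective value.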
As future work, we plan to develop a new crossover operator incorporating a local search procedure based on the VNS metaheuristic and a Path Relinking procedure, to produce solutions of comparable quality while demanding less processing time. Another possibility is to solve the k-medoids problem without defining the number of clusters in advance, which characterizes the automatic clustering problem. To attain this goal, BRKGAMED can be adapted to use the average silhouette combined with the objective function, so as to define the ideal number of clusters.
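For reference, the average silhouette used as a validation index can be computed directly from a precomputed distance matrix. Below is a minimal pure-Python sketch (function name and data layout are our assumptions, not the paper's implementation); singleton clusters contribute a silhouette of zero, following the usual convention:

```python
def average_silhouette(dist, labels):
    """Average silhouette of a clustering.

    dist   -- full symmetric distance matrix (list of lists)
    labels -- cluster label of each object
    """
    n = len(labels)
    clusters = {}
    for i, c in enumerate(labels):
        clusters.setdefault(c, []).append(i)

    total = 0.0
    for i in range(n):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # singleton: silhouette defined as 0
        # a(i): mean distance to the other members of i's cluster
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b(i): smallest mean distance to any other cluster
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for c, members in clusters.items()
            if c != labels[i]
        )
        total += (b - a) / max(a, b)
    return total / n
```

Values close to 1 indicate well-separated, cohesive clusters; an adapted BRKGAMED could evaluate this index for each candidate number of clusters.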

Figure 1. Application of BRKGA with the transition between two generations.

Figure 2. Percentage of best solutions by algorithm and number of clusters.

Figure 3. Proportion of the best average silhouette produced by each algorithm.

Algorithm 1: BRKGAMED
1   Generate p vectors, each with k values between 1 and n (initial population)
2   while stopping criteria are not satisfied do
3       Calculate the objective value f_obj of each vector applying equation (2.1)
4       Classify solutions as elite and non-elite, defining the sets E and NE
5       Copy the |E| vectors (medoids) of the set E to the next population
6       Generate m mutant vectors with k values between 1 and n and copy them to the next population
7       Combine elite and non-elite vectors and generate (p - |E| - m) vectors for the next population applying crossover (Algorithm 2)

Algorithm 2: Crossover with local search
1   f_best ← +∞
2   Combine each elite parent with a non-elite parent, adding the resulting medoid vectors to a candidate set M
3   for each candidate vector S in M do
4       Allocate the (n - k) non-medoid objects to the nearest medoid, defining clusters C_1, ..., C_k
5       for j ← 1 to k do
6           Determine the object o in C_j whose sum of distances to the other (|C_j| - 1) objects of C_j is minimal
7           if o ≠ S[j] then S[j] ← o
8       Calculate the objective function f_obj(S)
9       if f_obj(S) < f_best then
10          f_best ← f_obj(S); S_best ← S
11  return S_best
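The allocation and medoid-update steps of the local search inside the crossover (Algorithm 2) can be sketched as follows. The function name and the data layout (a full distance matrix indexed by object) are our assumptions; one pass allocates every object to its nearest medoid and then replaces each medoid by the cluster member that minimizes the within-cluster distance sum:

```python
def local_search_pass(dist, medoids):
    """One allocation + medoid-update pass.

    dist    -- full symmetric distance matrix (list of lists)
    medoids -- indices of the current k medoids
    Returns the updated (sorted) medoid indices.
    """
    n = len(dist)
    # Allocation step: assign each object to its nearest medoid.
    clusters = {m: [] for m in medoids}
    for i in range(n):
        nearest = min(medoids, key=lambda m: dist[i][m])
        clusters[nearest].append(i)
    # Update step: the new medoid of each cluster is the member
    # minimizing the sum of distances to the other members.
    new_medoids = []
    for members in clusters.values():
        best = min(members, key=lambda c: sum(dist[c][j] for j in members))
        new_medoids.append(best)
    return sorted(new_medoids)
```

Repeating this pass until the medoids stop changing yields the usual alternating k-medoids refinement.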

Table 2 .
Summary of data sets.
* Data sets used in the calibration experiments.

Table 4 .
Algorithms and their packages.

Table 5 .
Percentage of global optima by algorithm and number of clusters.
* Number of global optima produced by the formulation with a CPU time limit of 3 h, considering the 49 data sets. The global optimum was not obtained for the A1 data set, and for the WAVEFORM21 data set the solver reported an error.

Table 6 .
Relative gaps by algorithm and number of clusters.

Table 7 .
Average and median computational time by number of clusters (seconds).

Table 8 .
Statistics obtained from objective function values.

Table 9 .
Statistics associated with total generations.