A BEAM SEARCH FOR THE EQUALITY GENERALIZED SYMMETRIC TRAVELING SALESMAN PROBLEM

Abstract. This paper studies the equality generalized symmetric traveling salesman problem (EGSTSP). A salesman has to visit a predefined set of countries. S/he must determine exactly one city (of a subset of cities) to visit in each country and the sequence of the countries such that s/he minimizes the overall travel cost. From an academic perspective, EGSTSP is very important. It is NP-hard. Its relaxed version TSP is itself NP-hard, and no exact technique solves large difficult instances. From a logistic perspective, EGSTSP has a broad range of applications that vary from sea, air, and train shipping to emergency relief to elections and polling to airlines' scheduling to urban transportation. During the COVID-19 pandemic, the roll-out of vaccines further emphasized the importance of this problem. Pharmaceutical firms are challenged not only by a viable production schedule but also by a flawless distribution plan, especially since some of these vaccines must be stored at extremely low temperatures. This paper proposes an approximate tree-based search technique for EGSTSP. It uses a beam search with low- and high-level hybridization. The low-level hybridization applies a swap-based local search to each partial solution of a node of a tree whereas the high-level hybridization applies 2-Opt, 3-Opt or Lin-Kernighan to the incumbent. Empirical results provide computational evidence that the proposed approach solves large instances with 89 countries and 442 cities in a few seconds while matching the best known cost of 8 out of 36 instances and being less than 1.78% away from the best known solution for 27 instances.


Introduction
Logistics in general and transportation in particular are the cornerstones of modern life. Their importance emanates from their multi-fold repercussions on the cost of goods, profit margins of transportation companies, clients' service quality, drivers' well-being, and air pollution. In fact, they involve several parties: end clients, manufacturers, distributors, drivers, stockholders, etc. In addition, they require the scheduling of several interrelated tasks that are dynamic in nature and constrained in time and space. The economic and temporal constraints augment their complexity. Solving them requires the migration of tools from diverse disciplines including information technology, optimization, and vehicle routing.
Among the most widely studied transportation problems is the traveling salesman problem (TSP). A traveler has to visit a finite number of countries starting from one country and returning back to it, and visiting every country exactly once. The objective is to find a minimal cost route, where the cost can be total duration, travel distance, etc. TSP's importance emanates from its occurrence as a subproblem of complex real life problems in the transport of passengers/goods and in scheduling. For these problems, TSP identifies a minimal-cost itinerary for each salesman. For example, TSP is a special case of the equality generalized symmetric traveling salesman problem (EGSTSP), where the salesman chooses exactly one of many cities of a country to visit; i.e., a TSP with a covering constraint.
Formally, consider a set of nodes that are divided into clusters. EGSTSP searches for the shortest route that visits exactly one node from every cluster, starting and ending at the same cluster. EGSTSP is more difficult than TSP because of the combinatorial aspect added by the sizes of the clusters. EGSTSP occurs in several real-life applications such as maritime ship routing, distribution of medical supplies, urban waste management, telecommunication networks, logistics, rapid post dispatching, VLSI, circuit designs, and in laser cutting to determine the trajectory of a laser cutter [18,21]. During the COVID-19 pandemic, EGSTSP has drawn a lot of attention. With reduced air traffic and disrupted logistic chains, the procurement and dispatching of goods to confined customers and isolated cities has become a true challenge. In addition, the availability of a vaccine raises the issue of its fair distribution and of health care equity. Some of these vaccines impose a cold chain that cannot be broken. In such cases, optimizing the distribution plan is of prime importance. This optimization is equivalent to solving a large scale EGSTSP. Evidently, its exact solution may be challenging. However, the continuous advancement of computing technologies makes near-optimal solutions to such difficult problems attainable. It allows approximate methods to undertake a more extensive search, thus obtaining nearer-global optima in shorter times. EGSTSP has been solved by exact approaches (such as dynamic programming, branch and bound, and branch and cut), and by approximate ones such as local improvement heuristics ($k$-opt, swap, insertion, etc.), and meta-heuristics (tabu search, ant colony, genetic algorithms, etc.). This paper proposes a new approximate hybrid approach for the EGSTSP. Hybrid heuristics have identified the best known solutions to several complex combinatorial optimization problems.
They are powerful search methods because they tackle two competing goals: exploration and exploitation. Exploration is a diversification of the search. It investigates the solution space in order to determine the part that has a higher chance of containing the global optimum. Exploitation refines (or intensifies) the search on the part of the space that has a high potential of containing the global optimum.
The proposed hybrid heuristic is a beam search (BS) (i.e., a truncated branch and bound) that is augmented with improvement techniques. It ensures exploration via a standard width-first BS and exploitation via local search heuristics. BS strives for global optimization while local search heuristics strive for local optimization in the global optimum's neighborhood. That is, BS can be likened to evolution and local search to learning. Generally, synchronization of evolution and learning yields efficient hybrid heuristics. Specifically, the proposed hybrid BS embeds (i) a low-level hybridization, which addresses the functional composition of BS by subjecting the partial solution at each node to a local search; and (ii) a high-level hybridization, which keeps BS self-contained by subjecting the incumbent of BS to a $k$-opt type of search.
To the best of the authors' knowledge, this is the first application of BS to EGSTSP. In addition, the hybridization exploits the success of local search to assess the nodes of the tree and to estimate their potential. It subsequently chooses the nodes with the best potential to branch on and prunes the non-promising ones; thus, it explores the parts of the search space that contain near-global optima while it discards the others. It then applies a 2-opt, a 3-opt or the well-known improved Lin-Kernighan (LK) heuristic [9] to its incumbent. The computational investigation provides evidence of the good performance of the hybridized BS within a reduced runtime. Its deviation from the best known solution is less than 0.578% for half of the instances and less than 1.78% for three quarters of them. Section 2 defines the problem. Section 3 reviews the literature on EGSTSP. Section 4 details the proposed approaches. It presents the algorithm of a standard BS, its adaptation to EGSTSP, the low-level hybridized BS, and the high-level hybridized BS when applying each of the three local improvement methods: 2-Opt, 3-Opt and LK. Section 5 presents the computational results, which assess the efficiency of the methods in terms of solution quality and runtime and highlight the utility of the hybridizations. Finally, Section 6 summarizes the findings and gives potential research extensions.

Problem definition
EGSTSP is an NP-hard combinatorial optimization problem. It consists of finding the optimal path of a salesman who has to travel through a set of countries while visiting exactly one city from each country and visiting every country once. The optimal path minimizes the total travel cost. Hence, the salesman must determine for each country the city s/he will visit and the order of visit of the countries. EGSTSP is more complex than TSP. For TSP, each country consists of a single city while EGSTSP has the additional complexity of choosing a city from each country. Because it extends TSP, which is NP-hard, EGSTSP is also NP-hard.
Herein, we define EGSTSP using the notation of [6,7] and present their integer linear program (ILP). Formally, consider a complete undirected graph $G = (N, E)$ where $N = \{1, \dots, n\}$ is a set of nodes that are divided into $m$ mutually exclusive clusters $C_h$, $h = 1, \dots, m$, and $m \ge 3$. $E = \{[i, j] : i \in N, j \in N, i \ne j\}$ denotes the set of edges connecting pairs $(i, j)$ of distinct nodes $i \in N$ and $j \in N$, $i \ne j$. The cost of traveling through edge $e \in E$ is $c_e$. This cost may be assimilated to a linear function of the Euclidean distance between $i$ and $j$. The objective of EGSTSP is to determine a minimal cost cycle $T \subseteq E$ such that $T$ includes exactly one city from each cluster, and each cluster is visited once.
ILP uses two types of binary variables: $x_e = 1$ if the salesman travels through edge $e \in E$ and 0 otherwise, and $y_v = 1$ if the salesman visits node $v \in N$ and 0 otherwise. Using the aforementioned notation and these two sets of binary variables, EGSTSP can be formulated as follows.
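A formulation consistent with the description of equations (2.1)-(2.6) given next can be written in the standard form of the model of [6,7]. Here $x_e$ and $y_v$ are the binary edge and node variables, $c_e$ is the cost of edge $e$, and $C_h$, $h = 1, \dots, m$, are the clusters of the node set $N$. The symbols $\delta(v)$ (edges incident to node $v$) and $\delta(S)$ (edges with exactly one endpoint in $S$) are assumed here, as is the exact range of the subsets $S$.

```latex
\begin{align}
\min\; & \sum_{e \in E} c_e x_e \tag{2.1}\\
\text{s.t.}\; & \sum_{e \in \delta(v)} x_e = 2 y_v, && v \in N, \tag{2.2}\\
& \sum_{v \in C_h} y_v = 1, && h = 1, \dots, m, \tag{2.3}\\
& \sum_{e \in \delta(S)} x_e \ge 2\,(y_i + y_j - 1), && S \subset N,\ 2 \le |S| \le n - 2,\ i \in S,\ j \in N \setminus S, \tag{2.4}\\
& x_e \in \{0, 1\}, && e \in E, \tag{2.5}\\
& y_v \in \{0, 1\}, && v \in N. \tag{2.6}
\end{align}
```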
The objective function, given by equation (2.1), minimizes the total travel cost. Equation (2.2) preserves the flow through every node. A node is visited if it has both a predecessor and a successor node; in that case, the right-hand side is 2; otherwise, the right-hand side must be zero. Equation (2.3) ensures that the tour includes exactly one city from each cluster. Equation (2.4) guarantees the connectivity of the solution: each cut separating two visited nodes $i$ and $j$ must be crossed at least twice. Finally, equations (2.5) and (2.6) determine the nature of the decision variables.
Because EGSTSP is NP-hard, solving large instances of EGSTSP using ILP is difficult. Herein, we are interested in efficiently solving large instances of EGSTSP using heuristic methods, and in comparing the heuristics' solutions to the ILP results that are readily available in the literature (and denoted $z_{\text{lit}}$ in the computational section and in the Appendix).

Literature review
Small instances of the equality generalized TSP (EGTSP) were solved exactly using dynamic programming [26], branch and bound [14], and branch and cut [6]. Large instances have been tackled approximately; for example, Noon and Bean [20] applied the TSP's closest neighborhood heuristic. Lien et al. [16] assimilated EGTSP to a TSP whose number of nodes is three times as large as the number of clusters. Dimitrijevic and Saric [5] developed an alternative transformation that had fewer nodes; i.e., using twice as many nodes as the number of clusters of the original EGTSP. Ben-Arieh et al. [2] opted for a transformation that had as many nodes as the number of clusters of EGTSP using the 'exact' Noon-Bean, and two modifications of the non-exact Fischetti-Salazar-Toth transformation. Helsgaun [9] transformed EGSTSP into a classic TSP and applied LK to the transformed TSP. Karapetyan and Gutin [11] proposed an LK heuristic for EGSTSP. Smith and Imeson [25] applied an iterative remove and insert heuristic for EGSTSP. They opted for three insertion mechanisms: the furthest node, the cheapest, and random insertion. Karapetyan and Gutin [12] also designed a large neighborhood search for EGSTSP. Renaud et al. [23] proposed an Initialization, Insertion and Improvement heuristic that [22] further generalized. Khachai and Neznakhina [13] developed a dynamic programming based heuristic for EGSTSP.
Another surge of the EGSTSP literature came from hybrid approaches. Ardalan et al. [1] hybridized the Imperialist Competitive Algorithm with a local search. Lawrence and Daskin [15] hybridized a random key genetic algorithm with a local search. Their algorithm is quite fast. It identifies its best solution within the first two or three iterations. Its good performance is due to the utility of the local search in identifying the best solution. However, their algorithm is outperformed by the memetic algorithm of [8], which combines the advantages of genetic algorithms and local search. Chira et al. [4] designed a "sensible" ant colony system that makes the ants sensitive to the pheromone level in their trail, thus exploring the most promising regions of the search space. Yang et al. [28] augmented ant colony optimization for EGSTSP with a mutation mechanism and a local search. They showed the importance of the local search, in particular for instances with fewer than 200 nodes. Bontoux et al. [3] proposed a memetic algorithm whose crossover operator is based on a large neighborhood search.
Different variants of EGSTSP have appeared recently. Sundar and Rathinam [27] applied a branch and cut and [30] extended Christofides' TSP algorithm to the multi-depot EGSTSP where there are several travelers, each departing from a different depot (node). Mestria [19] considered the clustered TSP, where all nodes of a cluster must be visited in a contiguous manner. The author hybridized a variable neighborhood random descent with local search (for intensification) and with a greedy randomized adaptive search (for diversification). The latter consists of a constructive heuristic and a perturbation method. The author applied several variable neighborhood structures, in a random order. Jiang et al. [10] proposed a hybrid genetic ant colony algorithm for the multiple TSP, where each salesman departs from a specific depot and returns to it. Yuan et al. [29] studied the generalized TSP with time windows, where arrival to a city must occur within a time window. They proposed two integer linear programs and valid inequalities that are separated dynamically within a branch-and-cut algorithm. They initiated their branch and bound from a feasible solution built via a simple heuristic. They solved instances with up to 30 clusters within a one-hour runtime. Salman et al. [24] imposed precedence constraints on EGSTSP, developed a new branching rule, and adapted some existing bounds to the problem.
This literature review suggests that EGSTSP was never tackled via BS. It further suggests that hybridization is a key factor in the success of most approaches to TSP-related problems. To explore these findings, this paper proposes a hybrid BS that employs local search at each node and applies a $k$-opt type of search to the incumbent.

Proposed approaches
We efficiently solve EGSTSP using hybridized BS-based algorithms. BS is a truncated tree search. It avoids exhaustive enumeration by branching on a subset of elite nodes, believed to lead to the optimum. They usually have minimal fitness values, which are either the cost of their partial solutions or their upper bounds. At each iteration, $w$ nodes are selected for branching, where $w$ is the beam width. The other nodes are permanently discarded, and no backtracking is performed. We enhance the performance of BS by hybridizing it at two levels. The low-level hybridization adds a local search phase at each node of the BS tree. The high-level hybridization applies 2-opt, 3-opt or LK heuristics to the best solution that BS obtains. Section 4.1 describes a standard BS. Section 4.2 explains our adaptation of BS to EGSTSP. Sections 4.3 and 4.4 present the low- and high-level hybridization.

Standard beam search
The pseudo code of a standard BS is given in Figure 1. It consists of an initialization step, an iterative step and a stopping criterion. The initialization step sets the set $\mathcal{B}$ of current nodes of the tree to the root node $N_0$ and the set $\mathcal{M}$ of offspring nodes to the empty set. When an initial feasible solution $x$ is available, this step further sets the incumbent $x^*$ and its value $z^* = z(x^*)$ to, respectively, this initial solution $x$ and its objective function value. When an initial feasible solution is not available, the upper bound $z^*$ is set to $\infty$.
The iterative step chooses a node from the set $\mathcal{B}$ of current nodes, and sets it as the current node. It branches out of the current node, and adds all new nodes to $\mathcal{M}$ except for leaves. Leaves constitute feasible solutions; thus, they are candidate solutions. A leaf becomes the incumbent whenever its cost is less than the incumbent value $z^*$. The iterative step appends the $w$ smallest-cost nodes of $\mathcal{M}$ to $\mathcal{B}$, where $w$ is the beam width, and re-initializes $\mathcal{M}$ to the empty set. This process is reiterated until no further branching is possible; that is, until $\mathcal{B} = \emptyset$. When applying a width-first BS, the nodes of $\mathcal{B}$ belong to the same level of the tree.
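The initialization, iterative step and stopping criterion described above can be sketched as follows. This is a minimal width-first illustration under stated assumptions, not the authors' implementation (which is coded in C): the function names `beam_search`, `branch`, `is_leaf` and `cost`, as well as the toy path-ordering problem and its matrix `DIST`, are hypothetical.

```python
import math

def beam_search(root, branch, is_leaf, cost, beam_width, incumbent=None):
    """Width-first beam search: keep the `beam_width` cheapest nodes per level."""
    best = incumbent
    best_cost = cost(incumbent) if incumbent is not None else math.inf
    current = [root]                      # current nodes of the tree
    while current:                        # stop when no node is left to branch on
        offspring = []                    # offspring nodes of this level
        for node in current:
            for child in branch(node):
                if is_leaf(child):        # leaves are candidate solutions
                    child_cost = cost(child)
                    if child_cost < best_cost:
                        best, best_cost = child, child_cost
                else:
                    offspring.append(child)
        offspring.sort(key=cost)          # rank nodes by their fitness value
        current = offspring[:beam_width]  # the others are permanently discarded
    return best, best_cost

# Toy problem (assumption): order 4 items into a path of minimal total cost.
DIST = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 3],
        [10, 4, 3, 0]]

def branch(node):
    return [node + (j,) for j in range(len(DIST)) if j not in node]

def is_leaf(node):
    return len(node) == len(DIST)

def cost(node):
    return sum(DIST[a][b] for a, b in zip(node, node[1:]))
```

With a beam width large enough to keep every node, the sketch reduces to full enumeration; with a beam width of 1 it reduces to a greedy construction.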

Proposed beam search
This section presents our proposed BS-based method BS$_0$ for EGSTSP. BS$_0$ identifies a least cost ordering of the clusters. It assimilates the nodes of the tree to partial solutions (i.e., ordered subsets of $N$), and branching out of a node to augmenting it with an additional cluster. Its tree starts at the root node (i.e., level $\ell = 0$) with an empty tour, and has at most $m$ levels, one per cluster. A partial solution $\pi_\ell$ corresponding to a node at level $\ell$, $\ell = 1, \dots, m$, is a sequence of cities $v_1, v_2, \dots, v_\ell$ all belonging to $N$ and to different clusters. As with all tree-search techniques, BS$_0$ has three major steps: branching, assessment, and selection.
The branching of a node of the tree corresponds to appending a cluster to the partial solution of that node. That is, out of a node of level $\ell$ emanate $m - \ell$ branches, each leading to a different cluster. A node inherits the path of its parent, and appends a cluster to the end of its parent's path. Specifically, branching out of the node corresponding to $\pi_\ell$ consists in appending a city from a non-visited cluster to $\pi_\ell$.
The assessment of the cost of a newly created node with partial solution $\pi_\ell$ is based on a simple lower bound and on an upper bound. The lower bound is the cost $z_\ell$ of the partial solution $\pi_\ell$. It is the sum of the travel costs between the successive nodes of $\pi_\ell$; that is, the sum of its parent node's cost $z_{\ell-1} = c_{v_1,v_2} + c_{v_2,v_3} + \dots + c_{v_{\ell-2},v_{\ell-1}}$ and of the travel cost $c_{v_{\ell-1},v_\ell}$ from its parent node to the appended city. The upper bound is the total cost of a complete solution constructed by iteratively appending the closest city of a 'not yet assigned' cluster to the partial solution $\pi_\ell$.
At a given level $\ell$ of the tree, the selection chooses the $w$ best nodes, where $w$ is the beam width, among all generated child nodes for further branching at the next level $\ell + 1$ of the tree. These iterative branching, evaluation and selection processes are repeated until $\ell = m$; that is, until all clusters are visited. Herein, BS$_0$ is started with a feasible solution obtained via a greedy heuristic that chooses arbitrarily the first city $v_1$ and iteratively appends the closest city from a non-visited cluster.
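The two bounds and the greedy starting solution can be illustrated as follows. This is a sketch under assumptions: the names `partial_cost` and `greedy_completion`, the list `cluster_of` (mapping each city index to its cluster) and the distance matrix `dist` are hypothetical, and the cost of the completed solution is assumed to include the edge that closes the cycle.

```python
def partial_cost(path, dist):
    """Lower bound: travel cost between the successive cities of a partial solution."""
    return sum(dist[a][b] for a, b in zip(path, path[1:]))

def greedy_completion(path, cluster_of, dist):
    """Upper bound: repeatedly append the closest city of a not-yet-visited cluster.

    `path` must hold at least one city. Returns the complete tour and its cost,
    including the closing edge back to the first city (an assumption of this sketch).
    """
    tour = list(path)
    visited = {cluster_of[v] for v in tour}
    all_clusters = set(cluster_of)
    while visited != all_clusters:
        last = tour[-1]
        candidates = [v for v in range(len(cluster_of))
                      if cluster_of[v] not in visited]
        nearest = min(candidates, key=lambda v: dist[last][v])
        tour.append(nearest)
        visited.add(cluster_of[nearest])
    total = partial_cost(tour, dist) + dist[tour[-1]][tour[0]]
    return tour, total

# Toy instance (assumption): six cities on a line, two cities per cluster.
COORDS = [0, 1, 5, 6, 10, 11]
CLUSTER_OF = [0, 0, 1, 1, 2, 2]
DIST = [[abs(a - b) for b in COORDS] for a in COORDS]
```

Calling `greedy_completion([v], CLUSTER_OF, DIST)` with a single arbitrary city `v` yields the kind of greedy starting solution described above.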
In summary, BS$_0$ is a constructive approach that starts at the root node with an empty tour and appends a cluster at each level of the tree. It stops when the tour has visited all $m$ clusters. It has an ( ) worst case time complexity. Thus, our transformation of EGSTSP into TSP is less complex than competing transformations. It maintains $m < n$ nodes whereas TSP considers $n$ nodes.

Enhanced beam search
The low-level hybridized BS, denoted hereafter as BS$_1$, subjects each partial solution $\pi_\ell$ obtained at a node of a level $\ell$, $\ell = 3, \dots, m$, of the tree to a local search. The local search is simple but efficient. It preserves the order of the clusters in $\pi_\ell$ but changes the selected node of one or more clusters. It chooses the 'best' city among all nodes of every cluster of the partial solution $\pi_\ell$. At a level $\ell \in \{3, \dots, m\}$, BS generates $m - \ell$ nodes. Let $\pi_\ell = (v_{[1]}, \dots, v_{[\ell]})$ be the partial solution of one of these nodes, where $v_{[h]}$ is the city visited in position $h$ and $C_{[h]}$ is its cluster. For each position $h$, the local search iterates through all the cities of cluster $C_{[h]}$. It retains the city $v^* \in C_{[h]}$ that minimizes the distance from $v_{[h-1]}$ to $v$ to $v_{[h+1]}$; i.e.,
$$v^* = \arg\min_{v \in C_{[h]}} \left\{ c_{v_{[h-1]},v} + c_{v,v_{[h+1]}} \right\}.$$
When applied to a node with partial solution $\pi_\ell$, the local search has $O(\ell\,\bar{n})$ complexity, where $\bar{n} = \max_{h=1,\dots,m} \{|C_h|\}$ is the maximum number of cities among all clusters. Because it is applied to all $\sum_{\ell=3}^{m} \ell(m - \ell)$ nodes of the tree, the local search increases the complexity of BS$_1$ to at worst ( 2¯) . Yet, it allows BS$_1$ to attenuate the myopic nature of BS$_0$; i.e., BS$_0$ may miss the global optimum when it selects the best nodes of a level and permanently prunes the others.
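The city-reselection rule of this local search can be sketched as follows. It is an illustration under assumptions rather than the authors' code: the function name `improve_cities`, the dictionary `clusters` and the list `cluster_of` are hypothetical, a single left-to-right sweep is performed, and the two end positions of the open partial path are assumed to use only their one existing neighbor.

```python
import math

def improve_cities(path, clusters, cluster_of, dist):
    """Keep the cluster order of `path`; re-pick the best city at every position."""
    path = list(path)
    for h, city in enumerate(path):
        prev_city = path[h - 1] if h > 0 else None
        next_city = path[h + 1] if h + 1 < len(path) else None

        def detour(v):
            # Cost of travelling predecessor -> v -> successor (when they exist).
            total = 0.0
            if prev_city is not None:
                total += dist[prev_city][v]
            if next_city is not None:
                total += dist[v][next_city]
            return total

        path[h] = min(clusters[cluster_of[city]], key=detour)
    return path

# Toy instance (assumption): three clusters of two cities in the plane.
POINTS = [(0, 0), (1, 0), (5, 3), (5, 0), (10, 0), (9, 0)]
CLUSTERS = {0: [0, 1], 1: [2, 3], 2: [4, 5]}
CLUSTER_OF = [0, 0, 1, 1, 2, 2]
DIST = [[math.dist(p, q) for q in POINTS] for p in POINTS]
```

On this toy instance, `improve_cities([0, 2, 4], CLUSTERS, CLUSTER_OF, DIST)` keeps the cluster order 0, 1, 2 and replaces each city by a cheaper one from the same cluster.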

High-level hybridized beam search
The high-level hybridized BS, denoted BS$_\cdot$, $\cdot = 2, 3, 4$, applies a 2-Opt, a 3-Opt, or LK heuristic to the best solution obtained by BS$_1$. Because the hybridization is high-level, the worst-case time complexity of BS$_\cdot$, $\cdot = 2, 3, 4$, is the sum of the complexity of BS$_1$ and of the adopted hybridization approach.
The 2-Opt has an $O(m^2)$ complexity where $m$ is the number of clusters of the tour. It chooses two clusters of the tour randomly and reverses the flow between them. It is repeated as long as the solution is improved. The 3-Opt analogously breaks three edges of the tour and reconnects the resulting segments [1]. It repeats this process as long as the solution is improved.
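A 2-opt move on a tour of selected cities can be sketched as follows. This sketch deliberately swaps in a systematic first-improvement sweep over all pairs of positions in place of the random pair selection described above, and it assumes a symmetric distance matrix; the names `tour_cost` and `two_opt` and the toy instance are hypothetical.

```python
import math

def tour_cost(tour, dist):
    """Cost of the closed tour, including the edge back to the first city."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))

def two_opt(tour, dist):
    """Reverse tour segments while doing so shortens the tour."""
    tour = list(tour)
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these two edges share the first city
                a, b = tour[i], tour[i + 1]
                c, d = tour[j], tour[(j + 1) % n]
                # Replace edges (a, b) and (c, d) by (a, c) and (b, d).
                delta = dist[a][c] + dist[b][d] - dist[a][b] - dist[c][d]
                if delta < -1e-12:
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour

# Toy instance (assumption): the corners of a unit square, visited with a crossing.
POINTS = [(0, 0), (1, 0), (1, 1), (0, 1)]
DIST = [[math.dist(p, q) for q in POINTS] for p in POINTS]
```

Because a reversal only changes the visiting order, the city chosen in each cluster is left untouched, so the result remains a feasible EGSTSP tour.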
LK yields near-global optima when started from a large number of initial solutions. An LK perturbation of a solution causes, on average, increases of the order of 10 to 15% of its cost. It is one of the best heuristics for the symmetric TSP because of its adaptive nature. Indeed, it swaps a number of partial sequences of the tour. This number is not predetermined; yet, it offers a good tradeoff between solution quality and runtime. While 2-opt and 3-opt break 2 and 3 edges of the tour, LK chooses the number of edges to be broken such that this number yields a minimal cost tour. In this sense, LK may be perceived as a variable exchange of edges. It chooses $k$ links to exchange and tests the utility of exchanging $k + 1$ links. (Initially $k = 2$.) Any exchange must generate a feasible neighbor. Its utility is assessed via the difference of the costs of the current solution and its neighbor. It is only adopted when it reduces the current solution's cost. LK marks the exchanged edges yielding the best net cost reduction as permanent and prohibits their elimination for a number of iterations by inserting them into a tabu list. When the exploration of exchanging $k + 1$ links reduces the incumbent's cost, LK updates the incumbent, and reduces $k$; otherwise, it increases $k$. LK stops when the incumbent can no longer be improved. Even though the complexity of LK is not well determined in the literature, our implementation has a worst-case time complexity of $O(m^5)$: it bounds $k$ to 5.

Computational results
The computational investigation assesses the impact of hybridization in general, and of its type in particular, on the solution quality and on the runtime of BS. For this purpose, it uses five versions of BS:
BS$_0$: a standard width-first beam search of beam width $w$;
BS$_1$: BS$_0$ augmented with a local search at each node of the tree;
BS$_2$: BS$_1$ with its best solution subject to a 2-opt;
BS$_3$: BS$_1$ with its best solution subject to a 3-opt; and
BS$_4$: BS$_1$ with its best solution subject to the LK heuristic with $k$ up to 5.
It applies these five versions (coded in C and run on an Intel Core i3-4030U, 1.90 GHz, 4GB RAM) to 36 benchmark instances of EGSTSP, all available at http://www.cs.rhul.ac.uk/home/zvero/GTSPLIB/. Let $z_{\text{lit}}$ be the best known solution, and $z_{\text{BS}_\cdot}$, $\cdot = 0, 1, 2, 3, 4$, the corresponding BS$_\cdot$ solution value, for a beam width $w = 1, 2, 3, 4, 5, 10$, obtained within runtime $t_\cdot$ (expressed in seconds). For this solution, the percent optimality gap is $\Delta_{\text{BS}_\cdot} = 100\% \times (z_{\text{BS}_\cdot} - z_{\text{lit}}) / z_{\text{lit}}$. Herein, we analyze the results, reported in Appendix A, focusing on the utility of the low- and high-level hybridization of BS. We then conclude with some useful remarks.

Utility of the low level hybridization
First, we compare the runtime and solution quality of BS$_0$ to that of BS$_1$; that is, of BS without and with local search at each node (cf. Tabs. A.1 and A.2 for the detailed results). We undertake this comparison to highlight the importance of the low-level hybridization undertaken at each node of each level $\ell$ of the search tree. Figure 2, which displays the mean runtime of BS$_0$, ..., BS$_4$, suggests that the mean runtime of BS increases linearly as a function of the beam width $w$. Its average runtime (in seconds) can be estimated as a linear function of $w$: $\bar{t}_0 = 0.5454\,w - 0.0459$ and $\bar{t}_1 = 0.5473\,w + 0.0080$, with 99.03% and 99.98% respective coefficients of determination. This behavior is expected as a larger beam width requires more evaluations of partial solutions and of bounds, and more sorting, storing, and retrieving. Figure 3, which displays box plots of the observed run times, further clarifies this tendency. Yet, it suggests that the local search does not increase the run time. A paired t-test infers that there is no difference between the mean run times of BS$_0$ and BS$_1$ at any conventional level of significance. A second paired test infers that the mean percent deviation $\Delta_{\text{BS}_1}$ is less than the mean $\Delta_{\text{BS}_0}$ at any conventional level of significance, and that the mean difference $\Delta_{\text{BS}_0} - \Delta_{\text{BS}_1}$ has a 4.84% point estimate and a 4.19% lower bound of a 95% confidence interval estimate. This difference is due to the local search, which enhances the search of BS by investigating the neighborhood of the partial solution at each node. In fact, $\Delta_{\text{BS}_0} > \Delta_{\text{BS}_1}$ for all tested instances and for all beam widths. In addition, the average percent deviation between the solutions of BS$_0$ and BS$_1$ is of the order of 26%, further highlighting the importance of the local search undertaken by BS$_1$ at every node. Because BS$_1$ is superior to BS$_0$ in terms of solution quality while being equally good in terms of runtime, it can be inferred that BS$_1$ is better than BS$_0$.
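As a worked illustration of the two linear fits reported above, the following sketch evaluates the estimated mean runtime for a given beam width. The function name `estimated_mean_runtime` and the dictionary `FITS` are hypothetical; the coefficients are the ones quoted in the text, and the estimates are only meaningful within the tested range of beam widths (1 to 10).

```python
# (slope, intercept) of the fitted mean runtime in seconds, per BS version.
FITS = {
    0: (0.5454, -0.0459),  # BS_0: no local search
    1: (0.5473, 0.0080),   # BS_1: local search at each node
}

def estimated_mean_runtime(version, beam_width):
    """Estimated mean runtime (seconds) of BS_version for the given beam width."""
    slope, intercept = FITS[version]
    return slope * beam_width + intercept

for width in (1, 5, 10):
    print(width,
          round(estimated_mean_runtime(0, width), 4),
          round(estimated_mean_runtime(1, width), 4))
```

For a beam width of 10, the two fits give about 5.41 and 5.48 seconds, which is in line with the observation that the local search adds little to the mean runtime.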
Figure 4 displays the box plots and means of the percent deviation of the solutions of BS$_\cdot$, $\cdot = 0, \dots, 4$, from $z_{\text{lit}}$. Zooming in on the box plots and means of BS$_0$ and BS$_1$, we detect a seemingly counter-intuitive behavior for small beam widths $w$. Increasing $w$ from 1 to 4 does not decrease $\Delta_{\text{BS}_0}$ and $\Delta_{\text{BS}_1}$. This is most likely because it makes BS choose, at a level $\ell$, partial solutions that, despite their good quality at level $\ell$, do not lead to near-optima. That is, the diversification brought about by the larger beam width focuses on areas of the search space that do not contain the global optimum. The local search undertaken at each node does not mitigate this glitch. On the other hand, increasing $w$ beyond 5 overcomes this issue. Setting $w = 10$ allows BS to obtain solutions that are closer to the global optimum. That is, it makes BS investigate areas of the search space that contain near-global optima. This highlights the importance of the choice of the partial solutions at a level $\ell$ in order to direct the search toward the most promising regions. In this sense, the local search provides a lookahead strategy that helps BS judiciously choose its partial solutions.
As Figure 4 reveals, the improvements of the solution quality due to the high-level hybridization are much larger than their counterparts due to the low-level hybridization, regardless of the beam width. These improvements occur at no additional runtime cost except for the last three instances when run with BS$_4$ and a beam width $w = 10$. These instances are marked as outliers in Figure 3, which displays the box plots and means of the observed run times of BS$_0$ to BS$_4$. For all beam widths, the mean run time of any of the approaches is larger than its median, signaling the existence of outlier cases, corresponding to the last three instances.
Despite the presence of these outliers, which drive the run time of BS$_4$ up for a beam width $w = 10$, paired t-tests infer that there is no statistical evidence to claim that the mean run times of any pair of hybridized versions of BS differ at a 5% significance level.

Utility of the high level hybridization
The lack of exploitation and of exploration of the search space makes BS$_0$ obtain better results for larger beam widths. This behavior persists for BS$_1$, which benefits from a local search at each of its nodes, and for BS$_2$, which benefits from an intensified 2-opt search around its best solution. However, for BS$_3$ and BS$_4$, the 3-opt and the LK intensification make BS obtain its best solutions using a beam width $w = 3$, with a mean runtime of less than 2 seconds. This is confirmed by Figures 2 and 5, which display respectively the mean runtime and the mean percent deviation from $z_{\text{lit}}$ as a function of the beam width for BS$_0$ to BS$_4$.

Remarks
LK is known to obtain good results when initialized from several random initial solutions. The proposed approach BS$_4$ provides evidence that it is possible to generate initial solutions for LK in a more systematic manner. Furthermore, the results suggest that BS$_3$ with a beam width $w = 3$ yields, on average, better results than the other considered beam searches. However, it remains true that the incumbent of BS$_1$ can be subjected to three types of searches (2-opt, 3-opt, and LK) at a negligible additional runtime. In fact, there is no statistical difference between the runtime of BS$_1$ and BS$_\cdot$, $\cdot = 2, 3, 4$, implying that the bulk of their runtime is caused by the BS component. Finally, even though $w = 3$ yields in general the best performance, running BS$_1$ with different beam widths constitutes a good diversification strategy. Using these two additional intensification and diversification mechanisms reduces the percent deviation of the BS solutions from those reported in the literature, matching the best solution in 22.22% of the instances, and averaging a 0.01344% deviation. The mean should be interpreted with care as it is affected by two outlier values, recorded for instances 40kroA200 and 80rd400, as shown in Figure 6. These outliers are clearly depicted in Figure 7, which displays the resulting box plot of percent deviations for this BS. The corresponding five-number summary of the percent deviation is (Minimum = 0, Q1 = 0.00063, Q2 = 0.00578, Q3 = 0.01779, Maximum = 0.07027), where Q1, Q2 and Q3 are the first, second and third quartiles. Ignoring the two outlier instances brings the largest deviation over the other 34 instances to 0.03925% and its average to 0.01020%.

Conclusion
This paper addressed EGSTSP via a beam search that obtains good solutions for large beam widths. However, to avoid the exponential increase of runtime associated with branch and bound, we opted for both a low- and a high-level hybridization of the beam search. First, we performed a local search at each node of the tree. This local search acts as a lookahead strategy. It allows the beam search to retain the partial solutions that could lead to near-global optima in lieu of selecting the lowest cost partial solutions. This local search improved the performance of the beam search without affecting its runtime. Second, we subjected the best solution of the beam search to each of three local search operators: 2-Opt, 3-Opt and Lin-Kernighan. This high-level hybridization further improved the solution quality of the standard beam search by up to 70% without affecting its runtime. Applying the three search operators to the incumbent offers BS more exploration and exploitation power. The proposed hybridization can be applied to different variants of traveling-related problems including vehicle routing, dial-a-ride, and delivery with time windows. Other types of search techniques can also be considered, such as simulated annealing, variable neighborhood search, and adaptive and data-driven techniques.

Appendix A. Detailed computational results
The results of BS$_\cdot$, $\cdot = 0, \dots, 4$, are reported in Tables A.1-A.5. The first column indicates the '.gts' label of the instance whereas the second column reports its best known solution $z_{\text{lit}}$, available in the literature. The next six triplets of columns report the BS$_\cdot$ solution value $z_{\text{BS}_\cdot}$, $\cdot = 0, \dots, 4$, its percent optimality gap $\Delta_{\text{BS}_\cdot}$, and its runtime $t_\cdot$ in seconds when the beam width $w = 1, 2, 3, 4, 5, 10$.