MATHEMATICAL MODELING OF A BI-OBJECTIVE HUB LOCATION-ROUTING PROBLEM FOR RAPID TRANSIT NETWORKS

. This paper aims to develop a mathematical model for rapid transit networks based on a hub and spoke model, comprising stopovers (stations) in the hub and non-hub (spoke) alignments. Due to the use of rapid transit systems in both the hub-level sub-network ( i.e ., the network among the hub nodes) and the spoke-level sub-network ( i


Introduction
Hub location problems (HLPs) focus on the location of hub facilities and the design of hub networks.Hub facilities serve as consolidation or transshipment nodes on origin-destination routes.Transportation systems frequently employ hub and spoke architectures efficiently to route flows between many origins and destinations.Hub nodes provide the possibility of connecting a large number of origin-destination pairs by using a small number of links and the possibility of using economies of scale in transportation costs, as a result of the consolidation of demand flows.In order to benefit from the economies of scale, hub facilities can be connected with highly efficient pathways.
In recent decades, there has been an increasing interest in rapid means of transit such as rapid train, subway and bus rapid transit (BRT) in urban public transportation systems.The use of these rapid public vehicles can be considered as a solution to relieving the congestion that has beset roads and airports in many parts of the world.The establishment of rapid transit systems requires large investment to install stations and links among them.The success of such investment strongly depends on how well those systems are demanded by the public, which is, in turn, dependent on the network design such as the location of stations.With regard to the features of hub models and the large investment needed to build connections in rapid transit systems, using hub structures for designing these systems seems to be advantageous.Some studies, such as [22,23] have proposed pre-assigned topological configurations, e.g., a star, a triangle or a cartwheel, for designing rapid transit networks (RTNs) which have similar structures as hub networks.
The present study aims to develop a mathematical model of designing a rapid urban public transportation network based on a hub-and-spoke model.Instead of using pre-assigned configurations, the network is planned to be based on a general hub structure with stopovers (stations) in the hub and non-hub (spoke) alignments.In fact, what is sought is a hub location model which uses rapid transit systems in both hub-level and spoke-level sub-networks.In the hub-level sub-network, more efficient (larger and faster) vehicles are used to benefit from the economies of scale.For example, in a train network, the hub-level sub-network can be serviced by express trains with high capacity and speed, and the spoke-level sub-network can be services by local trains with lower capacity and speed.In addition, different modes of rapid transportation can be considered in the system, e.g., a subway in the hub-level sub-network and BRT in the spoke-level sub-network.
Due to the employment of rapid transit systems in the hub-level and spoke-level sub-networks, the proposed model relaxes some of the common assumptions and properties in classical hub location models.In this regard, direct connections are allowed between spoke nodes, hub and spoke nodes and edges have considerable (nonzero) setup costs, both hub and spoke edges have capacity constraints, the hub-level sub-network is not necessarily a complete network, and paths between origin-destination (OD) pairs do not necessarily contain at least one and at most two hubs.One of the important characteristics of the model is that, unlike generic hub models, it provides possibility not only for direct links between spoke nodes but also for the transshipment of flows at spoke nodes.Given that rapid vehicles usually rout in lines, in the hub network topology, both hub-level and spoke-level sub-networks are considered to be composed of multiple lines.In addition to designing the network, which involves decisions on the location of hub and spoke nodes and the selection of hub and spoke edges, the goal of the research is to simultaneously determine hub and spoke rapid transit lines and the way of routing OD demand flows through these lines.
It should be noted that the proposed model possesses the key distinct features of HLPs, including (a) service demand is associated with flows between OD pairs, (b) hub facilities are intermediate nodes on the OD paths which serve as transshipment or consolidation points, (c) there is the benefit of routing flows via hubs; as a requirement, it holds true in situations where the origin and the destination are not located in the same spoke line, and (d) there is a cost-based or service-based objective that depends on the design of the hub network (namely the location of hubs and the selection of links) and the routing of flows [13].
Classical hub location problems are modeled mainly from the viewpoint of costs.They aim to minimize the total cost of a network to satisfy every demand.However, from a profit point of view, it may be more advantageous not to serve every demand.This issue is of importance especially in designing RTNs where setup costs are considerable.Therefore, this research seeks not only to minimize the costs but also to maximize the profit (i.e., revenues minus costs) by assuming no force to satisfy all the demands.In public transportation systems, in addition to the profit of the system operators (managers), the quality of service should also be enhanced to satisfy the users (passengers).As such, this study addresses the maximization of the profit (from the operator's point of view) and the minimization of the service time (from the user's point of view).
Figure 1 illustrates a hub-based RTN.Hub and spoke nodes are presented with filled squares and circles, and hub and spoke edges are shown with wide and narrow links, respectively.The unfilled circles show the demand centers which are not selected to be serviced.In this study, all the demand centers are considered as hub and spoke candidates.In the network shown by this figure, nodes 11 and 18 are not serviced, nodes 8, 14, 15, 16, 17, 19 and 20 are hub stations, and the remaining nodes are spoke ones.The network consists of two hub lines, determined by line paths along stations (17,14,15,16) and (8,15,19,20) and four spoke lines by the line paths along stations (1,5,9,13,14), (2,6,10,15), (3,7,12,16) and (4,8).
An adaptive large neighborhood search (ALNS) algorithm is proposed to solve the developed model.The model is tested and evaluated on instances derived from two well-known datasets for hub problems.The performance of ALNS is compared with that of the CPLEX solver on small-size instances.The statistical results have proved the efficiency of the proposed algorithm.The resulting hub networks are analyzed under various parameter settings and some managerial insights are given.
The RTN design model proposed based on hubs and spokes has potential applications in designing new RTNs such as subways, BRT systems and multimodal systems in large cities, as in developing countries.It can also be used to improve the existing systems, serve as a decision support tool to evaluate the prospective locations of hub and spoke stations and links, and decide on optimal shipment strategies by considering the criteria for the corresponding operators and users.
The main contributions of this study are as follows: (a) A mathematical model is introduced to design a rapid transit network based on a hub and spoke model with stopovers (stations) in the hub and spoke alignments.Rapid transit systems are employed in the hub-level and spoke-level sub-networks.(b) The hub-based rapid transit network is designed by decisions about the location of both hub and spoke nodes and the selection of both hub and spoke edges simultaneously.In this regard, the capacity constraints in the hub and spoke edges are taken into account.The model relaxes some of the common assumptions and features of classical hub location models, as the transshipment of flows in the spoke nodes is possible, and the setup costs of all the hub and spoke nodes and edges are considerable (non-zero).(c) In addition to the decisions on the network design, decisions are made to determine the hub and spoke lines and the routing of flows through them.From the viewpoint of the hub network topology, this study is based on multiple lines topology for both hub-level and spoke-level sub-networks.(d) The criteria for the operators and users are incorporated, i.e., the maximization of profit (revenues minus costs) with no force assumed to service all the demand nodes and the minimization of transit times.(e) An ALNS algorithm is developed to tackle the computational complexity of the problem for real-size instances.
The remainder of the paper is organized as follows.The next section presents a survey of the relevant literature.In Section 3, the problem is defined, and a mixed-integer linear programming model is formulated for it.The ALSN algorithm is presented in Section 4, and the results of the computational experiments are reported in Section 5. Finally, Section 6 provides some concluding remarks.

Literature review
This section presents a review of the relevant literature in the two contexts of hub location and rapid transit network design problems.
Five common assumptions underlie most HLPs.They include (a) direct connections between non-hub nodes are not allowed, (b) the costs of edges satisfy the triangle inequality, (c) the discount factor is constant for hub edges, (d) hub edges have no setup cost, and (e) spoke edges have no setup cost.These five assumptions, without any other restrictions, imply two properties in generic HLPs.First of all, all hubs are fully interconnected by the hub edges, i.e., a hub-level sub-network is a complete network.Secondly, the flows between OD pairs visit at least one and at most two hubs in a hub network in order to minimize the transportation costs.In new approaches, researchers have tried to relax these assumptions in order to make their models realistic enough to be employed in real-world problems.The relaxation of completeness of hub-level sub-networks has attracted significant attention in recent years; it has not been the case for many real-life networks.For example, in the case of transportation networks where the construction cost of edges is not negligible, the completeness of a hublevel sub-network is not based on reality.In some incomplete hub models, no particular topological structure is considered for the hub-level sub-network, not even being connected [7,8], while some other models have particular structures including circle [15], tree [14], star [31] and simple paths or lines [40,41].As poke-level sub-network can also have these structures or other particular structures such as direct connections [3,37], multi-stops [30,57], complete sub-graphs [52] and tours [4,16,25,26,28,29,42].The choice of the structure depends on the field of application.For a review of the particular topological structures of HLPs, the reader is referred to [13,30].As already mentioned, due to employing rapid transit systems in the hub-level and spokelevel sub-networks, the model in this study relaxes Assumptions 1, 4, 5 and, thus, the corresponding properties.From the perspective of the hub network topology and to the best of our knowledge, this study is the first one that uses a topology of multiple lines for both hub-level and spoke-level sub-networks.
Initially, hub networks were mainly used in air transportation.In recent years, considerable attention has been paid to hub structures for designing public transportation networks, which usually are multi-modal, especially those with rapid transit modes.Some of the studies in this case are [12,24,27,36,40,41,43,51,54].In these studies, rapid transit modes are usually considered for the hub-level sub-networks, and the spoke-level sub-networks are usually assigned to road vehicles)such as cars and trucks) [40,41], buses [24,43], or ships [21].Like most of these studies, the present research deals with setup costs for hub edges and, thus, incomplete hub-level sub-networks.However, contrary to the existing hub studies and their application in public transportation, to the best of our knowledge, the present study is the first one that decides on the location of hub and spoke nodes and edges simultaneously.It also determines vehicles transit lines in hub-level and spoke-level sub-networks with capacity constraints in spoke and hub edges, considering both operator and user criteria.
There are not many studies in the literature focusing on maximization objectives in hub location problems especially in non-competitive situations (see e.g., [2,27,44,49,53]).Like studies such as [2,44,49,53], this research focuses on profit maximization with no concern about all the demand nodes to be necessarily served.
As already mentioned, in recent decades, there has been an increasing interest in rapid transit systems.An RTN design includes two intertwined problems, namely the determination of alignments and the location of stations [13,33].In the recent literature on RTN design problems, different aspects of the problem have been captured by optimization models and solved by operations research techniques.In this context, initial efforts were oriented toward determining a single alignment and locating stations given an alignment [32,46].In recent years, however, more realistic cases have been investigated.Some of them, such as [22,23], have dealt with pre-assigned topological configurations, e.g., a star, a triangle and a cartwheel, to design a network with a single mode of rapid transportation and determined optimal rapid transit lines through pre-assigned corridors.Some others such as [20,34,35,38,39,48] have investigated RTN designs by determining rapid transit lines, without considering pre-assigned corridors.Tactical decisions have also been incorporated to strategic designs, and integrated RTN design and line planning have been addressed in some problems [9,10,17].Instead of using predefined configurations, the present study designs an RTN based on hub and spoke structures with the possibility of stops (stations) for vehicles in the hub and spoke alignments.In addition to the RTN design, rapid lines are determined in the hub-level and spoke-level sub-networks which constitute the first phase of the line planning process.In RTN design studies, both operator-based and user-based criteria are taken into consideration.The objective functions commonly used in these studies include minimizing costs [5,38,56], minimizing transit times [5,6,19], and maximizing covered demands [20,32,38].This research also includes both operator-based and user-based criteria by maximizing the total net profit (revenues minus costs) and minimizing the total weighted travel time.
A summary of the main features of the most related studies conducted in the fields of hub location for public transportation networks and RTN design is presented in Table 1.For comparison, the last row of the table presents the characteristics of this study conducted on hub location for RTN design.

Model and formulation
In this section, a framework is developed to model the problem of designing a rapid transit network based on hub location.The problem is called a hub-based RTN design problem.It is concerned with locating hub and spoke nodes, allocating spoke nodes to hubs, locating hub and spoke edges, constituting hub and spoke transit lines, determining the percentage of OD demands to be served, and determining the way of routing the demand flows through the network.The objectives of the model are maximizing the total net profit and minimizing the total weighted travel time.Certain capacity constraints are also postulated in the hub and spoke edges.The main assumptions of the developed mathematical model are as follows: -Each installed spoke node is located in one and only one spoke line.
-The flows can be routed through the included lines.
-Each spoke node is allocated to at most one hub node.
-Without loss of generality, there are the same lower and upper bounds for the length of all the lines, and a different upper bound to the number of nodes exists for the hub and spoke lines.-In order to calculate the objective functions correctly, all the data related to the costs, revenues and demand flows should be scaled in a same time horizon.For example, if the demand flows are defined for a year or a day, the routing costs and revenues are annual or daily.-Without loss of generality, the underlying or potential graph used as a basis for building RTN is defined by a set of potential nodes to locate stations and all the possible edges among them.
As a result of these assumptions, the flows can change their lines only at the hub nodes.In other words, if the origin and destination of an OD pair are not located in a same line, the corresponding flow has to be routed through the hub-level sub-network.The hub-based RTN design model involves the following sets, parameters and variables.

𝑤 𝑚𝑛
The demand flow from node  to node .FCh  The construction cost of hub edge {, }.

FCs 𝑖𝑗
The construction cost of spoke edge {, }.ch The construction cost of hub node .cs The construction cost of spoke node .

𝑐 𝑖𝑗
The transportation cost per unit of flow traveling through edge {, }.

𝛼
The discount factor for the transportation cost at the hub edges.

𝑟 𝑚𝑛
The revenue per unit of flow traveling from node  to node  (sum of the ticket price and the government subsidy) ℎ The upper bound on the number of lines which can cross any hub edge.lh max The maximum number of the possible hub lines (lh max = |LH|).ls max The maximum number of the possible spoke lines (ls max = |LS|).hMax The upper bound on the number of the hub nodes per hub line.sMax The upper bound on the number of the spoke nodes per spoke line.dMax The upper bound on the length of each line.dMin The lower bound on the length of each line.

Caph
The maximum allowed flow in the hub edges.

Caps
The maximum allowed flow in the spoke edges (the assumption is Caps < Caph).

𝑡 𝑖
The time of transfer between the lines in node  per unit of flow.ah The access time to enter a hub line.as The access time to enter a spoke line.eh The exit time to leave a hub line.es The exit time to leave a spoke line.ℎ The average speed of trains in the hub lines (ℎ = (1 + (1 − )) • ).

𝜆𝑠
The average speed of trains in the spoke lines.

𝑀
A sufficiently large scalar.If there is at least one hub line other than hub line  included in the network, it is equal to 1, otherwise 0.

Flow routing variables
fh

𝑚𝑛𝑖𝑗
The proportion of the flow from  to  pathing through directed hub edge (, ) in hub line .fs

𝑚𝑛𝑖𝑗
The proportion of the flow from  to  pathing through directed spoke edge (, ) in spoke line .

𝑓 𝑚𝑛
The proportion of the flow from  to  serviced by the RTN.

𝜏 𝑚𝑛𝑖
The proportion of the flow from  to  changing the line in node .

Modeling of the hub-based RTN design problem
Considering the assumptions and notations in the previous section, the bi-objective formulation for the hubbased RTN design problem is as follows: Network design and line determination constraints: Connectivity of the network constraints: Flow routing constraints: Bounding and sign constraints: Objective function (3.1) maximizes the net profit which is calculated by the subtraction of the total costs, including the transportation costs and the installation costs of the hub and spoke nodes and edges, from the total revenue obtained through satisfying the demands.The revenue is considered as the sum of the ticket price and the government subsidy.The objective function (3.2) minimizes the total weighted travel time.Constraints (3.3) and (3.4) guarantee that each installed node and edge can only be a hub or a spoke.Constraints (3.5) and (3.6) are added to the model in order to define the variable   correctly.Constraints (3.7) hold that each installed node is assigned to one and only one hub node.Each hub node is assigned to itself.Constraints (3.8) enforce each included spoke line to be assigned to one and only one hub node.Constraints (3.9) guarantee that spoke lines assigned to a particular hub node can only consist of the spoke nodes which are assigned to that hub node.Constraints (3.10) mean that a hub station is selected to be the station of a line only if it is already built in the network.Constraints (3.11) ensure that a spoke station is selected to be the station of a line only if it is already built in the network and that it cannot be in more than one spoke line (i.e., flows can change their line only at the hub nodes).Constraints (3.12) and (3.13) enforce an edge to be included in a line if it is already built in the network.Constraints (3.14)-(3.17)ensure that an edge is built in the network only if the adjacent nodes are already built.Constraints (3.18) impose an upper bound on the number of the lines that can circulate at any hub edge of the network, i.e., they prevent a concentration of lines at hub edges (constraints (3.11) enforce this bound to be equal to one for the spoke edges).Constraints (3.19) and (3.20) postulate a maximum number of nodes for each line and impose that no node or edge can be part of a non-included line.Constraints guarantee that the hub node corresponding to a specific spoke line is a node of that line and the line is connected to the hub-level sub-network through that hub node.In fact, these constraints ensure the connectivity of the spoke-level sub-network to the hub-level sub-network and make sure that all the spoke lines are connected to the hub-level sub-network through their corresponding hub nodes.According to constraints (3.25) and (3.26), one can consider  = 2 for the constraints already enumerated.Constraints (3.32) postulate connectivity for the hub-level sub-network and mean that each hub line must share at least one node with at least one other included hub line.Constraints (3.33) are added to the model in order to define the   variables correctly.Constraints (3.34) and (3.35) mean that, if the flow corresponding to an OD pair uses an edge of a line, the edge must already be selected to be a part of that line.Constraints (3.36)

Linearization of the model
To linearize constraints (3.9), the binary variables are added to the model, and these constraints are replaced with the following ones.
To linearize constraints (3.32), the binary variables are added to the model, and these constraints are replaced with the following ones.

Solution algorithm
This section presents the proposed solution to the hub-based RTN design problem.First, the weighted sum method is used along with normalization in order to transfer the bi-objective problem into a single-objective one.Then, considering tow interconnected sub-problems, an ASLN meta-heuristic algorithm is described to solve the single-objective problem.

The weighted sum method with normalization
With regard to the concavity of the Pareto frontier of objective functions  and − for different instances, the weighted sum method is used with normalization in order to handle the bi-objective model proposed in this study.This method is a typical way to solve bi-objective problems, and it works by associating a weighting coefficient with each objective function and optimizing the weighted sum of the objectives.Using the weighted sum method with normalization, the bi-objective problem has been transformed into the following single-objective one: where [0, 1] is the weighting factor which denotes the relative importance of  and  ,  max and  min are the optimal values of the problem with single objective functions  and  , respectively, and  min and  max are the values of functions  and  in the optimal solutions for  min and  max , respectively.
Since  min =  min = 0, the problem is equivalent to the following hub-based RTN design problem called HRTND.The HRTND problem is solved by the proposed meta-heuristic algorithm described in the next subsection.

An adaptive large neighborhood search method
In order to solve real-size instances of the HRTND problem efficiently, an ALNS meta-heuristic algorithm is proposed here.It is capable of jointly handling the network design, line determination and flow routing decisions.The ALNS meta-heuristic algorithm was first introduced by Ropke and Pisinger [47] in the context of vehicle routing problems and other problems with similar characteristics.It is based on the idea of improving an initial solution iteratively by applying a succession of destroy and repair operators.The success of operators in improving the best found solution enhances their probabilities of being selected.The ASLN algorithm belongs to the category of large-scale neighborhood search techniques defined in [1], but it only examines a relatively low number of solutions.The adaptive part of the algorithm is given by the dynamic updating of the probabilities.The main advantages of the algorithm include the capacity of exploring large parts of the solution space in a structured way, robustness, low probability of getting trapped in local optima, and easy calibration of its parameters [45].For a survey of ALNS algorithms and applications one can refer to [55].In the following, the procedure of implementing the proposed ALNS algorithm is explained in detail.
In the presented problem, the ALNS starts with an initial feasible network solution formed by one or more connected lines that satisfy the problem constraints, i.The new solution is accepted with a simulated annealing criterion by comparing its objective value to that of the current solution.The operators are chosen with a certain probability which depends on their performance in the past iterations, and they are applied on the lines which are selected randomly.The ALNS updates the current and best-known solutions as well as the scores of the last applied operators.The algorithm stops when a stopping criterion (i.e., the maximum number of iterations, the final temperature or the allowed maximum time) is satisfied.These main steps are schematically shown in Figure 2.
In the following, the main components of the ALNS algorithm are described, and then the pseudo-code of the algorithm is presented.

The routing sub-problem
Once a hub RTN is built through the ALNS algorithm, the routing sub-problem is solved to determine the optimal paths of the OD pairs that maximize the weighted sum of the total net profit and the total time and satisfy the capacity constraints.This problem is a linear program and can be solved efficiently by LP solvers such as CPLEX.However, it can actually be solved more efficiently if the number of its variables is reduced.In this case, one can write the variables fs   in terms of other variables, i.e., the variables fh   and   , as a result of fixing the integer variables and the assumptions mentioned in Section 3. The corresponding formulation includes some notations as defined below: OD  The set of OD pairs in which both the origin and destination nodes are located in the same spoke line.Using the above simplified version of the problem reduces the number of variables from 2 4 +  2 to  4 +  2 .As a result, the time required to solve the routing sub-problem reduces considerably, which plays an important role in the efficiency and effectiveness of the proposed ALNS algorithm as this sub-problem is solved in every iteration of the algorithm.

Initial network solution
In this section, a method is proposed to make an initial feasible hub-based RTN solution which consists of several connected hub and spoke lines.This method involves the following three steps.
(1) Determining the location of hubs.
At first, a number  is chosen randomly from the interval (1, hMax × lhMax] as the number of hubs.Then, an efficiency criterion defined as , is calculated for each node , and the  nodes with the highest efficiencies are selected as the hub nodes. (2) Finding the allocation of nodes.
After selecting  hub nodes, the rest of the nodes are allocated to these hubs.The objective of the optimum allocation model is to minimize the total demand weighted distance.The mathematical framework is as follows.) are needed for defining the variable   correctly.In the initial network solution devised in this study, each group of the spoke nodes assigned to a hub together with that hub constitutes a spoke line.Constraint (4.17) sets the number of these groups to the minimum value of  and lsMax.
The spoke lines are determined through a few steps.At first, among the group of nodes constituting a spoke line, the farthest node from its associated hub is selected as one of the extreme nodes of the line.The other extreme node is the associated hub node.This selection is based on the Euclidean distance.Then, the shortest line passing through the nodes of the group and connecting the defined extreme nodes is considered as the spoke line.After each spoke line is determined, the feasibility of the resulting network solution is checked and the line violating the lower and upper bounds on the length of the line and the upper bound on the number of the nodes of the line is removed from the network.To determine the hub lines, the hub nodes are classified in [/hMax] groups.To constitute the first hub line, one of these groups is selected and the tow farthest nodes of it are considered as the extreme nodes of the line.Then, the shortest line passing through the nodes of the group and connecting the defined extreme nodes is considered as the first hub line.For determining each of the other hub lines (if needed), one of the remaining groups is selected and its hub nodes together with a hub node randomly selected among the nodes of the last included hub line, are considered as the nodes constituting the line.Among these nodes, the two farthest nodes are considered as the extreme nodes of the line.Then, the shortest line passing through the nodes and connecting the defined extreme nodes, is considered as the hub line.After each hub line is determined, the feasibility of the resulting network solution is checked and the hub line violating the lower and upper bounds on the length of the line and the upper bound on the number of the nodes of hub line are removed from the network.When a hub line is removed, its associated spoke lines are also removed.

Destroy and repair operators
There are six operators involved in the implementation of the presented ALNS.They include two destroy operators, two repair operators and two operators combining both types.The repair operator inserts a new line or extends an existing one.The destroy operator removes part of a line or a full line.One of the combined operators removes a line and then inserts a new one.The other one removes part of a line and then extends an existing one.In the following, these operators are addressed with more details.
This operator randomly inserts a new line in the network as follows.At first, it randomly decides if it inserts a hub or a spoke line.To insert a hub line, an integer number  is chosen randomly from the interval [1, min{hMax,  fh }], where  fh is the number of the free (non-hub and non-spoke) and the hub nodes of the existing network.Then,  points are randomly chosen among the free and hub nodes, if possible.Finally, considering two farthest nodes as extreme ones, the shortest line passing through the chosen nodes is added to the network if it satisfies the lower and upper bounds on the length of the line, the upper bound on the number of the nodes of the line, not contained in other lines nor containing other lines, upper bound on the number of lines passing through each edge, and the connectivity of the resulting network.
To insert a spoke line, the number  is randomly chosen from the interval [1, min{sMax,   }], where   is the number of the free nodes.Then,  nodes are randomly chosen among the free nodes as the spoke nodes of the line, if possible.The hub node associated with the line which is one of its extreme nodes is randomly chosen among the hub nodes, and the farthest chosen node from that is considered as the other extreme node.Finally, the shortest line passing through the chosen nodes is added to the network if it satisfies the upper and lower bounds on the length of the line.
(2) Extend-line operator.This operator extends a randomly chosen line.To extend a hub line, the number  is randomly chosen from the interval [1, min{ fh , hMax −   }], where   is the number of the nodes of the line.Then,  nodes are randomly chosen among the free and hub nodes, if possible.Finally, the shortest line passing through the chosen nodes and one of the extremes of the existing line, which is chosen randomly, is added to the line if it satisfies the upper bound on the length of the line, the upper bound on the number of the nodes of the line, not containing other lines, and the upper bound on the number of the lines passing through each edge.
To extend a spoke line, the number  is randomly chosen from the interval [1, min{  , sMax −   }].Then,  nodes are randomly chosen among the free nodes, if possible.Finally, the shortest line passing through the chosen nodes is added to the chosen spoke line if it satisfies the upper bound on the length of the line.
(3) Remove-line operator.This operator randomly removes an existing line, if the resulting network, after removing the line, remains connected.
(4) Remove-part-line operator.This operator randomly removes part of a line.First, an existing line is randomly chosen to be partially removed.For the chosen hub lines, the operator removes part of the line between an intermediate node and a terminal one which are randomly chosen.The operator is applied if the new line satisfies the lower bound on the length of the line, not being contained in the other lines, and the connectivity of the resulting network.
For the spoke lines, the operator removes part of the line between an intermediate node randomly chosen and the spoke extreme node.The operator is applied if the new line satisfies the lower bound on the length of the line.
This operator first removes a line using the remove-line operator and then inserts a new line using the insert-line operator, if possible.( 6) Remove-part-line and Extend-line operator.This operator first removes a part of a line using the remove-part-line operator and then adds a part to a line using the extend-line operator, if possible.

Selection of operators
Let   be the weight for operator , which measures how well the operator has performed recently.The probability of choosing the operator is calculated by   / ∑︀    .In each iteration, the weights are adjusted by means of scores.The scores measure the contribution of the operators to the improvement of the objective function in a block of  iterations.
The score of each operator is initially set to zero in the beginning of the segment.It can be increased at each iteration by  1 ,  2 or  3 as follows.If the applied operator results in a new best solution, its score is increased by  1 .If it results in a new solution worse than the best solution and better than the incumbent one, its score is increased by  2 .If it results in a solution that is worse than the incumbent one but it satisfies the acceptance criterion, its score is increased by  3 .In practice,  1 ≥  2 ≥  3 .After each block of  iterations, all the scores are reset to zero.The weight for operator  can be updated through where   is the score for operator ,   is the number of times that operator  was used in the last segment, and  is the reaction factor parameter which allows controlling how fast the weights change according to the performance in the last block.

Acceptance and stopping criteria
A new solution is accepted according to a simulated annealing acceptance criterion.That is, given the current solution  with an objective value of (), the new neighboring network solution  ′ is accepted if ( ′ ) > ().Otherwise, it is accepted with the probability of  (( ′ )−())/ where  > 0 is the temperature.The temperature starts with the value  =  0 , but, after a certain number of iterations, it decreases at the cooling rate of 0 <  < 1.The parameter  0 is set in a way that the neighboring solution  ′ , which is at least % worse than the current solution, can be accepted with the probability of 50%.The parameter is calculated as follows: 0 = (( ′ ) − ())/ ln 0.5.(4.19) The execution of the ALNS algorithm ends when a given maximum number of iterations is reached, a given final temperature is attained or the running time exceeds a given threshold.

The ALNS algorithm
In this sub-section, the pseudo-code of the proposed ALSN algorithm is given for more insight (see Algorithm 1).While the selected operator cannot be applied do Select an operator according to   ,  = 1, . . ., 6.
Attempt to apply the selected operator on the current solution  RTN in order to obtain the new network solution  ′ RTN .End Update the number of times the operator is used.
Set  =  ′ , () = ( ′ ), and   =   +  2 .Else, if  ′ is accepted by the SA criterion, then Set  =  ′ , () = ( ′ ), and   =   +  3 .If the number of the while iterations is a multiple of , then update the weights of all the operators according to (4.18) and reset their scores to 0. If ( ′ ) <  (︀   )︀ and  0 is adjusted, then set  =  • .End Output: It consists of the obtained network infrastructure, a set of lines by their itinerary, the values of the satisfied demands as well as their routing, profit, and time.

Computational experiments
This section presents the results of the computational experiments performed to assess the performance of the proposed ALNS algorithm and to analyze the resulting hub RTN networks.
In the first part of the experiments, the ALNS algorithm was compared to the exact branch-and-cut procedure implemented in CPLEX (v12.10), for the HRTND problem.In the second part of the experiments, an analysis was done to check how discount factor and revenue parameters affected the characteristics of the resulting hub-based RTN networks.
All the algorithms were coded in C++ using the Concert Technology of IBM ILOG CPLEX 12.10 to solve linear programs.All runs were implemented on a personal laptop with the Intel R ○Core TM i5-8250U CPU @ 1.60-1.80GHz processor and Windows 10 with 8 GB of RAM memory.
The computational experiments were conducted with two well-known datasets adopted from the literature of hub location.They included the Australia Post (AP) introduced by Ernst and Krishnamoorthy [18] and the Turkish network dataset which contained the data on travel distances between Turkish cities [50] and the flow between them calculated by the method presented in [11].The AP dataset provided a symmetric OD demand matrix, while the Turkish network dataset provided an asymmetric one.
In this study, the OD demand and distance matrices provided by the AP and the Turkish network datasets were used, and the values of the other model parameters were set on their basis.As already mentioned, in order to calculate the objective functions correctly, all the data related to costs, revenues and demand flows had to be considered on the same scale.Therefore, the objective functions were calculated in a time horizon of  years.To this end, the demand matrices of the datasets were used to estimate the related demand flows in  years as formulated below.For the AP dataset,   , which denotes the number of passengers from  to  in  years, was calculated as where  AP  is the demand flow of  to  taken from the AP dataset and  AP is the estimated total number of the yearly trips.For the Turkish network dataset,   was calculated as where    is the demand flow of  to  taken from the Turkish dataset.This parameter is equal to   . /( ∑︀  =1   −   ), where   is the population of node , according to [11].Also   is the estimated average number of the yearly trips for each person.The distance matrix of the model was considered equal to the In cases where a set of values is given for a parameter, the desired value is selected randomly from the set, unless it is explicitly mentioned.
distance matrices provided by the datasets (in terms of km).The remaining parameters were set as reported in Table 2.The values in this table were used in all the experiments, unless it is explicitly mentioned.For the ALNS algorithm, the score parameters were set to  1 = 10,  2 = 5 and  3 = 2.The cooling rate was set to  = 0.9997, the reaction factor to  = 0.3, and the number of iterations for updating the weights of the operators to  = 50.Furthermore, the initial temperature was set at  0 = 10 000 at first, and then it was updated during the algorithm implementation considering  = 0.5.Finally, the termination criteria were the temperature ( ) falling lower than 0.0001, a time limit of 36 000 s (10 h), or a maximum number of 2000 iterations.

ALNS algorithm performance
In order to gain an insight into the performance of the ALNS algorithm, its capability to reach an optimal solution was investigated through comparing it to the CPLEX solver.
In this subsection, experiments were conducted with  ∈ {10, 20, 30} for the AP and Turkish network datasets.For each number of , some instances were solved with two values of the maximum number of the hub and spoke lines, i.e., (lh max , ls max ) ∈ {(1, 2), (2, 4)} for  = 10, (lh max , ls max ) ∈ {(2, 4), (3,6)} for  = 20, and (lh max , ls max ) ∈ { (3,6), (4,8)} for  = 30.In each case, the weight of the profit function was varied with regard to   ∈ {0.25, 0.50, 0.75, 1.00}, and the discount factor was fixed to 0.7.This experimental framework was dealt with in 24 instances.In all the experiments, a CPU time limit of 10 h was used for the CPLEX solver, as for the ALNS algorithm.In order to better compare the solutions produced by the methods, the parameters  max and  max were calculated via the CPLEX solver for instances with  = 10, where CPLEX could reach the exact optimal solution.For the remaining instances, the parameters were determined via the ALNS algorithm, which usually yield better values for the parameters.
Tables 3 and 4 present the computational results of comparing the ALNS method and the CPLEX in terms of the computational time and the quality of the solutions.The comparisons were performed for AP and Turkish network datasets.In order to compare the solutions produced by the two methods, a performance measure was reported in the "Dev (%)" columns.It was the percentage of the deviation between the solutions found by the ALNS method and the CPLEX, considering only the instances for which the CPLEX obtained a feasible solution.This measure was calculated as 100  CPLEX are the objective values obtained by the CPLEX and the ALNS, respectively.In situations in which both objective values are equal to zero, the measure is defined equal to zero.The "CPU Time" columns present the consumed CPU times of the methods in seconds.Also, the "" columns present the obtained objective values of the methods.Concerning the ALNS execution, three different experiments were conducted for each instance, the average results of which are reported in Tables 3 and 4.
According to the tables, the ALNS outperforms the CPLEX in part as a consequence of the big size of the mixed integer programming model.The advantage of the ALNS increases as the problem becomes more difficult, i.e., when the nodes and/or lines increase in number.Note that the CPLEX failed to solve the instances with 30 nodes for both datasets, which was due to excessive memory requirements.These out-of-memory cases are indicated by "OM".According to the reported optimality deviations, the CPLEX presented better solutions for the instances with 10 nodes, while the ALNS performed better for the other instances.In more than 93% of the cases, the percent deviation between the two objective values was less than 15%, which proves the efficiency of the ALNS algorithm in solving the HRTND problem.More precisely, as the tables suggest, the ALNS method yielded good solutions with the percent deviation of −16.6% on average.Furthermore, the ALNS algorithm solved the problem faster on average.The superiority of the ASLN over the CPLEX increased as the problem became larger.This behavior was similar for both datasets.It can be concluded that the use of the commercial CPLEX solver is not adequate for solving complex HRTND problems.
Figures 3 and 4 represent a detailed comparison of the processes of convergence to the best solutions obtained by the ALNS and CPLEX methods through the experiments reported in Tables 3 and 4, respectively.The vertical axis represents the running time, whereas the horizontal one corresponds to the value of the objective function.The results related to the mean objective values obtained in three runs of the ALNS method are presented in the figures.As it can be seen and as already mentioned, the ALNS method could reach good solutions within much less time than the CPLEX.More precisely, the ALNS averagely captured more than 85% of the best objective value obtained by both methods in less than 25% of the maximum CPU time of the methods, which means its convergence speed is very high.The figures also suggest that the ALNS method usually begins with a considerably better initial solution than the CPLEX, which confirms the affectivity of the method used to determine an initial solution.
In a part of the numerical investigation, the Pareto frontier was plotted for some instances by changing the weights of the objective functions, through which the frontier emerged to be concave.This means that using the weighted sum method is reasonable for handling the bi-objectivity of the problem addressed in this study.The Pareto frontiers of the experiments with  = 10, ℎ = 1 and  = 2 for the AP and Turkish network datasets are given in Figure 5 as some examples.

Analyzing the network design
In the second part of the experiments, analyses were performed of the discount factor, , and the revenues (sum of the ticket prices and the government subsidies),   s, as two important parameters which affect the design of hub-based RTN networks and can be determined based on decision makers' policies.To this end, the obtained solutions were investigated mainly in terms of the total net profit and the total travel time values, the percentage of the served demands and the topology of the network.Then, some managerial insights were given.
In this subsection, the three experiments explained in the previous subsection are referred to again.They concern problems with the size of  = 10, lh = 1, ls = 2,  = 20, lh = 2, ls = 4 and  = 30, lh = 3, ls = 6 for the AP dataset with   = 0.5.These experiments correspond to the second, 10th and 18th rows of Table 3, and their best solution network configurations are presented in Figure 6.The thick lines represent the hub lines (different hub lines are given in different dashed types), and the narrow ones represent the spoke lines.The nodes that are not connected to the network are the ones which are not serviced.For each experiment, some instances were solved with the varied values of the parameter under investigation but the fixed values of the other parameters.In each case, the values of  max and  max were the same as those in the experiments of the previous subsection.For the instances with  = 20 and  = 30, where there was no optimal solution by the exact solver CPLEX, the solution provided by the ALNS algorithm was used.In these cases, the best solutions found in three runs are reported here.
Table 5 gives the results of the experiments conducted to investigate the effect of .For each size of the problem, the next four columns indicate the corresponding values for the discount (1 − ), the total percentage of the satisfied demand, the total net profit, and the total travel time, respectively.The results show that the  3. decision maker can obtain significantly more profit when the discount is higher due to the economies of scale.The choice of this parameter does not seem to have a decreasing or increasing effect on the value of the total time and the percentage of the captured demand.Figure 7 presents the resulting network configurations.As it can be seen, the ratio of the total length of the hub lines to that of the spoke lines becomes greater or remains the same when the discount 1 −  increases.In detail, this ratio is 0.54 for the first three rows of the table, 0.45, 0.58, and 0.67 for rows 4-6, and 0.71, 0.78, and 0.94 for rows 7-9.
Similarly, Table 6 presents the results of the experiments in the investigation of the effect of   .For this investigation, the relation   =   was taken into account, and the parameter  was varied as given in Table 6. Figure 8 presents the resulting network configurations.As understood from Table 6 and as expected, an increase in the revenue parameter  leads to the rise of the percentage of the satisfied demand and the total net profit.It also increases the total time of transportation.This can be because having more satisfied passengers may result in exceeding the capacity of edges on shorter OD paths and the necessity of routing the passengers in longer ones.According to Figure 8, the parameter has a significant impact on the network topology.Lower revenues give rise to shorter networks with lower numbers of stations and edges.This is because as revenues and, therefore, satisfied demands decrease, high investment for building a large number of stations and edges cannot be profitable.5.
Using the computational results, the effectiveness and efficiency of the proposed model were validated, and useful insights were provided into the interactions among the different aspects of the studied complex decision problem.From these results, one can conclude that different planning decisions are jointly involved in obtaining good solutions.This shows the importance of considering different aspects and phases of a planning process in order to obtain good solutions for real-life situations.The proposed model can, thus, be employed to design new rapid transit systems successfully as different decision-making parameters are taken into consideration.The model can also be used as a decision support tool to improve or extend the existing systems through investigating the effects of important parameters on the resulting networks while considering the criteria of net profit and time.For example, as mentioned, a producer can use more efficient vehicles in hub lines and give a more discount to gain more profit without effecting the total time significantly.In addition, an increase in the revenue (i.e., the sum of the ticket prices and the attracted governmental subsidies) raises both the total net profit and the time.Therefore, one can decide about these parameters based on the degree of the importance of the two criteria.Of course, for more accurate analysis, new tactical decisions (e.g., determining the optimal fleet size considering the fleet acquisition costs) or elastic demands (e.g., price sensitive demands) may be incorporated.These issues are outside the scope of this paper, as the goal is here to propose an efficient basic model of designing hub-based RTNs; the issues may be picked to study in future research.
As the computational results in this subsection suggests, a network achieved with reasonable data and a practical-size problem (see parts b and c in Fig. 6) has a configuration close to well-known ones, e.g., star and triangular networks.However, the model proposed in this paper is more general than those based on pre-assigned configurations for designing RTNs.There is, indeed, no need to consider a special pre-assigned configuration when establishing a new RTN.Moreover, as already mentioned, the model introduced in this paper can be employed to analyze the effect of different changes in existing RTNs in order to improve or extend them.

Conclusion
In this study, a novel bi-objective mathematical programming model is presented for designing a rapid transit network based on a hub and spoke model.The network consists of stopovers (stations) in the hub and spoke alignments.In the proposed hub location model, both hub-level and spoke-level sub-networks are composed of multiple lines.The model relaxes some of the usual assumptions in classical hub location models by considering the possibility of the transshipment of the flows in spoke nodes (in addition to hub nodes), the setup costs for hub and spoke nodes and edges, and the capacity constraints for hub and spoke edges.In addition to determining the network infrastructure, which is done by making decisions about the location of hub and spoke nodes and the selection of hub and spoke edges, the model simultaneously determines hub and spoke lines and routes of flow to transport the demand between OD pairs.The model incorporates profit maximization (by allowing a portion of demands to be unserved) and service time minimization objectives in order to manage the interests of both the service provider and the operator.
Relaxing the assumptions and making different decisions to design a hub-based RTN is supposed to lead to better solutions for real-word situations.However, this implies a significant level of difficulty in the problem formulation as well as the computational procedure needed to solve it.Therefore, to solve realistic-size instances of the problem, a meta-heuristic algorithm was developed on the basis of an ALNS method.In the proposed solution algorithm, for an upper-level sub-problem, the network design and the line configuration decisions were managed by the ALNS with the aim of maximizing the weighted objective function.Also, in each iteration of  6.
the ALNS, a lower-level sub-problem was solved by the CPLEX to make flow routing decisions for the current network solution.
In this study, some computational experiments were performed on the well-known AP and Turkish network datasets to assess the performance of the proposed model and the solution algorithm and to analyze the resulting hub networks.The performance of the ALNS algorithm was compared to that of the CPLEX solver.The computational results proved the efficiency of the proposed ASLN algorithm; it was able to find high-quality solutions within short CPU times.According to the results, the increasing difficulty of solving larger-size instances precludes the use of commercial solvers.As another task, the resulting hub networks were analyzed under various parameter settings.The revenues and the discount factor were varied, and analyses were performed on the obtained solutions mainly in terms of the obtained values of the objective functions, the percentage of the served demands and the topology of the network.The results suggest the effectiveness and efficiency of the proposed model, and useful insights are provided into the interactions among different aspects of the studied complex decision problem.
For future research, other algorithms, either exact or meta-heuristic, can be developed, and more detailed numerical investigations are recommended so as to obtain deeper managerial insights.From the modeling point of view, even more applicable models can be found, for example, by incorporating realistic constraints (e.g., the capacity limitation of stations and the constraints on the angles between the consecutive edges of transit lines), tactical decisions (e.g., determining the frequencies of transit lines), elastic demands, alternative modes of transportation and factors of uncertainty.
(3.21) and(3.22)assign maximum and minimum lengths to each line.According to the constraints (3.23) and(3.24),at least one edge must be present in each line.Constraints (3.25)-(3.30)determine the lines topology.Constraints (3.25) and (3.26) enforce each node of a line to have at most two incident edges.Constraints (3.27) and (3.28) make the number of the edges of a line equal to the number of its nodes minus one.Constraints (3.29) and (3.30) are sub-tour elimination constraints which guarantee that no line contains cycles.Constraints(3.31) and (3.37) address the capacity constraints of hub and spoke edges.Constraints (3.38) relate to flow conservation and ensure that the flow from  to  leaves node , arrives at node , and is accounted for whenever a middle node  is used.Constraints (3.39)-(3.41)are added to define the   variables correctly.In particular, constraints (3.40) and (3.41) force variables   to take value 1, if the flow from  to  changes its line at node .Finally, constraints (3.42)-(3.47)are bounding and sign constraints.
e., relations (3.3)-(3.33)and (3.42)-(3.44).The method of defining this initial network solution will be explained in Section 4.2.2.In each iteration, a new network solution is defined through modifying the current one by applying predefined destroy and repair operators while preserving feasibility.The quality of a network solution is evaluated by solving the routing sub-problem defined with relations (3.1), (3.2), (3.34)-(3.41),and (3.45)-(3.47)where integer variables are fixed according to the network solution, using the CPLEX solver.In fact, given a line network, the flow routing sub-problem solves the problem of determining optimal flow of passengers by maximizing the objective function of the line plan.

Algorithm 1 .
Pseudo-code of the ALNS algorithm.Input: It consists of the input data of the hub-based RTN model, the initial network solution, and the value of the parameters related to the ALSN algorithm.Given the initial network solution  0 RTN , solve the routing sub-problem (formulations (4.1)-(4.13)) to obtain the related solution  0 and compute the associated objective function  (︀  0 )︀ using CPLEX.Set  =   =  0 and () =  (︀   )︀ =  (︀  0 )︀ .While stopping criteria are not satisfied do Select an operator according to   ,  = 1, . . ., 6.

Figure 5 .
Figure 5. Pareto frontier for (a) AP and (b) Turkish network instances.

Figure 6 .
Figure 6.Configuration of the networks obtained for the experiments corresponding to the (a) 2nd, (b) 10th, and (c) 18th rows of Table3.

Table 1 .
Comparison of this paper with the most related studies in terms of their characteristics.
If line  is included as a hub line, it is equal to 1, otherwise 0. ls  If line  is included as a spoke line, it is equal to 1, otherwise 0.    If node  is selected to be a spoke node of spoke line , it is equal to 1, otherwise 0.    If node  is selected to be a hub node of line , it is equal to 1, otherwise 0 (If  is a spoke line, then    = 1 means that line  and its nodes are assigned to hub node ).yh   If hub edge {, } is selected to be a part of hub line , it is equal to 1, otherwise 0. ys   If spoke edge {, } is selected to be a part of spoke line , it is equal to 1, otherwise 0.

OD ′ 𝑠
The set of OD pairs minus the set OD  (OD ′  =  ×  − OD  ).The spoke line in which spoke node  is located.  The set of the edges of the spoke line connecting  and  if ,  ∈ OD  , otherwise an empty set.  The set of the edges located in part of the spoke line connecting  to the hub-level sub-network if  is a spoke node otherwise an empty set.  The set of the spoke nodes located in part of the spoke line connecting  to the hub-level sub-network if  is a spoke node, otherwise an empty set.

Table 2 .
Values of the input parameters for the hub-based RTN model.

Table 3 .
Comparison of the ALNS algorithm with the CPLEX for AP instances.

Table 4 .
Comparison of the ALNS algorithm with the CPLEX for AP instances.

Table 5 .
Solution characteristics for different values of the discount factor with the AP dataset.

Table 6 .
Solution characteristics for the different values of the revenue factor with the AP dataset.