A MULTI-OBJECTIVE MULTI-AGENT OPTIMIZATION ALGORITHM FOR THE MULTI-SKILL RESOURCE-CONSTRAINED PROJECT SCHEDULING PROBLEM WITH TRANSFER TIMES

This paper addresses the Multi-Skill Resource-Constrained Project Scheduling Problem with Transfer Times (MSRCPSP-TT). A new model is developed that incorporates transfer times into the multi-skill RCPSP. The proposed model aims to minimize the project's duration and cost simultaneously. The MSRCPSP-TT is an NP-hard problem; therefore, a Multi-Objective Multi-Agent Optimization Algorithm (MOMAOA) is proposed to obtain feasible schedules. In the proposed algorithm, each agent represents a feasible solution that works with other agents in a grouped environment. The agents evolve through their social, autonomous, and self-learning behaviors. Moreover, the adjustment of the environment helps the evolution of agents as well. Since the MSRCPSP-TT is a multi-objective optimization problem, the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) is used in different procedures of the MOMAOA, which is another novelty of this paper. These procedures are utilized for: (1) detecting the leader agent in each group, (2) detecting the global best leader agent, and (3) the global social behavior of the MOMAOA. The performance of the MOMAOA has been analyzed by solving several benchmark problems. The results of the MOMAOA have been validated through comparisons with three other meta-heuristics. The parameters of the algorithms are determined by the Response Surface Methodology (RSM). The Kruskal–Wallis test is used to statistically analyze the efficiency of the methods. Computational results reveal that the MOMAOA outperforms the other three methods according to several testing metrics. Furthermore, the impact of transfer times on the project's duration and cost has been assessed. The investigations indicate that resource transfer times have a significant impact on both objectives of the proposed model.

Mathematics Subject Classification. 90B35, 91B32, 68M20.

Received September 5, 2020. Accepted May 29, 2021.

(1) What is the background of considering resource transfer times for the multi-skill RCPSP? (2) What is the background of using MADM methods in different procedures of multi-agent systems? (3) How can the TOPSIS approach, as an MADM method, be applied to different procedures of a multi-agent optimization algorithm? (4) Are the proposed model and algorithm capable of producing appropriate schedules for projects? (5) How efficient is the proposed MOMAOA in solving test problems of the MSRCPSP-TT compared with other well-known optimizers? (6) What is the impact of transfer times on the objectives of the proposed model?
The main contributions of this research are threefold. First, a bi-objective model is mathematically formulated for the MSRCPSP, where multi-skill workers must travel between different sites to perform their assigned tasks. Therefore, resource transfer times have been considered in the proposed model. Moreover, the budget considered for completion of a project is limited. Second, to solve this model, a multi-objective multi-agent optimization method has been proposed. In this study, the agents are structured in a group organization. Similar to many other optimizers, the proposed method consists of several procedures. This research developed several TOPSIS-based methods for the procedures of the proposed algorithm. These TOPSIS-based methods are used for: (1) finding the leader agent in each group, (2) finding the global best leader agent, and (3) the global social behavior of the MOMAOA. Third, a large number of computational experiments have been conducted to evaluate the performance of the proposed algorithm in comparison with other efficient methods. The algorithms were tuned by means of the RSM.

The rest of the paper is structured as follows: Section 2 surveys the literature on the RCPSP with resource transfer times, the multi-skill RCPSP, and multi-agent systems. Section 3 presents the mathematical formulation of the MSRCPSP-TT. The proposed algorithm is described in Section 4. Section 5 provides the numerical test problems and an analysis of the outputs obtained by the algorithms. Finally, Section 6 concludes the paper and offers some suggestions for future studies.

Literature review
In this section, the studies most closely related to the current research are briefly reviewed. The review covers previous studies on the RCPSP with transfer times, the multi-skill RCPSP, and multi-agent systems.

Previous studies on the RCPSP with transfer times (RCPSP-TT)
Studies on the RCPSP with transfer times are scarce. Krüger and Scholl [42] proposed some methods to solve the RCPSP with transfer times for both single-project and multi-project cases. They modeled both cases as integer linear models and developed a priority-rule based heuristic as a solution approach. In another study, Krüger and Scholl [43] developed a framework for the Resource-Constrained Multi-Project Scheduling Problem (RCMPSP) including transfer times and costs. The proposed framework includes managerial approaches to tackle resource transfers, different resource transfer types, and new roles that can be assigned to resources during these transfers. Poppenborg and Knust [61] developed a Tabu Search (TS) algorithm based on a resource flow representation for the RCPSP-TT.

Previous studies on the MSRCPSP
Different variants of the MSRCPSP have been studied in the literature. Bellenguez and Neron [3] proposed a model in which resources differ in competency. A disjunctive-constrained multi-skill formulation has been developed by Pessan et al. [60] for scheduling maintenance tasks. Gutjahr et al. [22] integrated the learning phenomenon into the project portfolio selection problem and developed an Ant Colony Optimization (ACO) method and a Genetic Algorithm (GA) to find its solutions. Li and Womer [47] combined a mixed-integer linear model with Constraint Programming (CP) and presented a Hybrid Benders Decomposition (HBD) for the multi-skill RCPSP. To eliminate resource conflicts, a cut-generating scheme has been developed. Heimerl and Kolisch [28] proposed a model for the multi-skill multi-project scheduling problem. Kazemipoor et al. [39] formulated the multi-skill project portfolio scheduling problem as a Goal Programming (GP) model. For each task, an infinite set of modes has been considered. Duration of projects is minimized via a Differential Evolution (DE) algorithm. To find appropriate solutions to the MSRCPSP, Liu and Wang [50] utilized the constraint programming method along with multiple heuristics. Mehmanchi and Shadrokh [56] examined the learning and forgetting phenomena on the competency of multi-skill manpower. Tabrizi et al. [65] developed a bi-stage approach based on genetic procedures and path relinking methods to optimize the Net Present Value (NPV). Correia and Saldanha-da-Gama [13] focused on the cost perspective of the MSRCPSP. Montoya et al. [57] examined utilizing a column generation approach in the Branch and Price (B&P) method. Myszkowski et al. [58] embedded priority-rule based methods in the ACO. A Teaching-Learning-Based Optimization (TLBO) algorithm was developed by Zheng et al. [74] to schedule activities in a multi-skill environment. Javanmard et al. [36] combined the MSRCPSP and the Resource Investment Problem (RIP) and developed a model for it. A genetic-based algorithm and a particle-swarm-based (PSO) algorithm were developed to solve large-scale instances. Maghsoudlou et al. [51] formulated a multi-mode based model and suggested a Multi-Objective Invasive Weeds Optimization (MOIWO) algorithm. For the multi-skill RCPSP, Maghsoudlou et al. [52] suggested multiple multi-objective cuckoo-search-based approaches. Chen et al. [11] focused on the impacts of learning and forgetting phenomena on the competency of manpower in a multi-project MSRCPSP. The step-deteriorating phenomenon has been embedded in the MSRCPSP by Dai et al. [17], and a Tabu Search method with four neighborhood structures and two mutation operators was presented to solve it. Myszkowski et al. [59] merged the Differential Evolution (DE) method and a greedy algorithm to detect feasible solutions of the model. Wang and Zheng [69] proposed a Fruit-fly Optimization Algorithm (FOA) that applies the TOPSIS method during the optimization. Zhu et al. [76] developed a Discrete Oppositional Multi-Verse Optimization (DOMVO) method for the MSRCPSP. The researchers used the path relinking method to model the black-white phase in their proposed algorithm. To enhance the quality of outputs, they utilized the opposition-based learning (OBL) approach as well. Moreover, a repairing procedure has been devised to produce feasible solutions. Laszczyk and Myszkowski [44] developed the NSGA-II with a new selection operator. They used a clone prevention approach to acquire more diverse Pareto fronts. Lin et al. [49] proposed a hyper-heuristic based on genetic programming for solving test problems of iMOPSE. Hosseinian et al. [35] utilized the Linear Threshold Model (LTM), which is usually used in the Influence Maximization (IM) problem, to model the learning phenomenon of workers in the MSRCPSP.
An improved version of the NSGA-II was suggested to optimize the make-span and total costs of projects. Hosseinian and Baradaran [31] found communities of workforces that can appropriately cooperate with each other by maximizing modularity. They used a greedy algorithm to find the communities, while a Dandelion Algorithm (DA) was developed to solve the MSRCPSP. Hosseinian and Baradaran [32] focused on the multi-mode MSRCPSP and developed a genetic algorithm for it. For this algorithm, two new procedures have been devised to find better solutions. Furthermore, the VIKOR method has been embedded in the GA for selecting candidate solutions in order to generate new offspring. Hosseinian and Baradaran [33] proposed two new algorithms, namely the Pareto-based Grey Wolf Optimizer (P-GWO) and the Multi-Objective Fibonacci-based Algorithm (MOFA), for the MSRCPSP with deterioration effect and financial limitations. They used Data Envelopment Analysis (DEA) in the P-GWO to update the archive of non-dominated solutions. In another study, Hosseinian and Baradaran [34] studied the MSRCPSP with Generalized Precedence Relations (GPR). In their proposed formulation, the learning phenomenon has been considered for the workforces, which means that they can become more efficient by repeating their skills. The researchers modified the Pareto Archived Evolution Strategy (PAES) to solve this problem. Dai et al. [18] investigated the MSRCPSP with step deterioration and proposed a Variable Neighborhood Search (VNS) method for it. Cai et al. [7] studied the MSRCPSP with transfer times and uncertain skills. A robust genetic algorithm was developed for the problem. Dang Quoc et al. [14,15] developed an algorithm known as the CSM (inspired by the Cuckoo Search method) and a Differential Evolution Method (DEM) for the MSRCPSP. In another study, Dang Quoc et al. [16] offered another version of the cuckoo search algorithm called the R-CSM for the Real-RCPSP. Tian et al. [67] proposed a resource-leveling operator along with a schedule-compress operator for the NTGA and MOFOA methods to improve their solutions. The former operator levels the workload of employed resources, while the latter operator tries to omit idle times of resources. Table 2 summarizes the studies on the multi-skill RCPSP.

Table 2. Summary of studies on the multi-skill RCPSP.

Previous studies on the multi-agent systems
Multi-agent methods have been widely used to solve complex problems that are intractable for other methods. Brandolese et al. [5] proposed a multi-agent paradigm to allocate production capacity to multiple requirements. Yan et al. [71] utilized the MAS to schedule activities and eliminate resource conflicts through message passing and negotiation among agents. They introduced mobile agents so as to reduce the communication cost and to increase the communication speed. Knotts et al. [41] developed eight agent-based algorithms to solve the multi-mode RCPSP. Each algorithm uses a priority rule to control the access of agents to resources. Böcker et al. [6] developed a multi-agent based scheduling model for a railway transportation system. Lee et al. [45] developed an MAS for short-term scheduling of resources that are shared by several projects. The researchers developed a market mechanism called precedence cost tâtonnement (P-TâTO) for resource scheduling. The P-TâTO was also used to find precedence conflict-free schedules. In another study, Knotts and Dror [40] investigated the implementation of agent technology for large-scale multi-mode resource-constrained project scheduling problems. They introduced reactive and deliberative agents. These agents use different procedures to select execution modes of activities. A multi-agent system based on a general equilibrium market mechanism was designed by He et al. [27] to solve large-scale instances of the RCPSP. Homberger [29] integrated a Restart Evolution Strategy (RES) with a multi-agent system for the decentralized Resource-Constrained Multi-Project Scheduling Problem (RCMPSP). Confessore et al. [12] proposed a market-based multi-agent system for the RCMPSP. In this study, each project represents an agent. They used a market-based method to resolve conflicts between projects and respective agents. This method is an iterative combinatorial auction process.
A multi-agent system was proposed by Chen and Wang [9] for dynamic scheduling of a project. Adhau et al. [1] developed a multi-agent system based on an auction-based negotiation approach. This system aims at resolving resource conflicts as well as allocating different resources to multiple competing projects. Tao et al. [66] developed a Quantum Multi-Agent Evolutionary Algorithm (QMAEA) for multi-objective combinatorial optimization problems in large-scale service-oriented distributed simulation systems. Zheng and Wang [73] proposed a Multi-Agent Optimization Algorithm (MAOA) for solving the RCPSP. In the MAOA, the agents cooperate in a grouped environment. The agents evolve by means of social behavior, autonomous behavior, self-learning, and adjustment of environment. Martin et al. [54] developed a multi-agent-based distributed framework for tackling different problem domains. In their multi-agent system, each agent represents a different combination of metaheuristic and local search algorithms. They evaluated their proposed framework on permutation flow-shop scheduling problem and capacitated vehicle routing problem. Han et al. [24] proposed a multi-agent system for offshore project scheduling. The multi-agent system was designed to facilitate the integration of offshore project scheduling. Fu et al. [20] addressed a two-agent stochastic flow shop deteriorating problem to minimize the make-span and total tardiness. Hosseinian and Baradaran [30] developed a multi-objective multi-agent optimization algorithm to optimize modularity and community score in the community detection problem. Their proposed algorithm uses the Weighted Sum Method (WSM) for finding the best and leader agents. In the previous studies on the multi-agent systems, the agents represent different concepts such as solutions, algorithms, activities, resources, etc. Table 3 shows the concepts represented by agents in previous studies. 
Table 3 also indicates whether multi-criteria decision making techniques have been used in multi-agent systems. "MAS-WM" implies that the proposed multi-agent system uses a Multi-Attribute Decision Making (MADM) technique, while "MAS-WOM" indicates that MADM techniques have not been utilized in the proposed MAS.

Significance of this research
Based on the previous studies reviewed in this section, there is a research gap for the multi-skill RCPSP with transfer times. Hence, in this paper, a bi-objective mathematical formulation is proposed for the multi-skill RCPSP with transfer times (MSRCPSP-TT). The objectives of the proposed model are minimization of the make-span and total cost of the project. Moreover, it can be inferred from Table 2 that evaluating the performance of a multi-agent optimization algorithm can be an interesting topic to investigate. Thus, we develop a multi-objective multi-agent optimization algorithm (MOMAOA) to solve the MSRCPSP-TT and evaluate its effectiveness. Besides, it can be concluded from Table 3 that the application of MADM techniques in multi-agent systems deserves more attention. Hence, we used the TOPSIS method in various procedures of the MOMAOA to investigate the effect of an MADM technique in a multi-agent system.

Table 3. Characteristics of multi-agent systems proposed in previous studies.
Columns: Authors; Objective (Single, Multi); Agent (Activity, Algorithm, Resource, Solution, Other); MADM technique (MAS-WM, MAS-WOM); Field.

Authors         Field
[5]             Capacity allocation problem
[71]            Project scheduling problem
[41]            Project scheduling problem
[6]             Train coupling and sharing problem
[45]            Resource scheduling problem
[40]            Project scheduling problem
[27]            Project scheduling problem
[29]            Project scheduling problem
[12]            Project scheduling problem
[9]             Project scheduling problem
[1]             Project scheduling problem
[66]            Multi-objective optimization problems
[73]            Project scheduling problem
[54]            Flow-shop scheduling problem and capacitated vehicle routing problem
[24]            Project scheduling problem
[20]            Flow-shop scheduling problem
[30]            Community detection problem
This research   Project scheduling problem

Problem description and mathematical formulation
This paper studies the multi-skill resource-constrained project scheduling problem with transfer times (MSRCPSP-TT). The assumptions of the proposed problem are as follows:
-Let G(J, E) be an activity-on-node (AON) network that depicts the structure of the project. J is a set of interrelated and non-preemptive activities and E is a set of edges representing Finish-to-Start (FS) precedence relations among activities with zero time lags. The precedence relations specify which activities must be completed before other activities can be started.
-Activities have known and predefined durations.
-Activities have only one execution mode.
-To accomplish a project, a set of multi-skill and unrelated workers is required. Each worker is able to perform a subset of skills from the skill pool (e.g. electrician, machinist, analyst, tester, etc.).
-Workers have different use-costs.
-The expenditures of a project are bound by a limited budget.
-The workers are assigned to activities based on their required skills. For each skill of a worker, there is a certain familiarity level. Worker s is capable of performing activity j if and only if worker s possesses the required skill and his/her familiarity level is not less than the standard level [69].
-Each worker is allowed to process at most one activity at a time.
-Workers have to be transferred between the execution sites of activities to perform the required skills.
-A planning time horizon of discrete time periods has been considered to schedule activities. The project includes a dummy start activity 0 and a dummy finish activity N + 1, which mark the start time and finish time of the project, respectively. These dummy activities have no duration and they need no workers.

Consider a project comprising seven non-dummy activities to be performed by five workers. The project requires three skills. There are three familiarity levels for each skill. The activities "0" and "8" are the dummy start and finish activities, respectively. The AON network of the project is illustrated in Figure 1. Each activity is represented as a node. The nodes are weighted with processing times. Table 4 details the skills required by the activities, together with the standard level of each skill. Table 5 shows the skills that can be performed by each worker, together with the familiarity levels of the workers.

Table 4. Required skill and standard level of each activity.
Activity   Required skill   Standard level
1          3                2
2          2                1
3          2                3
4          1                2
5          3                3
6          2                3
7          1                3
Based on the information provided in Tables 4 and 5, a skill matrix $SK = [sk_{js}]_{7 \times 5}$ $(j = 1, \ldots, 7;\ s = 1, \ldots, 5)$ is created that shows which workers can be assigned to each activity. Table 6 illustrates the skill matrix SK for the example. According to Table 6, the workers 1, 2, and 3 are eligible to perform activity "1". The workers 3, 4, and 5 can be assigned to activity "2". The worker 5 is the only eligible worker to execute activity "3". The workers 1, 4, and 5 can be allocated to activity "4", while the second worker is the only qualified human resource to perform activity "5". The worker 5 can be assigned to activity "6" and the workers 4 and 5 can accomplish activity "7".
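The eligibility rule behind the skill matrix can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the activity requirements and the worker familiarity levels are illustrative assumptions, chosen only so that the resulting matrix matches the eligibility pattern described for Table 6, and the names `required`, `levels`, and `skill_matrix` are hypothetical.

```python
# (required skill, standard level) per activity -- illustrative assumption
required = {1: (3, 2), 2: (2, 1), 3: (2, 3), 4: (1, 2),
            5: (3, 3), 6: (2, 3), 7: (1, 3)}

# familiarity level of each worker per skill -- hypothetical values
levels = {1: {1: 2, 3: 2},
          2: {3: 3},
          3: {2: 1, 3: 2},
          4: {1: 3, 2: 2},
          5: {1: 3, 2: 3}}


def skill_matrix(required, levels):
    """Return {activity: [sk_j1, ..., sk_jS]} with sk_js = 1 if worker s is eligible.

    Worker s is eligible for activity j if it has the required skill and its
    familiarity level is not less than the standard level.
    """
    workers = sorted(levels)
    return {
        j: [1 if levels[s].get(skill, 0) >= standard else 0 for s in workers]
        for j, (skill, standard) in sorted(required.items())
    }


SK = skill_matrix(required, levels)
for j, row in SK.items():
    print(j, row)
```

A missing skill is treated as familiarity level 0, so a worker without the skill is never eligible regardless of the standard level.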
To transfer worker s from the operation site of activity j to the operation site of activity j', a transfer time denoted as $\tau_{jj's}$ is needed. The triangular inequality is satisfied for all transfer times ($\tau_{jj's} \le \tau_{jj''s} + \tau_{j''j's}$). For the project described above, the transfer times are given by the matrix $(\tau_{jj'})$. Table 7 shows the resources assigned to activities based on the information given in Tables 4-6.

Table 5. Skills and familiarity levels of workers.

Figure 2 illustrates a feasible schedule for the example. There are $R_1$, $R_2$, and $R_3$ available workers to perform the first, the second, and the third skills, respectively. The make-span of the project is equal to 8 time periods, which has been obtained with respect to precedence relations and resource limitations. The arrows in Figure 2 indicate transfers of workers. The worker 4 has been assigned to activities "2" and "7" and has to be transferred to the operation site of activity "7" after the completion of activity "2". Besides, the worker 5 has been assigned to activities "3" and "6". This worker needs to be transferred from the operation site of activity "3" to the execution site of activity "6". As shown in Figure 2, the transfer times have delayed the completion of the project by two periods.
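The triangular inequality assumed for the transfer times can be checked mechanically for a given worker. The sketch below is only an illustration: the function name `satisfies_triangle_inequality` and both 3 × 3 matrices are hypothetical and are not the transfer times of the example project.

```python
from itertools import permutations


def satisfies_triangle_inequality(tau):
    """Check tau[j][k] <= tau[j][m] + tau[m][k] for all distinct sites j, k, m.

    tau is a square matrix of transfer times of one worker between sites.
    """
    n = len(tau)
    return all(
        tau[j][k] <= tau[j][m] + tau[m][k]
        for j, k, m in permutations(range(n), 3)
    )


# Hypothetical transfer-time matrices (time periods between three sites).
tau_ok = [[0, 1, 2],
          [1, 0, 1],
          [2, 1, 0]]
tau_bad = [[0, 1, 5],   # going 0 -> 2 directly takes longer than via site 1
           [1, 0, 1],
           [5, 1, 0]]

print(satisfies_triangle_inequality(tau_ok))
print(satisfies_triangle_inequality(tau_bad))
```

When the inequality holds, a detour through an intermediate site never shortens a transfer, so checking a worker's activities pairwise is enough to detect transfer conflicts.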
The objectives of the MSRCPSP-TT are minimization of the make-span and total cost of the project, simultaneously. In the following, Section 3.1 describes the notations of sets, parameters, and decision variables used in the proposed model. Section 3.2 presents the mathematical formulation, and Section 3.3 describes the presented model.

Notations
The following notations are defined to formulate the MSRCPSP-TT:

The fixed unit salary of worker s
$\gamma_{jk}$: The required standard level of skill k for activity j
$L_{sk}$: The level at which worker s masters skill k
$\tau_{jj's}$: The time required to transfer worker s from the execution site of activity j to the execution site of activity j'
$\beta$: The amount of budget considered for the whole project
$\vartheta_{sk}$: Equals 1 if worker s has skill k, otherwise it equals 0

Variables
$C_{js}$: The cost of performing activity j by worker s
$\theta_j$: The processing time required by activity j considering transfer time
$FT_j$: Finish time of activity j
$Z_1$: Make-span of the project
$Z_2$: Total cost of the project
$X_{jt}$: Equals 1 if activity j starts at the beginning of period t, otherwise it equals 0
$\lambda_{jj's}$: Equals 1 if worker s is transferred from the execution site of activity j to the execution site of activity j', otherwise it equals 0
$Y_{js}$: Equals 1 if worker s is assigned to activity j, otherwise it equals 0
$\omega_{jk}$: Equals 1 if activity j requires skill k, otherwise it equals 0
$\eta_{jst}$: Equals 1 if worker s is performing activity j in period t, otherwise it equals 0

Mathematical formulation
Subject to:

Model description
The objective functions (3.1) and (3.2) minimize the make-span and total cost of the project, respectively. Constraint (3.3) ensures that each activity starts exactly once; since preemption is not allowed, an activity cannot be started more than once. If a worker is supposed to perform activity j at site "B" when he/she has just completed another task at site "A", he/she must be transferred to site "B" in order to carry out activity j. Therefore, this transfer time must be included in the overall duration required by activity j. Equation (3.4) calculates the processing time required by activity j considering the resource transfer time. When two activities are linked by a precedence relation, the successor must be completed after all its predecessors. In other words, the finish time of a successor cannot be smaller than the finish times of its predecessors. Constraint (3.5) enforces the precedence relations between activities. Based on the assumptions of the model, there is a standard level for performing each skill of an activity. Hence, even though a worker may have the skill required by an activity, he/she may not be proficient enough to perform that specific skill. The eligibility of a worker is determined by his/her familiarity level. Therefore, the familiarity level of the workers assigned to skill k of activity j must be equal to or greater than the standard level requested by that specific skill of activity j. Constraints (3.6) and (3.7) guarantee that each activity can only be performed by eligible workers. Workers cannot be present at two different locations at the same time; therefore, Constraint (3.8) ensures that each worker can perform only one activity in each period. The salaries of workers differ due to their familiarity levels. Thus, the cost of an activity depends on the workers assigned to it. Equation (3.9) computes the cost of performing activity j by worker s. Constraint (3.10) ensures that the transfer time of worker s is taken into consideration. Project budgets are limited in real-world scenarios; therefore, this limitation should be considered in the formulation. Constraint (3.11) ensures that the total cost of the project cannot exceed the budget considered for the whole project. Constraints (3.12) and (3.13) define the feasible scope of the decision variables.

Multi-agent system (MAS)
The agent is a well-known notion in artificial intelligence [73]. Each agent can be interpreted as a computer system existing in a particular environment. Agents receive information from the environment by means of sensors. They can take appropriate actions to independently pursue the target of the system without interventions from humans or other agents. For each agent, there is a set of possible actions, namely social behaviors, proactiveness, and responsiveness [70,73]. The agents analyze the information received from the environment and take immediate actions to influence the environment or to adapt to its changes. Social behavior enables the agents to interact with other agents or with external entities. In a multi-agent system (MAS), there is a group of independent agents that interact with one another and perform their tasks in a specific environment to accomplish predefined targets [2,37,73]. For each MAS, there are three main factors: (1) a set of available agents denoted as $A = \{A_1, A_2, \ldots, A_n\}$, (2) the environment where the agents perform their tasks and interact with each other, and (3) a set of rules that control the interactions between agents and environment [48,73]. The agents can be arranged in different organizations; for more details on the various organizations of agents, see [2]. This study considers a group organization for agents, which is illustrated in Figure 3.

Multi-agent optimization algorithm (MAOA)
For each multi-agent system, there are three major features: (1) environment, (2) behaviors of agents, and (3) interactions between agents [73]. These features are elaborated as follows:

Adjustment of environment
In the multi-agent system used in this study, each agent represents a solution. The environment is structured by the agents and their relations. This study utilizes the grouped structure introduced by Zheng and Wang [73], which consists of $G$ groups ($g = 1, \ldots, G$). Each group is constituted by $NA_g$ agents, where $NA_g$ denotes the number of agents in the $g$-th group. The agent that has the best fitness in a group is considered the "leader". The leader agents of the existing groups are compared with each other, and the group that has the best leader agent among all leader agents is known as the elite group. The second best agent in each group is known as the "active" agent. Figure 4 illustrates a leader-group organization.
The agents can explore the solution space accurately through adjusting the environment. In a multi-agent optimization algorithm (MAOA), all agents are re-grouped so as to update the environment. In this respect, the active agent of each group is substituted with the worst agent of the elite group. This adjustment will share the information among groups and it helps to improve the exploring procedure [73].
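The adjustment step can be illustrated with a short Python sketch. It rests on several assumptions that the text does not settle: agents are compared by a scalar fitness where lower is better (as in a single-objective MAOA), "substituted with" is read as an exchange between the two agents, the elite leader itself is never moved, and every group is non-empty. The function name `adjust_environment` is hypothetical.

```python
def adjust_environment(groups, fitness):
    """Exchange each group's active agent with the worst agent of the elite group.

    groups  : list of non-empty lists of agents
    fitness : callable returning a scalar, lower is better (assumption)
    """
    # Sort every group so that index 0 is the leader and index 1 the active agent.
    ranked = [sorted(group, key=fitness) for group in groups]
    # The elite group is the one whose leader is best among all leaders.
    elite = min(range(len(ranked)), key=lambda g: fitness(ranked[g][0]))
    elite_group = ranked[elite]
    for g, members in enumerate(ranked):
        if g == elite or len(members) < 2 or len(elite_group) < 2:
            continue
        # Worst non-leader agent currently in the elite group.
        worst = max(range(1, len(elite_group)),
                    key=lambda i: fitness(elite_group[i]))
        members[1], elite_group[worst] = elite_group[worst], members[1]
    return ranked


# Toy usage: agents are plain numbers and the fitness is the number itself.
new_groups = adjust_environment([[1, 5, 9], [2, 3, 8]], fitness=lambda a: a)
print(new_groups)
```

In this toy run the elite group trades its worst agent (9) for the active agent (3) of the other group, which is how information migrates between groups under the exchange reading.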

Social, autonomous, and self-learning behaviors of agents:
The MAOA has two sorts of social behaviors, namely local social behavior and global social behavior, which are explained as follows [73]:
-Local social behavior, which indicates the collaborative interaction within a group. The interaction between the leader agent of a group and the other agents in that specific group is defined as the local social behavior. This type of social behavior helps to exploit the neighborhoods of existing agents in a group. The local social behavior of agents is shown in Figure 5a.
-Global social behavior, which is the collaborative interaction in the entire environment. For this type of social behavior, the leader agent of the elite group works with the leader agents of all groups. This type of cooperation leads to profound exploration of the entire solution space. Figure 5b shows the global social behavior of agents.

Figure 4. Leader-group organization of agents [73].

For each agent, there is another behavior called autonomy. Based on this behavior, each agent can act independently without external interference. Due to this behavior, each agent exploits its neighborhood in a random manner to find better solutions [73]. Self-learning is another behavior considered for each agent in a MAS. Since agents receive information throughout the solving process, they can improve themselves by learning from the obtained knowledge [75]. The structure of the classical MAOA is depicted in Figure 6.

Multi-Objective Multi-Agent Optimization Algorithm (MOMAOA)
Based on the social, autonomous, and self-learning behaviors of a multi-agent system, and because the problem tackled in this study is a multi-objective optimization problem, we propose a multi-objective multi-agent optimization algorithm (MOMAOA). In the MOMAOA, the environment is initially formed by dispersing agents into multiple groups. The agents evolve via social, autonomous, and self-learning behaviors. The social behavior is considered as the global exploration, while the autonomous and self-learning behaviors are considered as local exploitation. To adjust the environment, the agents are transferred among groups to deepen the exploration process. Features of the MOMAOA for solving the MSRCPSP-TT are described in the following sections.

Solution representation and decoding scheme
In this paper, each agent denotes a solution of the MSRCPSP-TT. Each solution is represented as a $(2 \times N)$ matrix as shown in Figure 7, where $N$ is the number of project activities. The first row of each solution is a precedence-feasible activity list. Each activity $j_a$ $(a = 1, 2, \ldots, N)$ should be positioned on the list after all its predecessors [25]. The second row is a resource list which indicates the resources assigned to each activity. $\pi_a$ $(a = 1, 2, \ldots, N)$ indicates the worker assigned to the activity $j_a$.
Having produced the agents (solutions), they are randomly dispersed into G groups. Each group comprises GS agents (GS denotes the group size). Hence, the population size is equal to (G × GS). The decoding procedure determines the start times of activities according to the sequence of the activity list and the resource assignment plan. In this study, we apply the serial schedule generation scheme (S-SGS) to construct schedules for the MSRCPSP-TT. The S-SGS is an iterative procedure that consecutively adds an activity to a schedule until a feasible schedule is achieved. In each iteration, the first unscheduled activity on the precedence-feasible activity list is selected and its earliest possible start time is determined. This process continues until no unscheduled activity is left [38].
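The decoding step can be sketched in Python as a serial scheme that schedules each activity, in list order, at its earliest start that respects precedence relations, worker availability, and transfer times. This is a minimal sketch under stated assumptions rather than the authors' decoder: one worker per activity (as in the resource list above), a worker's first activity needs no transfer, dummy activities are omitted, and transfer times satisfy the triangular inequality. The names `decode`, `worker_is_free`, and `tau`, as well as the three-activity instance, are hypothetical.

```python
def worker_is_free(slots, j, t, d, s, tau):
    """True if worker s can process activity j in [t, t + d) given its other jobs."""
    for begin, end, i in slots:
        after_i = end + tau(s, i, j) <= t          # finish i, travel to j, start j
        before_i = t + d + tau(s, j, i) <= begin   # finish j, travel to i in time
        if not (after_i or before_i):
            return False
    return True


def decode(activity_list, worker_of, duration, preds, tau):
    """Serially schedule activities in list order at their earliest feasible start."""
    start, finish, busy = {}, {}, {}
    for j in activity_list:
        s = worker_of[j]
        # The list is precedence-feasible, so all predecessors are already scheduled.
        t = max((finish[p] for p in preds.get(j, ())), default=0)
        slots = busy.setdefault(s, [])
        while not worker_is_free(slots, j, t, duration[j], s, tau):
            t += 1  # discrete time periods
        start[j], finish[j] = t, t + duration[j]
        slots.append((start[j], finish[j], j))
    return start, finish


# Hypothetical instance: 3 activities, 2 workers, one period between distinct sites.
activity_list = [1, 2, 3]
worker_of = {1: "A", 2: "B", 3: "A"}
duration = {1: 2, 2: 3, 3: 2}
preds = {3: [1]}
tau = lambda s, i, j: 0 if i == j else 1

start, finish = decode(activity_list, worker_of, duration, preds, tau)
makespan = max(finish.values())
print(start, finish, makespan)
```

In this toy instance, worker "A" finishes activity 1 at period 2 but needs one period to reach the site of activity 3, so activity 3 starts at period 3 instead of 2, which mirrors how transfer times lengthen the make-span in the example of Figure 2.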

Procedure of finding the leader agent in each group
To find the leader agent in each group, LA_g (g = 1, . . . , G), we first apply the non-dominated sorting method of the NSGA-II proposed by Deb et al. [19] to determine the non-dominated agents (solutions). This mechanism can be embedded in most multi-objective evolutionary algorithms to approximate the Pareto front. If there is a single non-dominated agent among all agents in a group, this agent is considered as the leader agent of the group. However, if there are multiple non-dominated agents in a group, we use the TOPSIS method, a multi-attribute decision making technique, to rank these agents. The concept of the TOPSIS method is that the selected alternative should have the least distance from the positive ideal solution and the greatest distance from the negative ideal solution [8]. To use the TOPSIS method, a decision matrix is created. The rows and columns of this matrix represent the non-dominated agents and the criteria, respectively. The criteria are the make-span and the total cost of the project. Both are negative criteria, i.e., smaller values are preferred, and they are considered equally important. The procedure of the TOPSIS method is illustrated in Figure 8. The procedure of finding the leader agent of each group is depicted in Figure 9.
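The leader selection described above can be sketched as follows. This is a minimal illustration, assuming that each agent is summarized by its objective vector (make-span, total cost), that the decision matrix is vector-normalized, and that both criteria receive a weight of 0.5. The function names are illustrative, and the normalization choice is an assumption rather than a detail confirmed by the text.

```python
import math


def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def topsis_best(points, weights=(0.5, 0.5)):
    """Index of the point with the highest closeness; all criteria are costs."""
    m = len(points[0])
    norms = [math.sqrt(sum(p[k] ** 2 for p in points)) or 1.0 for k in range(m)]
    v = [[weights[k] * p[k] / norms[k] for k in range(m)] for p in points]
    ideal = [min(row[k] for row in v) for k in range(m)]  # best value per cost criterion
    anti = [max(row[k] for row in v) for k in range(m)]   # worst value per cost criterion
    scores = []
    for row in v:
        d_pos, d_neg = math.dist(row, ideal), math.dist(row, anti)
        scores.append(d_neg / (d_pos + d_neg) if d_pos + d_neg > 0 else 1.0)
    return max(range(len(points)), key=scores.__getitem__)


def leader(group):
    """Single non-dominated agent if unique, otherwise the TOPSIS-ranked best one."""
    front = non_dominated(group)
    return front[0] if len(front) == 1 else front[topsis_best(front)]


group = [(10, 100), (12, 80), (20, 60), (15, 120)]  # hypothetical (make-span, cost) pairs
best = leader(group)
```

Here (15, 120) is dominated and drops out; among the three remaining agents, TOPSIS favors the compromise solution over the two extremes.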

Procedure of finding the global best leader agent
To find the global best leader agent, the non-dominated sorting method is applied once again to determine the non-dominated agents among the leader agents of all groups. If there is a single non-dominated agent among all leader agents, this agent is chosen as the global best leader agent. Otherwise, the TOPSIS method described in Section 4.3.2 is used to rank the leader agents so as to find the best one. Figure 10 shows the procedure of finding the global best leader agent.

Social behavior in the MOMAOA
A crossover operator is applied in the MOMAOA so that the offspring agents can inherit characteristics of both parents. This crossover operator is used as the global social behavior to perform the interaction between the global best leader agent and the leader agent of each group. The best offspring agent will take the place of the corresponding leader agent if any of the following conditions is met: (1) The best offspring agent dominates the corresponding leader agent.
Since the solution representation used in this paper consists of two parts, the crossover operator produces offspring agents in two phases. The first phase generates a feasible activity list, while the second phase generates a feasible resource list. To generate feasible activity lists, the Magnet-Based Crossover Operator (MBCO) introduced by Zamani [72] is used. To determine the workers assigned to activities on offspring agents, a simple procedure is used. In this procedure, an integer random number (Rand) is drawn from {1, 2} for each activity. If Rand is equal to 1 for activity j, the worker assigned to activity j on the global best agent is allocated to this activity on the offspring agent. If Rand is equal to 2 for activity j, the worker assigned to activity j on the leader agent is allocated to this activity on the offspring agent. Figure 11 shows the procedure of generating resource lists for the offspring agents.
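The second phase of the crossover can be sketched in a few lines. The sketch assumes that a resource list is stored as a mapping from activity to worker, which is a convenience chosen here rather than the matrix layout of Figure 7; the function name is hypothetical, and the MBCO step for the activity list is not reproduced.

```python
import random


def crossover_resources(global_best, leader, rng=random):
    """Per activity, inherit the worker of the global best (Rand = 1) or of the leader (Rand = 2)."""
    child = {}
    for activity in global_best:
        rand = rng.randint(1, 2)  # integer drawn uniformly from {1, 2}
        child[activity] = global_best[activity] if rand == 1 else leader[activity]
    return child


# Hypothetical resource lists for a four-activity project.
gb = {1: "w1", 2: "w2", 3: "w1", 4: "w3"}
ld = {1: "w2", 2: "w2", 3: "w3", 4: "w1"}
child = crossover_resources(gb, ld, random.Random(1))
```

Because every inherited worker was already assigned to the same activity in one of the parents, the offspring resource list stays feasible with respect to skills whenever both parents are feasible.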
For the local social behavior, a procedure is used to enhance the quality of agents with the help of their corresponding leader agent. Since transfer times of resources increase the make-span and total cost of project, the procedure of assigning resources to activities can be improved for each agent so as to minimize both objective functions. In this respect, a random binary string (RBS) of size 1×N is generated to let the resource lists of agents inherit from the resource list of the leader agent. If RBS_j is equal to 1, the worker assigned to activity j on the leader agent will be assigned to this activity on the newly generated resource list. If RBS_j is equal to 0, the worker assigned to activity j will not change on the newly generated resource list. The proposed operator used as the local social behavior is shown in Figure 12. The whole procedure of social behavior in the MOMAOA, including global and local behaviors, is depicted in Figure 13.
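The RBS-based inheritance of the local social behavior can be sketched as below. As an assumption made for readability, resource lists are stored as activity-to-worker mappings, and the function name `learn_from_leader` is hypothetical.

```python
import random


def learn_from_leader(agent_workers, leader_workers, rng=random):
    """Build a new resource list: copy the leader's worker where RBS_j = 1, keep the agent's otherwise."""
    rbs = {j: rng.randint(0, 1) for j in agent_workers}  # random binary string of size 1 x N
    new_workers = {
        j: leader_workers[j] if rbs[j] == 1 else worker
        for j, worker in agent_workers.items()
    }
    return new_workers, rbs


agent = {1: "w1", 2: "w3", 3: "w2"}    # hypothetical agent resource list
leader = {1: "w2", 2: "w3", 3: "w1"}   # hypothetical leader resource list
new_workers, rbs = learn_from_leader(agent, leader, random.Random(7))
```

The original resource list of the agent is left untouched, so the new list can be evaluated and discarded if it does not improve the agent.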

Autonomous behavior in the MOMAOA
In this study, we utilize the Permutation-Based Swap (PBS) operator proposed by Chen et al. [10] for the autonomous behavior of each agent. The PBS operator randomly chooses two adjacent activities with no precedence relations. A new activity list is generated by swapping these two activities. Given that these two activities have no precedence relations, the newly produced activity list is feasible [73]. Since each agent includes a resource list, the PBS operator needs to be extended to generate feasible resource lists as well. For this purpose, the PBS operator swaps the assigned workers of the selected activities in order to generate a feasible resource list. Considering the project depicted in Figure 1, the procedure of the PBS operator is illustrated in Figure 14.
A Forward-Backward Improvement (FBI) procedure was proposed in [46] to reduce the project completion time. The FBI procedure adjusts a solution iteratively. In each iteration, the backward and forward scheduling method is used to minimize the make-span. Similar to the multi-agent system developed in [73], the MOMAOA employs the FBI method as the self-learning behavior of the best leader agent, which enables the algorithm to deepen its exploitation phase.
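The swap step of the PBS operator can be sketched as follows. The sketch assumes that the worker entries move together with their activities, so that every activity keeps its assigned worker and the resource list remains skill-feasible; this reading of the worker swap is an interpretation, and the function name is hypothetical. The FBI procedure is not reproduced here.

```python
import random


def pbs_swap(activities, workers, preds, rng=random):
    """Swap two adjacent activities with no precedence relation, together with their workers."""
    # Position i qualifies if the activity at i is not a predecessor of the activity at i + 1.
    candidates = [
        i for i in range(len(activities) - 1)
        if activities[i] not in preds.get(activities[i + 1], ())
    ]
    new_acts, new_workers = activities[:], workers[:]
    if not candidates:
        return new_acts, new_workers  # no feasible swap exists
    i = rng.choice(candidates)
    new_acts[i], new_acts[i + 1] = new_acts[i + 1], new_acts[i]
    new_workers[i], new_workers[i + 1] = new_workers[i + 1], new_workers[i]
    return new_acts, new_workers


# Hypothetical project: activity 1 precedes activity 2; activity 3 has no predecessors.
acts, wrk = pbs_swap([1, 2, 3], ["a", "b", "c"], {2: [1]})
```

Checking only direct predecessors is sufficient for adjacent positions: in a precedence-feasible list, any indirect relation between two activities would require an intermediate activity placed between them.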

Adjustment of environment in the MOMAOA
The adjustment of environment is required to share information among existing groups. The MOMAOA adjusts its environment every fifteen iterations. Suppose that the global best agent belongs to group l. For each group g (g ≠ l), the TOPSIS method is employed to determine the active agent. If the active agent of group g (AA_g) dominates the worst agent of group l (WA), AA_g moves to group l and WA takes the position of AA_g in group g. Figure 15 illustrates the procedure of adjusting the environment.
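The exchange rule can be sketched as below, with agents reduced to their objective vectors. The selection functions `pick_active` and `pick_worst` are hypothetical placeholders: the text uses TOPSIS to find the active agent, and the way the worst agent is identified is not detailed in this section, so simple stand-ins are used in the example.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def adjust_environment(groups, best_group, pick_active, pick_worst):
    """Exchange agents between the group of the global best agent and every other group, in place."""
    target = groups[best_group]
    for g, members in enumerate(groups):
        if g == best_group:
            continue
        active = pick_active(members)  # AA_g, chosen by an assumed selection rule
        worst = pick_worst(target)     # WA of the global best agent's group
        if dominates(active, worst):
            members[members.index(active)] = worst
            target[target.index(worst)] = active
    return groups


# Hypothetical groups of (make-span, cost) pairs; group 0 holds the global best agent.
groups = [[(1, 1), (9, 9)], [(2, 2), (5, 5)], [(10, 10), (12, 12)]]
adjust_environment(groups, best_group=0, pick_active=min, pick_worst=max)
```

In this example the active agent of the second group replaces the worst agent of the first group, while the third group has no agent good enough to migrate.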

Elitism in the MOMAOA
The MOMAOA maintains an archive of non-dominated agents. In each iteration, the non-dominated offspring agents generated by social behavior, autonomous behavior, self-learning, and adjustment of environment are merged. Each offspring agent is compared to the agents in the archive. If an offspring agent dominates any of the agents in the archive, it takes the place of the dominated agent. The maximum number of iterations (MaxIt) is used as the stopping criterion for the MOMAOA. The structure of the MOMAOA is depicted in Figure 16.
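The replacement rule of the archive can be sketched as follows, with agents reduced to their objective vectors. The sketch implements only the rule stated above; how the archive is initially filled and whether mutually non-dominated offspring are also admitted are not specified here, so the function `update_archive` should be read as an illustrative assumption.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def update_archive(archive, offspring):
    """An offspring that dominates archived agents replaces all of them."""
    archive = list(archive)
    for child in offspring:
        survivors = [a for a in archive if not dominates(child, a)]
        if len(survivors) < len(archive):  # the child dominated at least one archived agent
            archive = survivors + [child]
    return archive


archive = [(10, 100), (20, 60)]            # hypothetical (make-span, cost) pairs
offspring = [(9, 90), (30, 30), (15, 70)]
new_archive = update_archive(archive, offspring)
```

Only (9, 90) enters the archive in this example, because it is the only offspring that dominates an archived agent.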

Computational study
In this section, we evaluate the performance of the MOMAOA in comparison with three state-of-the-art multi-objective evolutionary algorithms, i.e., NSGA-II, PESA-II, and MOPSO. The algorithms are coded in Matlab R2017b. The codes are run on a PC with an Intel Core 2 Quad Q8200 processor (4M Cache, 2.33 GHz, 1333 MHz FSB) and 4 GB of memory. The results obtained by the algorithms are described by project duration (hours) and project cost (currency unit).

Test problems
We use the iMOPSE dataset proposed by Myszkowski et al. [59], which has been generated based on real-world projects. The iMOPSE dataset consists of 36 project instances which have been produced based on the most general features of projects. These features include the number of activities (N), the number of available workers (S), the number of precedence relations between activities (NPR), and the number of required skills (K). Table 8 summarizes the features of the iMOPSE dataset. As shown in Table 8, there are two groups of test problems that consist of 100 and 200 activities. Test instances 1-18 are considered as small-size problems, while test instances 19-36 are considered as large-size problems. In the iMOPSE dataset, each worker masters six different skills. The scheduling complexity differs from project to project due to these features. For a detailed description of the iMOPSE dataset, see [59].

Performance measures
Since the objectives of the proposed model conflict with each other, it is challenging to evaluate a multi-objective evolutionary algorithm (MOEA). For multi-objective optimization problems, it is required to provide multiple but evenly distributed solutions to form a Pareto front. These solutions enable the decision maker to choose from different alternatives [55]. We use five well-known multi-objective metrics to evaluate the performances of the algorithms. These metrics are as follows:

-Mean ideal distance (MID)
This metric measures the closeness between the solutions of the approximation front and the ideal point [77].

-Spacing metric (SM)
This metric measures how uniformly the solutions are distributed on the approximation front. It is based on dist_i, the Euclidean distance in the phenotype space between consecutive solutions on the approximation front. The closer the value of SM is to zero, the more uniform the distribution of solutions.

-Diversification metric (DM)
This metric measures the extension of the Pareto front. A larger extension of a Pareto front indicates better diversity of results [77].

-CPU time
The computational time required by each algorithm to find an optimal or near-optimal solution is another criterion for evaluating the performance of an optimizer [23].

-Set coverage (C-metric)
Consider two Pareto fronts denoted as PF_1 and PF_2. C(PF_1, PF_2) indicates the percentage of solutions on PF_2 dominated by at least one solution of PF_1 [77]:
$$C(PF_1, PF_2) = \frac{\left|\{\, i' \in PF_2 : \exists\, i \in PF_1,\ i \succ i' \,\}\right|}{|PF_2|}$$
where i and i' are solutions on PF_1 and PF_2, respectively, i ≻ i' means that i dominates i', and |PF_2| is the number of solutions on PF_2.
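To make the metrics concrete, the sketch below implements one commonly used form of each. These forms are assumptions: the ideal point is taken as the origin, the spacing metric uses normalized absolute deviations of consecutive distances, and the diversification metric uses the diagonal of the bounding box of the front. The exact formulas of [77] may differ in normalization. The coverage function returns a fraction rather than a percentage.

```python
import math


def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def mid(front, ideal=(0.0, 0.0)):
    """Mean Euclidean distance between the front and the (assumed) ideal point."""
    return sum(math.dist(p, ideal) for p in front) / len(front)


def spacing(front):
    """Normalized mean absolute deviation of the distances between consecutive solutions."""
    pts = sorted(front)
    d = [math.dist(p, q) for p, q in zip(pts, pts[1:])]
    if not d:
        return 0.0
    mean = sum(d) / len(d)
    return sum(abs(mean - x) for x in d) / (len(d) * mean) if mean > 0 else 0.0


def diversification(front):
    """Diagonal of the bounding box spanned by the front in objective space."""
    m = len(front[0])
    return math.sqrt(sum(
        (max(p[k] for p in front) - min(p[k] for p in front)) ** 2 for k in range(m)
    ))


def coverage(pf1, pf2):
    """Fraction of solutions on pf2 dominated by at least one solution of pf1."""
    return sum(any(dominates(i, j) for i in pf1) for j in pf2) / len(pf2)


front = [(0, 4), (2, 2), (4, 0)]  # hypothetical evenly spaced front
```

For the evenly spaced front in the example, the spacing value is zero, which matches the interpretation that values close to zero indicate a uniform distribution.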

Calibrating parameters of algorithms
Proper adjustment of parameters accelerates the convergence of algorithms and enhances the quality of solutions. In this study, we use the Response Surface Methodology (RSM) as an effective statistical approach to detect promising parameter values. The aim of the RSM is to optimize a response (output variable) which is influenced by several independent input variables (factors). Lower and upper levels of each parameter are determined in the initial step. Then, optimal levels of parameters are obtained via the RSM. Equation (3.8) formulates the generalized model of the RSM [62]:
$$y = f(\delta_1, \delta_2, \ldots, \delta_n) + \varepsilon \qquad (3.8)$$
where y represents a response variable and n denotes the number of independent input variables (δ_1, . . . , δ_n). ε represents an error, while f is a response function. To characterize the response surface, the RSM detects minimum and maximum points; therefore, the region of optimal response is obtained. In this research, we have used the Box-Behnken design (BBD), one of the renowned response surface designs, which is often used to fit full quadratic models [62]. It requires only three levels to run an experiment. Three levels (−1), (0), and (+1) have been considered to indicate low, zero, and high levels of variables, respectively [53]. The most effective factors of the MOMAOA are reported in Table 9. This study uses the response variable (y) introduced by Rahmati et al. [63], called the Multi-Objective Coefficient of Variation (MOCV), for the Pareto-based algorithms. The RSM is conducted on large-size test problems with 200 activities. Each combination of levels obtained by the Box-Behnken design is run five times. To compute the MOCV as the response variable for each experiment, the results are turned into the Relative Percentage Difference (RPD) [21]. Then, the MOCV is computed for all experiments. The final tuned values of the MOMAOA are G = 4, NA = 75, ρ = 0.90, and MaxIt = 300.
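For reference, the full quadratic model mentioned above usually takes the standard second-order form shown below. The coefficients β_0, β_i, β_ii, and β_ij are notation introduced here for illustration and are estimated by regression on the experimental runs; the fitted model of this study is not reproduced.

```latex
y = \beta_0
  + \sum_{i=1}^{n} \beta_i \delta_i
  + \sum_{i=1}^{n} \beta_{ii} \delta_i^{2}
  + \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \beta_{ij} \delta_i \delta_j
  + \varepsilon
```

The linear, squared, and interaction terms are the reason why three levels per factor, as provided by the BBD, are enough to estimate the curvature of the response surface.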

Comparative analysis
In this section, we compare the performances of the algorithms in solving the test problems. Table 10 reports the average values of the MID, SM, DM, and CPU time obtained over 10 runs of each algorithm for each test problem. Based on the MID metric, the MOMAOA clearly outperformed the other methods. This means that the MOMAOA had better convergence in comparison with the NSGA-II, PESA-II, and MOPSO. In terms of the SM metric, the proposed method also outperformed the other algorithms. This implies that the MOMAOA succeeded in finding more uniformly distributed solutions. The outputs of the algorithms in terms of the DM metric show that the solution set found by the MOMAOA covers a wider space compared with the NSGA-II, PESA-II, and MOPSO. It can be inferred from Table 10 that the MOPSO has the best performance in terms of CPU time. Further investigation of Table 10 reveals that the values of the performance measures increase as the size of the problems increases. Table 11 shows the standard deviations of the values acquired by the MOMAOA, NSGA-II, PESA-II, and MOPSO. As shown in this table, the MOMAOA obtained more consistent outputs than the NSGA-II, PESA-II, and MOPSO.
To examine whether the performances of the algorithms are significantly different, the four algorithms are statistically compared via the Kruskal-Wallis test using the hypotheses stated below. Matlab R2017b is used to conduct the Kruskal-Wallis test at a 95% confidence level. The null hypothesis (H_0) is rejected in favor of the alternative hypothesis (H_1) if the P-Value is less than or equal to 5%.
H_0: There is no significant difference between the algorithms in terms of a performance measure.
H_1: There is a significant difference between the algorithms in terms of a performance measure. (5.7)
To keep this paper as succinct as possible, the outputs of the Kruskal-Wallis tests are reported in Tables A.1-A.4 of Appendix A. The conclusions that can be drawn from these tests are summarized as follows. The results in Table A.1 indicate that there are significant differences between the four algorithms in terms of the MID metric (P-Value = 0.0173 < 0.05). Based on the outputs for the SM metric in Table A.2, the performances of the algorithms are not significantly different at the 95% confidence level (P-Value = 0.9679 > 0.05). Table A.3 shows that the algorithms do not perform statistically equally in terms of the DM metric (P-Value = 0.0437 < 0.05). Finally, the outputs in Table A.4 imply that the performances of the algorithms are not statistically different in terms of CPU time (P-Value = 0.8361 > 0.05).
Figure 17 presents interval plots to clarify the statistical results. The upper left plot in Figure 17 indicates that, in terms of the MID metric, the MOMAOA is superior to the other methods. The MOPSO is ranked second, while the PESA-II takes the third place. The upper right plot in Figure 17 shows the interval plots in terms of the SM metric. Based on this plot, the MOMAOA is ranked the best, the PESA-II is the second, and the MOPSO is the third. The lower left plot in Figure 17 shows that, in terms of the DM metric, the MOMAOA is the best ranked method, the MOPSO is the second, and the NSGA-II is the third. The lower right plot in Figure 17 implies that the MOPSO is the fastest method, the MOMAOA takes the second place, and the NSGA-II is the slowest algorithm.
Figure 17. Comparison of algorithms on the MID, SM, DM, and CPU time.
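The decision rule of the Kruskal-Wallis test can be illustrated with a short standard-library sketch. It computes the H statistic with average ranks for ties and without the tie correction factor, and compares it with the chi-square critical value for three degrees of freedom (four algorithms) at the 5% level. The sample values are invented for illustration, and this is not the Matlab routine used in the study.

```python
CHI2_CRIT_DF3 = 7.815  # chi-square critical value, 3 degrees of freedom, alpha = 0.05


def kruskal_wallis_h(*samples):
    """Kruskal-Wallis H statistic with average ranks for ties (no tie correction)."""
    pooled = sorted((x, g) for g, s in enumerate(samples) for x in s)
    n = len(pooled)
    rank_sums = [0.0] * len(samples)
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for _, g in pooled[i:j]:
            rank_sums[g] += avg_rank
        i = j
    weighted = sum(r * r / len(s) for r, s in zip(rank_sums, samples))
    return 12 / (n * (n + 1)) * weighted - 3 * (n + 1)


# Invented metric values for four algorithms, two runs each.
h = kruskal_wallis_h([1, 2], [3, 4], [5, 6], [7, 8])
reject_h0 = h > CHI2_CRIT_DF3
```

With samples this small the chi-square approximation is rough, so the sketch only illustrates how H is compared with the critical value; a statistical package should be preferred for reported results.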
Table 12 reports the comparisons between algorithms in terms of set coverage metric (C-metric). Figure 18 shows the boxplots of C-metric values for all algorithms. The lower and the upper ends of each box imply the first and the third quartiles, respectively. The line in each box indicates the median.
It can be inferred from Table 12 and Figure 18 that the MOMAOA obtained larger C (MOMAOA, NSGA-II), C (MOMAOA, PESA-II), and C (MOMAOA, MOPSO) values for most of the test problems. This means that the Pareto solutions obtained by the MOMAOA dominate a larger share of the solutions obtained by the NSGA-II, PESA-II, and MOPSO. The statistical analysis has been conducted on the C-metric values as well; hence, the non-parametric Kruskal-Wallis test is again applied at a 95% confidence level. The null hypothesis assumes that there is no significant difference between the performances of the algorithms. If P-Value < 0.05, the null hypothesis is rejected. Tables 13-15 report the results of the Kruskal-Wallis tests. The outputs of the Kruskal-Wallis tests show significant differences between the algorithms in terms of the C-metric. Table 16 reports the average objective function values obtained by solving the test problems of the iMOPSE dataset over 10 runs of each algorithm. Since the algorithms have probabilistic features, the average values are reported to evaluate the overall performance of the optimizers. As shown in Table 16, the MOMAOA clearly outperformed the other methods in terms of both make-span and total cost of project.
Figure 18. Boxplots of C-metric values.

Impact of resource transfer times on objective function values
To examine the impact of transfer times on make-span and total cost of project, all algorithms were used to solve the problems with and without consideration of transfer times. The best values of the algorithms were averaged, and the results of both cases are shown in Figure 19. According to this figure, transfer times have a remarkable impact on both objectives. The Kruskal-Wallis test has been used to offer a statistical analysis of the effect of transfer times on the objectives. From a statistical perspective, Tables A.5 and A.6 (Appendix A) indicate that transfer times can significantly increase both objectives.

To evaluate the proposed algorithm and to validate the obtained results, three meta-heuristics called the non-dominated sorting genetic algorithm II (NSGA-II), the Pareto envelope-based selection algorithm II (PESA-II), and the multi-objective particle swarm optimization (MOPSO) method were employed to solve the iMOPSE dataset consisting of 36 test problems. The input parameters of all algorithms were tuned via the response surface methodology (RSM). The algorithms were evaluated in terms of several well-known comparison measures. In addition, the algorithms were statistically compared to each other via the Kruskal-Wallis test. Based on the computational experiments, the MOMAOA was superior to the other three algorithms in most of the evaluations. To show the effect of resource transfer times on the make-span and total cost of project, we solved the iMOPSE test problems with and without considering resource transfer times. The results show that considering resource transfer times has a significant impact on the values of both objective functions. To extend the proposed model, the resource transfer times can be considered uncertain. The multi-skill resource-constrained multi-project scheduling problem with transfer times is also a potential subject for further studies.
Moreover, newly proposed multi-objective algorithms can be applied to the MSRCPSP-TT and compared with the algorithm proposed in this research. For the iMOPSE test problems, the literature offers some novel algorithms, such as the Non-dominated Tournament Genetic Algorithm (NTGA) [44], that provide promising results compared with classical multi-objective genetic algorithms. Therefore, one direction for future studies is to compare the results of recently developed algorithms for the MSRCPSP with those of the MOMAOA proposed in this study. The MOMAOA can be developed for other complex optimization problems as well.
Appendix A.