PIPE-LINING DYNAMIC PROGRAMMING PROCESSES TO SYNCHRONIZE BOTH THE PRODUCTION AND THE CONSUMPTION OF ENERGY

Synchronizing heterogeneous processes remains a difficult issue in scheduling. Related ILP models suffer from the large gaps induced by rational relaxation. We choose here to deal with this issue by emulating the interactions which take place between the various players of such heterogeneous processes, and we propose a pipe-line decomposition of a dynamic programming process designed to schedule energy production and energy consumption.
Mathematics Subject Classification. 90-08, 90C39.
Received September 30, 2020. Accepted June 17, 2021.

1. Context and state of the art
Efficiently synchronizing heterogeneous processes remains a difficult issue when it comes to scheduling and routing (see [13,17]). It arises for instance in the management of vehicle sharing systems, of drones and trucks in the context of urban logistics, and of industrial assembly processes. Integer Linear Programming (ILP) models are flawed by the large gaps induced by the relaxation of the integrality constraint (the Big M problem). In the same way, designing ad hoc Branch and Bound schemes is difficult because of the lack of an efficient bounding scheme. Besides, synchronization requirements increase the impact of uncertainty and put robustness at stake, making the efficiency of global heuristics difficult to check. A way to address those issues is to introduce flexibility and modularity in the design of algorithms and to rely on ad hoc decomposition schemes in order to emulate the interaction mechanisms which allow distinct players to run a complex process in a decentralized way.
This synchronization issue tends to become crucial when it comes to energy management: the emergence of renewable energies (Photovoltaic, Wind, Hydrogen, ...) also means the emergence of local in situ producers which are at the same time consumers: factories, farms and even individual householders.
Due to both market deregulation and emergent technologies (see [1, 5, 6, 14]), this trend is forecast to have a major impact on Energy Economics. Key energy players are currently paying attention to it. A good example is provided by the activities of the Labex IMOBS3 program in Clermont-Fd, France, devoted to Innovative Mobility. In the context of this project, we are currently involved in the control of a micro-plant for hydrogen production, which feeds autonomous vehicles with hydrogen fuel. But, while most hydrogen production is usually performed through power-costly electrolysis processes, researchers rely here on solar power and photolysis [7,18,25], which make the


The EPC problem: mathematical formulation and decomposition scheme
We consider here a vehicle which has to perform internal logistic tasks while following a route $\Gamma$ which starts from a Depot node and ends at the same place after going through stations $j = 1, \dots, M$, in this order. The start node Depot has label $0$ and the end node Depot has label $M + 1$. The time required by the vehicle to go from $j$ to $j + 1$ is equal to $t_j$, which takes into account the time spent by the vehicle in servicing $j$. The vehicle may leave Depot at time $0$ and should be back no later than some time $TMax$. Our vehicle is powered by hydrogen fuel. The capacity of its tank is denoted by $C^{Veh}$ and we know, for any $j = 0, \dots, M$, the hydrogen amount $e_j$ required to move from station $j$ to station $j + 1$. The initial hydrogen load of the vehicle is denoted by $E_0$, and the vehicle is required to end its trip with at least the same hydrogen load. It follows that the vehicle must periodically refuel. Refueling transactions take place at a micro-plant, close to Depot: the time required by the vehicle to move from station $j$ to the micro-plant (from the micro-plant to $j$) is denoted by $d_j$ ($d^*_j$); in the same way, the energy required to move from $j$ to the micro-plant (from the micro-plant to $j$) is denoted by $\varepsilon_j$ ($\varepsilon^*_j$). Quantities $d_j$, $d^*_j$, $t_j$, $e_j$, $\varepsilon_j$, $\varepsilon^*_j$ are strictly positive, satisfy the Triangle Inequality (see Fig. 1) and are such that $E_0 \geq \varepsilon_0$. Figure 2 displays an example of a trip performed by the vehicle along stations Depot = 0, 1, 2, 3, 4, 5, 6 = Depot, while refueling between station 1 and station 2, and next between station 3 and station 4.
On the other side, the micro-plant produces $H_2$ in situ through photolysis and electrolysis. The resulting hydrogen is stored inside the micro-plant's tank, whose capacity is equal to $C^{MP}$. We suppose that the time space $\{0, \dots, TMax\}$ is divided into $N$ periods $P_i = [p \cdot i, p \cdot (i + 1)]$, $i = 0, \dots, N - 1$, with $TMax = N \cdot p$. We identify index $i$ and period $P_i$. If the micro-plant is active at some time during period $i$, then it is active during the whole period $i$, and produces $R_i$ hydrogen fuel units. For safety reasons, the vehicle cannot refuel while the micro-plant is producing: on the experimental platform we are referring to, making hydrogen simultaneously arrive into the micro-plant's tank and leave it in order to be loaded into the vehicle's tank induces variations of pressure which raise safety issues. Besides, any vehicle refueling transaction requires a whole period $i$. At time $0$, the load of the micro-plant tank is $H_0 \leq C^{MP}$ and the micro-plant is not active. The same situation should hold at time $TMax$.
Producing hydrogen fuel has a cost, which may be decomposed into 2 components:
-A constant activation cost $Cost^F$, which is charged every time the micro-plant is activated.
-A time-dependent production cost $Cost^V_i$, which is independent of the amount of hydrogen really produced during period $i$ and reflects the time-indexed prices charged by the electricity provider.
In such a case, the vehicle refuels twice: the first refueling transaction, performed at period 4, involves 13 fuel units, and the second one, performed at period 12, involves 12 fuel units. The resulting tour ends at time 30 and the production cost is $3 \cdot 7 + 2 + 6 + 1 = 30$. It follows that the resulting global cost is $30 + 30 = 60$.
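The two cost components above can be illustrated by a short Python sketch. It assumes that an activation is charged whenever the micro-plant is active during a period without having been active during the previous one (the micro-plant being inactive before period 0); the function name `production_cost` and the numerical data are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def production_cost(z: Sequence[int], cost_f: float, cost_v: Sequence[float]) -> float:
    """Return sum_i (cost_f * y_i + cost_v[i] * z[i]) for a 0/1 production vector z.

    Assumption: y_i = 1 exactly when the plant is active in period i and was
    not active in period i - 1 (it is inactive before period 0).
    """
    if len(z) != len(cost_v):
        raise ValueError("z and cost_v must have one entry per period")
    total = 0.0
    previously_active = 0  # fictitious period -1: the micro-plant is not active
    for z_i, cost_v_i in zip(z, cost_v):
        if z_i not in (0, 1):
            raise ValueError("z must be a 0/1 vector")
        if z_i == 1:
            if previously_active == 0:
                total += cost_f  # activation cost, charged once per activation
            total += cost_v_i    # period cost, whatever the amount produced
        previously_active = z_i
    return total


# Hypothetical data: two activations (periods 0 and 3), three active periods.
print(production_cost([1, 1, 0, 1], cost_f=7, cost_v=[2, 6, 5, 1]))
```

Keeping the plant active over consecutive periods avoids paying the activation cost again, which is the trade-off that any production schedule has to handle.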
The following Table 1 summarizes the input data for the EPC problem:
Table 1. Input data for the EPC problem.
Vehicle related input
$TMax$: maximal time for the vehicle to achieve its tour
$C^{Veh}$: vehicle tank capacity
$E_0$: initial vehicle hydrogen load
$t_j \geq 0$, with $j = 0, \dots, M$: required time to go from station $j$ to station $j + 1$
$d_j \geq 0$, with $j = 0, \dots, M$: required time to go from station $j$ to the micro-plant
$d^*_j \geq 0$, with $j = 0, \dots, M$: required time to go from the micro-plant to station $j$
$e_j \geq 0$, with $j = 0, \dots, M$: required energy to go from station $j$ to station $j + 1$
$\varepsilon_j \geq 0$, with $j = 0, \dots, M$: required energy to go from station $j$ to the micro-plant
$\varepsilon^*_j \geq 0$, with $j = 0, \dots, M$: required energy to go from the micro-plant to station $j$
Micro-plant production related input
$C^{MP}$: micro-plant tank capacity
$N$: number of production periods
$p$: duration (in time units) of one production period
$H_0$: initial micro-plant hydrogen load
$P_i = [p \cdot i, p \cdot (i + 1)]$, with $i = 0, \dots, N - 1$: time interval related to production period $i$
$R_i \geq 0$, with $i = 0, \dots, N - 1$: production rate related to period $i$
$Cost^V_i \geq 0$, with $i = 0, \dots, N - 1$: production cost related to period $i$

An integrated Mathematical Programming (MP) model
MP is not well-fitted to EPC. Still, we may use it in order to formulate our problem in an unambiguous way, based upon 3 main groups of variables:
-Production variables:
• $z_i \in \{0, 1\}$, with $i = -1, \dots, N - 1$: $z_i = 1$ ∼ the micro-plant is active during period $i$ ($i = -1$ corresponds to a fictitious period);
• $y_i \in \{0, 1\}$, with $i = 0, \dots, N - 1$: $y_i = 1$ ∼ the micro-plant is activated at the beginning of $i$;
• $V^{Tank}_i$ is the hydrogen load of the micro-plant tank at the beginning of period $i$; we involve here a fictitious period $N$ in order to express the fact that the load of the micro-plant tank at the end of the process should be at least equal to $H_0$;
• $\delta_i \in \{0, 1\}$, with $i = 0, \dots, N - 1$: $\delta_i = 1$ ∼ the vehicle is refueling during period $i$;
• $L^*_i \geq 0$, with $i = 0, \dots, N - 1$, with non negative integer values: in case $\delta_i = 1$, $L^*_i$ is the quantity of hydrogen loaded by the vehicle during period $i$.
Constraints come as follows (for a better understanding, we use here a logical formulation, which is easy to linearize through the Big M technique):
• For any $i = 0, \dots, N - 1$: (E1)
((E1) means that, at any time, the vehicle must be able to go to the micro-plant and refuel; it relies on the Triangle Inequality for the energy coefficients $e_j$ and $\varepsilon_j$.)
• For any $j = 0, \dots, M$:
-Synchronization constraints:
• For any $j = 0, \dots, M$: (E2)
((E2) expresses that load $L^*_i$ can exceed neither the current load of the micro-plant tank nor the difference between $C^{Veh}$ and the current load of the vehicle tank.)

A demander/producer decomposition scheme
We now explain how the above EPC model may be decomposed in a way which emulates how decisions are taken in the (realistic) case where the decision process is collaborative, which means that the production manager and the vehicle driver are independent players who are required to communicate. The main idea is that a natural behavior for a decentralized vehicle driver is to move as if he were sure to get enough fuel every time he goes to the micro-plant, and to adapt to the real context by waiting and possibly delaying some moves. Conversely, a natural, market-oriented behavior for the production manager consists in adapting to demand and striking a trade-off between Quality of Service and production cost.

VD: Vehicle Driver model
We forget here the restrictions related to hydrogen production and do as if the micro-plant were able to provide the vehicle, at any time, with as much energy as it needs. Then our goal is to fix the Refueling Strategy of the vehicle, that is the $\{0, 1\}$-valued vector $x = (x_j, j = 0, \dots, M)$ and the load vector $L = (L_j, j = 0, \dots, M)$ of the above model, which tell us for which stations $j$ the vehicle refuels between $j$ and $j + 1$, and how much. Variables $T = (T_j, j = 0, \dots, M + 1)$, $T^* = (T^*_j, j = 0, \dots, M + 1)$ and $V^{Veh} = (V^{Veh}_j, j = 0, \dots, M + 1)$ may be considered as auxiliary variables, whose values derive from $x$ and $L$ in a natural way by noticing that:
-The vehicle never waits: it refuels as soon as it arrives at the micro-plant, and keeps full speed meanwhile;
-The last time it refuels, it does so in such a way that it is back at Depot with a load equal to $E_0$. At any other time, it fills its tank.
Constraints are the Vehicle Constraints of the EPC model of Section 2.1 above. What remains to be specified is the performance criterion. Of course, it has to include the $\alpha \cdot T_{M+1}$ term. But it also has to involve a component which reflects what the economic cost of the Refueling Strategy $(x, L)$ is likely to be. Since we do not know in advance what the Production Strategy is going to be, we introduce an auxiliary cost coefficient $\beta$, and consider that the economic cost of the Refueling Strategy $(x, L)$ is the quantity $\beta \cdot \sum_j L_j \cdot x_j$. This coefficient is going to be a key component of the interaction between the vehicle (demander) and the micro-plant (producer). The resulting VD: Vehicle Driver model comes as follows:

The PM: Production Manager model
Since the production manager is supposed to adapt to demand, we suppose here that he is provided with information which reflects the fuel demand of the vehicle driver, as derived from an ad hoc Refueling Strategy. As a matter of fact, a Refueling Strategy $(x, L)$ provides us with a number $Q$ of refueling transactions performed by the vehicle, with related hydrogen loads $\mu_q$, $q = 1, \dots, Q$, and with optimistic dates when those refueling transactions take place. But those dates are going to be the subject of a deal between the vehicle and the micro-plant. In order to bring flexibility to this deal, we use vector $x$ to derive lower bounds $m_1, \dots, m_Q$ and upper bounds $M_1, \dots, M_Q$ for the periods when the refueling transactions take place, as well as minimal delays (time lags) between two such consecutive periods, due to the trip the vehicle is required to achieve between 2 consecutive refueling transactions. The resulting information, which we call a Reduced Refueling Strategy, will consist in:
As for the performance criterion, it clearly must involve the economic cost $\sum_{i=0,\dots,N-1} (Cost^F \cdot y_i + Cost^V_i \cdot z_i)$. But it also must contain some component which reflects the role of the term $\alpha \cdot T_{M+1}$ of the objective function of the global EPC model. It follows that the Production Strategy $(z, \delta)$ should aim at minimizing the quantity is the smallest possible}.

The VD PM decomposition
Then we see that EPC may be reformulated in a collaborative way as follows: -VD PM Reformulation of EPC: {Fix parameter β and compute a Refueling Strategy (x, L) in such a way that Reduced Refueling

An exact integrated DPS EPC algorithm
The MP model of EPC of Section 2.1 involves a large number of heterogeneous variables tied together by logical implications. Not surprisingly, ILP libraries fail to compute optimal solutions as soon as $N$ and $M$ stop being small. This suggests that the EPC model is complex, which is confirmed by the following result:
Proof. It is enough to suppose that $Cost^F$ is null and to ensure that the vehicle cannot start before production has been achieved. Then EPC happens to contain the Knapsack problem.
Still, EPC lies at the (hypothetical) border between P and NP, and may be handled through a Dynamic Programming Scheme (DPS), DPS EPC. Since the VD and PM sub-models rely on two distinct representations of time (stations and real time $T \in [0, TMax]$ in the case of the vehicle, periods in the case of the micro-plant), the key point in this scheme lies in the way we link together the related time spaces (synchronization device). This synchronization device relies on two components:
-Relative positioning relations  and == between a real time value $T$ and a period index $i$: for any period $i = 0, \dots, N - 1$, and any real time value $T \in \{0, \dots, TMax\}$, we set:
• $V^{Tank}$ and $V^{Veh}$ are respectively the load of the micro-plant tank at the beginning of period $i$ and the load of the vehicle tank when it arrives at station $j$.
• $T$ is a time value in $0, \dots, TMax$ whose meaning derives from its position with respect to period $i$ according to relations == and :
• If T  i, then the vehicle is on the road to $j$, which it shall reach at time $T$;
• If T  i, then the vehicle is between $j$ and the micro-plant, possibly waiting to be refueled;
• If $T == i$, then the vehicle is at $j$, and decides between keeping on to $j + 1$ or moving to the micro-plant.
We derive from this synchronization device the other components of DPS EPC in a natural way:
-Decisions, Preconditions, Transitions and Costs: a decision D related to a time pair $(i, j)$ and a state , with the meaning:
• $z = 1$ ∼ the micro-plant decides to produce during period $i$;
• $x$ refers to a decision taken only when $T == i$: in such a case, $x = 0$ means that the vehicle moves from station $j$ to station $j + 1$ without refueling; $x = 1$ means that it refuels at the micro-plant while traveling from $j$ to $j + 1$.
• $\delta = 1$ ∼ the vehicle is located at the micro-plant and decides to refuel during period $i$, forbidding the micro-plant to be active during this period. It requires T  i and $p \cdot i - T \geq d_j$. The decision is taken at the end of period $i - 1$.
For any time pair $(i, j)$ and state $s = (Z, T, V^{Tank}, V^{Veh})$, no more than 4 decisions D are feasible: Then the vehicle is moving from $j$ to the micro-plant, and it cannot refuel yet. The only choice is about $z$, with preconditions and costs as in the first case.
Resulting states and costs are as in the first case.
• Refueling during period $i$: $z = 0$; $\delta = 1$. It requires $\varepsilon_{j+1} + \varepsilon^*_{j+1} \leq Inf(C^{Veh}, V^{Tank} + V^{Veh} - \varepsilon_j)$. We shift from $(i, j)$ to $(i + 1, j + 1)$ and resulting state is s = (0, p. . Notice that if it is possible for the vehicle to come back to Depot without refueling anymore, the above formula must be modified in order to express that the vehicle only refuels what it needs in order to be back to Depot with a load equal to $E_0$.
• Doing nothing during period $i$: $z = 0$; $\delta = 0$. We shift from $(i, j)$ to $(i + 1, j)$, the resulting state is $s = (0, T, V^{Tank}, V^{Veh})$, and the transition cost is null.
• 4th case: $T == i$. Then we have 4 possibilities:
• Producing during period $i$ and keeping the vehicle towards $j + 1$:
• Not producing during period $i$ and keeping the vehicle towards $j + 1$: and transition cost is $\alpha \cdot t_j$, else we shift from $(i, j)$ to $(i, j + 1)$, with same resulting state and transition cost.
• Not producing during period $i$ and moving the vehicle to the micro-plant: $z = 0$, $x = 1$.
• Producing during period $i$ and moving the vehicle to the micro-plant: $z = 1$, $x = 1$.
• If $s'$ appears in $S(i', j')$ with a value $W^*$, then if $W^* > W + CT$, then $W + CT$ becomes the value associated with $(i', j')$ and $s'$, else we discard $s'$.
We denote by DPS EPC the DPS algorithm designed this way.
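The update rule which closes the description above (a state reached again through a transition is kept only if the new value improves on the stored one) can be sketched in Python as follows. This is a minimal illustration under assumptions: each set S(i, j) is represented by a dictionary mapping a hashable state to its best known value, and the name `relax` as well as the sample states are hypothetical.

```python
from typing import Dict, Hashable


def relax(bucket: Dict[Hashable, float], state: Hashable, value: float) -> bool:
    """Insert `state` with `value` into one set S(i', j'), or improve its value.

    Returns True when the table changed, False when the candidate is discarded
    because a value at least as good was already recorded for this state.
    """
    best_known = bucket.get(state)
    if best_known is None or value < best_known:
        bucket[state] = value  # W + CT becomes the value associated with the state
        return True
    return False  # the candidate is discarded


# Hypothetical usage: states are tuples (Z, T, V_tank, V_veh).
bucket: Dict[Hashable, float] = {}
relax(bucket, (0, 12, 5, 8), 40.0)
relax(bucket, (0, 12, 5, 8), 35.0)  # better value: replaces 40.0
relax(bucket, (0, 12, 5, 8), 50.0)  # worse value: discarded
print(bucket)
```

A full implementation would also store, with each value, the decision and predecessor state which produced it, so that a schedule can be retrieved at the end of the process.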

Filtering through rounding: a PTAS result
The number of states that an execution of DPS EPC creates increases significantly with $M$ and $N$. Still, if we suppose that all EPC parameters are integral and that $TMax$, $C^{MP}$ and $C^{Veh}$ are bounded by polynomial functions of $N$ and $M$, then one may check that DPS EPC becomes time-polynomial. This suggests that it should be possible to state some Polynomial Time Approximation Scheme (PTAS) result. As usual, such a result will rely on a rounding scheme:
Rounding DPS EPC states: $L$ and $n = a_0 + a_1 \cdot 2 + \dots + a_q \cdot 2^q$ being 2 integers, with $a_i \in \{0, 1\}$, we first set:
-If $q \leq L$ then $Round(n, L) = n$ else $Round(n, L) = a_{q-L} \cdot 2^{q-L} + \dots + a_q \cdot 2^q$;
-If $q \leq L$ then $Round^*(n, L) = n$ else $Round^*(n, L) = a_{q-L} \cdot 2^{q-L} + \dots + a_q \cdot 2^q + 2^{q-L}$.
We say that two integers $n$ and $m$ are equivalent modulo the $L$ largest bits if $Round(n, L) = Round(m, L)$. As for the $Round^*$ function, it will allow us to ease capacity and initial state constraints.
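The two rounding functions can be written in a few lines of Python. The sketch below assumes that $q$ is the index of the most significant bit of $n$; the names `round_bits`, `round_star` and `equivalent` are hypothetical.

```python
def round_bits(n: int, L: int) -> int:
    """Round(n, L): keep bits q-L..q of n, where q is the index of its top bit."""
    if n < 0 or L < 0:
        raise ValueError("n and L must be non negative")
    q = n.bit_length() - 1  # index of the most significant bit (-1 when n == 0)
    if q <= L:
        return n
    shift = q - L
    return (n >> shift) << shift  # clear the q - L lowest bits


def round_star(n: int, L: int) -> int:
    """Round*(n, L): Round(n, L) plus 2^(q-L) when some bits were dropped."""
    q = n.bit_length() - 1
    if q <= L:
        return n
    return round_bits(n, L) + (1 << (q - L))


def equivalent(n: int, m: int, L: int) -> bool:
    """True when n and m are equivalent modulo the L largest bits."""
    return round_bits(n, L) == round_bits(m, L)


# 91 = 0b1011011: q = 6, so with L = 2 the four lowest bits are dropped.
print(round_bits(91, 2), round_star(91, 2), equivalent(91, 85, 2))
```

By construction, $Round(n, L) \leq n \leq Round^*(n, L)$, and $Round^*(n, L) - n \leq 2^{q-L} \leq n / 2^L$, so rounding up with $Round^*$ relaxes a capacity or an initial value by a relative amount of at most $2^{-L}$.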
Turning the DPS EPC algorithm into a parametrized polynomial time algorithm DPS EPC(K): we set $L_0$ = smallest integer $L$ such that $2^L \geq N + M + 1$ and, for any integer $K \geq 1$, we proceed as follows:
(1) We ease initial values $H_0$ and $E_0$ by replacing them respectively by $Round^*(H_0, K + 1)$ and $Round^*(E_0, K + 1)$. In the same way, we relax the time capacity constraint by replacing $TMax$ by $Round^*(TMax, K)$ and capacities $C^{MP}$ and $C^{Veh}$ respectively by $Round^*(C^{MP}, K)$ and $Round^*(C^{Veh}, K)$: this means that we check the feasibility of any decision $D = (z, x, \delta)$ with respect to time capacity $Round^*(TMax, K)$ and hydrogen capacities $Round^*(C^{MP}, K)$ and $Round^*(C^{Veh}, K)$.
requires $\Omega_1 = \Omega_2$. This means that the relative positioning of $T$ and $i$ through relations ,  and == acts as an explicit state variable, in order to compensate for the fact that those relations may be perturbed by rounding effects, and that we perform all tests which involve relations ,  and == while referring to $\Omega$.
(3) We ensure that, at any time, $S(i, j)$ does not contain 2 extended states $s_1$, $s_2$, together with values $W_1$, $W_2$, such that $s_1$ and $s_2$, as well as $W_1$ and $W_2$, are equivalent modulo the $K + 2 \cdot L_0$ largest bits. We give priority to the state $s_q$, $q = 1, 2$, related to the smallest $W_q$ value.
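Step (3) can be sketched in Python as a merge of states which share the same rounded key. The sketch is only illustrative: it assumes that every state component and every value is a non negative integer, it leaves aside the extra component $\Omega$ of the extended states, and the names `round_bits`, `merge_key` and `insert_rounded` are hypothetical.

```python
from typing import Dict, Tuple

State = Tuple[int, ...]


def round_bits(n: int, bits: int) -> int:
    """Keep bits q-bits..q of n, where q is the index of its top bit."""
    q = n.bit_length() - 1
    if q <= bits:
        return n
    shift = q - bits
    return (n >> shift) << shift


def merge_key(state: State, value: int, bits: int) -> Tuple[int, ...]:
    """Key shared by all (state, value) pairs equivalent modulo the largest bits."""
    return tuple(round_bits(c, bits) for c in state) + (round_bits(value, bits),)


def insert_rounded(bucket: Dict[Tuple[int, ...], Tuple[State, int]],
                   state: State, value: int, bits: int) -> None:
    """Keep, for each rounded key, only the pair with the smallest value."""
    key = merge_key(state, value, bits)
    kept = bucket.get(key)
    if kept is None or value < kept[1]:
        bucket[key] = (state, value)


# Hypothetical usage with bits = 1: the three candidates share one key.
bucket: Dict[Tuple[int, ...], Tuple[State, int]] = {}
insert_rounded(bucket, (100, 7), 40, bits=1)
insert_rounded(bucket, (101, 7), 41, bits=1)  # equivalent, larger value: dropped
insert_rounded(bucket, (101, 7), 33, bits=1)  # equivalent, smaller value: kept
print(bucket)
```

Since the number of distinct keys is bounded by a function of the number of retained bits, this is what keeps the size of each set $S(i, j)$ under control.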
Then we may state: Besides, for any value $\varepsilon > 0$, we may choose $K$ large enough that, in case EPC admits an optimal solution with value $W^{Opt}$, DPS EPC(K) yields a solution which is feasible with regard to initial values $(1 + \varepsilon/2) \cdot H_0$ and $(1 + \varepsilon/2) \cdot E_0$, capacity values $(1 + \varepsilon) \cdot C^{MP}$, $(1 + \varepsilon) \cdot C^{Veh}$ and $(1 + \varepsilon) \cdot TMax$, and whose cost value is no larger than $W^{Opt}$.
Proof. It is a mere consequence of the way we have implemented our rounding strategy: the priority given in (3) to the state $s_q$, $q = 1, 2$, with the smallest $W_q$ value allows us to compute a solution with cost value no larger than

Logical filtering devices
In spite of the above result, the number of states becomes too large when $N$ and $M$ increase. In order to reduce it, we may apply the following Strong Dominance Rule: given together with values $W_1$ and $W_2$ are such that: This rule has little filtering power. But other devices may be implemented. We distinguish:
-Logical filtering rules: at any time pair $(i, j)$ during the DPS EPC process, we anticipate that it will not be possible to extend the current state $s = (Z, T, V^{Tank}, V^{Veh})$ into a feasible schedule, either because there is not enough time left (Makespan Based rules) or because it will not be possible to achieve the required energy production (Energy Based rules), and so we kill $s$.
-Quality based filtering rules: at time pair $(i, j)$, we check whether the cost of any schedule consistent with the current state $s = (Z, T, V^{Tank}, V^{Veh})$ will be at least equal to the cost of some current feasible schedule.

Logical filtering rules
For any period $i = 0, \dots, N - 1$, we denote by $ProdMax(i)$ the maximal quantity $\sum_{k \geq i} R_k$ that the micro-plant can produce from time $p \cdot i$ on. Also, for any station $j = 0, \dots, M$, we get a rough estimation of both the energy and the time required by the vehicle in order to return from $j$ to Depot by applying the following process:
(1) Makespan Based filtering rule: if $\Delta_j \geq TMax - T + 1$ then kill $s$, since there is not enough time left for the vehicle to achieve its trip.
(2) Energy Based filtering rule: if $Refuel_j > V^{Veh} + ProdMax(i) + V^{Tank}$ then kill $s$, since there will not be enough energy for the vehicle to achieve its trip.
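Both logical rules reduce to simple comparisons once $ProdMax$ has been tabulated as suffix sums of the production rates. The Python sketch below takes the estimates $\Delta_j$ and $Refuel_j$ as given inputs, since their computation is not reproduced here; the function names and the numerical values are hypothetical.

```python
from typing import List, Sequence


def prod_max_table(rates: Sequence[int]) -> List[int]:
    """table[i] = sum of rates[k] for k >= i, with table[N] = 0."""
    table = [0] * (len(rates) + 1)
    for i in range(len(rates) - 1, -1, -1):
        table[i] = table[i + 1] + rates[i]
    return table


def kill_by_makespan(delta_j: int, t: int, t_max: int) -> bool:
    """Makespan Based rule: not enough time left to complete the trip."""
    return delta_j >= t_max - t + 1


def kill_by_energy(refuel_j: int, v_veh: int, v_tank: int, prod_max_i: int) -> bool:
    """Energy Based rule: the energy still needed exceeds what can be made available."""
    return refuel_j > v_veh + prod_max_i + v_tank


prod_max = prod_max_table([3, 0, 2])
print(prod_max)
print(kill_by_makespan(delta_j=5, t=26, t_max=30))
print(kill_by_energy(refuel_j=20, v_veh=5, v_tank=4, prod_max_i=10))
```

Tabulating the suffix sums once costs $O(N)$ and makes each energy test a constant-time lookup during the dynamic programming process.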

Quality based filtering rules
For any period $i = 0, \dots, N - 1$, any state value $Z$ of the micro-plant at the end of period $i - 1$, and any energy amount $V$, we pre-compute (through a backward driven DPS) the minimal cost $CostMin(i, V, Z)$ required from the micro-plant to produce $V$ energy units between time $p \cdot i$ and time $TMax$, while starting with state $Z$ at the beginning of period $i$: Let us consider now some time pair $(i, j)$ and some related state $s = (Z, T, V^{Tank}, V^{Veh})$ with value $W$. We deduce from the above pre-computation a lower bound LB for the best EPC value which may be derived from

Complexity of VD
We are going to prove that VD is polynomial. However, we are still at the border between P and NP since: Proof. For any $j$, let us set: Let us also set $C = \sum_j e_j$. Finally, let us suppose $C^{Veh} = +\infty$, $\beta = 0$ and $TMax = +\infty$. Then we see that $x$ should be such that: Under those assumptions, VD coincides with Knapsack. Conversely, any Knapsack instance may be turned this way into a VD instance.
The VD Graph: in order to deal with VD, we are going to show that it may be handled as a simple shortest path problem in the following oriented graph VD Graph:
-Nodes of VD Graph are:
• A source $s$ and a sink $p = ((M + 1), E_0)$; where $V$ is a non negative number, no larger than $C^{Veh}$;
-Arcs of VD Graph are:
• Any arc $u = (s, (0, V))$, with non positive cost $D_u = (V - E_0)$;
• Any arc $u = ((j, V), (j + 1, V - e_j))$, with null cost $D_u$; .
The meaning of this construction comes through the following Lemma 4.2:
Lemma 4.2. Solving VD means searching for a shortest path from $s$ to $p$ in VD Graph.
Proof. We say that a Refueling Strategy $(x, L)$ satisfies the Empty Tank Hypothesis if:
-Every time but the first time the vehicle refuels, it arrives at the micro-plant with an empty tank;
-It comes back to Depot with a hydrogen load exactly equal to $E_0$.
Then we notice that an optimal Refueling Strategy $(x, L)$ may be chosen in such a way that it satisfies this Empty Tank Hypothesis. If $(x, L)$ does not agree with the Empty Tank Hypothesis, and the vehicle arrives at the micro-plant with a non null load $\delta > 0$, then it is possible to decrease the former refueling transaction and to increase the current refueling transaction by the same amount, no larger than $\delta$, until cancelling the related refueling transaction or making it Empty Tank. It follows that:
-Nodes of VD Graph correspond to the possible states of the vehicle when it performs its trip from Depot = 0 to Depot = $M + 1$, while visiting stations $j = 1, \dots, M$ and periodically refueling. In case the vehicle has never refueled before $j$, $V \leq E_0$ means the hydrogen amount it is going to consume before refueling; else it means the current hydrogen load of the vehicle at $j$.
-According to this, we understand the meaning of the arcs:
• Arc $u = ((j, V), (j + 1, V - e_j))$, with null cost $D_u$, means that the vehicle moves directly from $j$ to $j + 1$;
• , means that the vehicle moves from $j$ to $j + 1$ while refueling at the micro-plant: the cost of such a move is the additional cost (energy + time) induced by this detour;
• Arc $u = (s, (0, V))$, with $V \leq E_0$ and negative cost $D_u = (V - E_0)$, means the decision that the vehicle takes at the beginning of the process, when it decides about its first refueling transaction: the related trip from 0 until Depot is then going to require $V$ hydrogen units, and the negative cost $D_u = (V - E_0)$ corresponds to the fact that the vehicle arrives at the micro-plant with an excess load $(E_0 - V)$ which will be used later.
We deduce that any Refueling Strategy $(x, L)$ which satisfies the Empty Tank Hypothesis gives rise to a path in VD Graph, whose cost is exactly the cost of the Refueling Strategy $(x, L)$, and conversely.
As a matter of fact, we do not need the whole graph VD Graph in order to handle VD. If we apply a backward driven Bellman algorithm, then we see that we only deal with the following node collection $X(j)$, $j = 0, \dots, M + 1$, whose recursive definition comes as follows:
Thick arrows represent here the moves of the vehicle which involve a refueling detour. For every arc, $(a + b)$ means the sum of respectively the energy and the time components of the cost of the arc (Fig. 6).
Proof. Because of Lemma 4.2 and the above construction of the Useful VD Subgraph, we see that solving VD means searching for a shortest path from $s$ to $p$ in the Useful VD Subgraph. But, for any $j$, the cardinality of $X(j)$ does not exceed $M - j + 1$, and the number of arcs which connect $X(j)$ to $X(j + 1)$ does not exceed $Card(X(j)) + Card(X(j + 1))$. We conclude.

VD dynamic programming handling
We deal with VD by computing a shortest path in the Useful VD Subgraph according to a backward driven Bellman process. This leads us to the following DPS VD dynamic programming algorithmic scheme:
The DPS VD algorithm: Time, Space, Decisions and Transitions
state $E_0$ at time $M + 1$; $W^{Veh} = \alpha \cdot T + \beta \cdot U$, where $U$ is the energy to be spent by the vehicle before the end of its trip; x• is the decision related to $W^{Veh}$ through backward driven Bellman equations. The initial state, related to time value $M + 1$, is going to be $(E_0, 0)$; final states, related to $j = 0$, should be any pair $(V^{Veh} \leq E_0, T \leq TMax)$. All pairs $(j, s)$ should be nodes of the Useful VD Subgraph.
-Decision Space: a decision at time $j$ is a binary number $x \in \{0, 1\}$: $x = 0$ means a direct move from $j$ to $j + 1$ without refueling, while $x = 1$ means that a refueling detour through the micro-plant has to be performed before reaching $j + 1$.
-Backward driven strategy, transitions and costs: we follow Theorem 4.4 and the Useful VD Subgraph construction, and so perform our DPS process according to a backward driven strategy. Notice that running our algorithm in case $\alpha = 0$ and $\beta = 1$, or $\alpha = 1$ and $\beta = 0$, provides us, for any $j$ and any current load $V^{Veh}$, with the minimal time and energy amount required in order to achieve the vehicle tour from station $j$ with load $V^{Veh}$. We may store this information in order to use it as a filtering tool for the handling of the DPS EPC algorithm of Section 3. The backward transition from any state $s_1 = (V^{Veh}_1, T_1)$ at time $j + 1$ induced by a decision $x$ taken at time $j \geq 0$ comes as follows:
• If $x = 0$, then the resulting state $s = (V^{Veh}, T)$ associated with $j$ is equal to $(V^{Veh}_1 + e_j, T_1 + t_j)$. This transition requires $T \leq TMax$ and $V^{Veh} \leq C^{Veh}$; its cost is 0.

Retrieving a (primary) refueling strategy and a reduced refueling strategy
According to Section 4.2, we retrieve the full vector $x$, together with the load vector $L$, by applying the following process:
Refueling Strategy Retrieval procedure: $j \leftarrow 0$; $V \leftarrow V^{Veh} \leq E_0$, such that the related $T \leq TMax$ and provided with minimal $W^{Veh} - \beta \cdot E_0$ value; Let $V_1$, $T_1$ be the time and energy values resulting from x•;
The above process yields what we call a Primary Refueling Strategy, that is vectors $x$ and $L$, together with a number $Q$ and $\{0, \dots, Q\}$-indexed vectors $TInf$, $TSup$ and $\mu$, with the following meaning:
-$Q$ is the number of refueling transactions performed by the vehicle;
-For $q = 1, \dots, Q$:
• $\mu_q$ is the quantity of fuel which is loaded during the refueling transaction $q$;
• $TInf_q$ ($TSup_q$) is the earliest (latest) time when this refueling transaction may start (we consider $q = 0$ as referring to a fictitious refueling transaction, with $TInf_0 = 0$ and $TSup_0 = TMax - T$, where $T$ is the time value associated with an optimal initial state);
• $\Delta_q$ is the minimal delay between the date of the $q$th refueling transaction and the date of the $(q + 1)$th refueling transaction (if $q = 0$ then $\Delta_0 = TInf_1$).
Reduced Refueling Strategy: in order to synchronize this Refueling Strategy with the hydrogen production process, we need to express (routine process) values $TInf$, $TSup$ and $\Delta$ in terms of periods $i = 0, \dots, N - 1$. By doing this, we derive what we call a Reduced Refueling Strategy, that is:

Dealing with PM through dynamic programming
In order to deal with this Production problem, we apply, as for the general EPC Problem, a forward driven dynamic programming algorithm DPS PM. But the sizes of the related Time, State and Decision spaces are significantly smaller. As a matter of fact, those components come as follows:
• $Z = 1$ means that the micro-plant is active at the end of period $i - 1$.
• $V^{Tank}$ means the current load of the micro-plant tank at the beginning of period $i$.
• $Rank \in \{0, \dots, Q\}$ means that the $Rank$th refueling transaction has been performed and that we are waiting for the $(Rank + 1)$th refueling transaction.
• $z = 1$ means that the micro-plant will produce during period $i$;
• $\delta = 1$ means that the vehicle will perform its $(Rank + 1)$th refueling transaction during period $i$.
Since production and refueling transactions cannot be performed simultaneously, there are only 3 possible decisions.

Filtering devices and the Greedy PM algorithm
As in the case of DPS EPC, we may enhance DPS PM through filtering devices:
-Filtering through rounding: as in Section 3.1, we may round values $V^{Tank}$ and $W^{Prod}$ and turn DPS PM into a parametrized algorithm DPS PM(K) which is time-polynomial for any fixed $K$, and such that, for any value $\varepsilon > 0$, $K$ may be chosen in such a way that, in case PM admits an optimal solution with value $W^{Prod\text{-}Opt}$, DPS PM(K) yields a solution which is feasible with regard to initial value $(1 + \varepsilon/2) \cdot H_0$ and capacity value $(1 + \varepsilon) \cdot C^{MP}$, and whose cost value is no larger than $W^{Prod\text{-}Opt}$.
-Dominance based filtering rules: for any $i$, if states $E_1 = (Z_1, V^{Tank}_1, Rank_1, Delay_1)$ and E 2 = Z 2 , V Tank
-Logical filtering rule (check the feasibility of the process with regard to production capacity). For any $i$, if state $E = (Z, V^{Tank}, Rank, Delay)$ is such that:
-Quality based filtering rule: it involves, as in Section 3.2.2, the pre-computed function $CostMin(i, V, Z)$, which provides us with the minimal cost required from the micro-plant to produce $V$ energy units between time $p \cdot i$ and time $TMax$, $Z$ denoting the state of the micro-plant at the end of period $i - 1$. This function allows us to turn DPS PM into a greedy algorithm Greedy PM:
• For any $i$ and any related state $E = (Z, V^{Tank}, Rank, Delay)$, the hydrogen quantity which remains to be produced is $V = \sum_{q \geq Rank+1} \mu_q + H_0 - V^{Tank}$;
• In the same way, the $Q$th refueling transaction cannot take place before period $m_Q + i - Delay - m_{Rank}$;
• So Greedy PM works by keeping, for any $i$, only one state $E$, and choosing the related feasible decision $(z, \delta)$ in such a way that the resulting state $E_1 = (Z_1, V^{Tank}_1, Rank_1, Delay_1)$ is consistent with the above logical filtering rule and that:
This greedy algorithm may be applied in order to provide us (in case of success) with an initial production strategy $(z, \delta)$ and its value $W^{Prod\,Init}$.
Then our Quality Based Filtering rule may be formulated as follows: for any $i$, if state $E = (Z, V^{Tank}, Rank, Delay)$ and related value $W^{Prod}$ are such that $W^{Prod} + CostMin(i, \sum_{q \geq Rank+1} \mu_q + H_0 - V^{Tank}, Z) + \alpha \cdot (m_Q - Delay - m_{Rank}) \geq W^{Prod\,Init}$, then kill $E$.
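A possible way to pre-compute a function in the spirit of $CostMin(i, V, Z)$ and to apply the Quality Based Filtering rule is sketched below in Python. The recursion used here is an assumption and not the recursion of the paper: it ignores the capacity of the micro-plant tank and the periods blocked by refueling transactions, so that it only yields an optimistic estimate. The names `make_cost_min` and `kill_by_quality`, as well as the numerical data, are hypothetical.

```python
import math
from functools import lru_cache
from typing import Callable, Sequence


def make_cost_min(rates: Sequence[int], cost_v: Sequence[float],
                  cost_f: float) -> Callable[[int, int, int], float]:
    """Build cost_min(i, v, z): optimistic cost of producing v units from period i on.

    z = 1 means the plant is active at the end of period i - 1, so that
    producing during period i does not require a new activation.
    """
    n_periods = len(rates)

    @lru_cache(maxsize=None)
    def cost_min(i: int, v: int, z: int) -> float:
        if v <= 0:
            return 0.0       # nothing left to produce
        if i >= n_periods:
            return math.inf  # no period left: the target cannot be reached
        stay_idle = cost_min(i + 1, v, 0)
        activation = 0.0 if z == 1 else cost_f
        produce = cost_v[i] + activation + cost_min(i + 1, max(v - rates[i], 0), 1)
        return min(stay_idle, produce)

    return cost_min


def kill_by_quality(w_prod: float, cost_min_value: float, alpha: float,
                    m_q: int, delay: int, m_rank: int, w_prod_init: float) -> bool:
    """Quality Based rule: the optimistic total cost cannot beat the incumbent."""
    return w_prod + cost_min_value + alpha * (m_q - delay - m_rank) >= w_prod_init


cost_min = make_cost_min(rates=[2, 2, 2], cost_v=[1, 5, 1], cost_f=3)
print(cost_min(0, 4, 0), cost_min(0, 4, 1))
print(kill_by_quality(10, cost_min(0, 4, 0), 1, m_q=9, delay=3, m_rank=4, w_prod_init=20))
```

Memoizing on $(i, V, Z)$ keeps the pre-computation within $O(N \cdot V_{max})$ evaluations, where $V_{max}$ is the largest energy amount queried, which is small compared with the state space explored by the main process.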

Making VD and PM collaborate: The VD->PM pipe-line
The VD PM Decomposition of Section 2.2.3 suggests that the simplest way to implement a collaboration between the Vehicle Driver and the Production Manager is to make them communicate through a one-way pipe-line: once coefficient $\beta$, which determines the objective function of VD and which behaves as the main communication link between VD and PM, has been fixed, we may compute a Refueling Strategy $(x, L)$ for the vehicle through the DPS VD algorithm, turn it into a Reduced Refueling Strategy $(Q, \mu, m, M, B)$ and apply the DPS PM algorithm of Section 5.
run VD PM Pipe Line and get a value denoted by Pipe Line. Finally we run DPS EPC and get the exact optimal value Exact DPS EPC Value. So we provide the following outputs:
(1) ST Max = Maximal number of states for a given pair $(i, j)$ obtained when only considering the strong dominance rule.
(2) ST LOG = Maximal number of states obtained for a given $(i, j)$ when only applying the Logical Filtering.
(3) State DPS EPC = Maximal number of states obtained for a given (i, j ) when applying all filtering rules.
Comments: one sees that, while the filtering rules based on Strong Dominance have little filtering impact, those based upon logical anticipation and optimistic estimation are significantly more efficient. Still, we keep handling a large number of states as soon as $M$ and $N$ become large. The pipe-line scheme DPS Vehicle -> DPS Production involves significantly fewer states and less CPU time, for an almost negligible gap (Tab. 6).

Conclusion
We have presented here a collaborative dynamic programming scheme designed to solve a scheduling problem which requires synchronization mechanisms. Many issues remain to be addressed: extending our approach to several vehicles; dealing with uncertainties related to hydrogen production; casting the routing issue into the decision process; adapting the algorithms to on-line or dynamic decision making.