FULLY POLYNOMIAL TIME APPROXIMATION SCHEME FOR THE PAGINATION PROBLEM WITH HIERARCHICAL STRUCTURE OF TILES

The pagination problem is described as follows. We have a set of symbols, a collection of subsets over these symbols (we call these subsets the tiles) and an integer capacity 𝐶. The objective is to find the minimal number of pages (a type of container) on which we can schedule all the tiles while following two fundamental rules: we cannot assign more than 𝐶 symbols to each page, and a tile cannot be broken into several pieces, i.e., all of its symbols must be assigned to at least one of the pages. The difference from the Bin Packing Problem is that tiles can merge: if two tiles share a subset of symbols and are assigned to the same page, this subset is assigned only once to the page (and not several times). In this paper, as this problem is NP-complete, we consider a particular case of the dual problem, where we have exactly two pages for which the capacity must be minimized. We present a fully polynomial time approximation scheme (FPTAS) to solve it. This approximation scheme is based on the simplification of a dynamic programming algorithm and it has a strongly polynomial time complexity. The conducted numerical experiments show its practical effectiveness.

We are given a set of symbols $\mathcal{A}$ (the former memory pages). All the symbols have the same size, which is equal to 1. We are also given a collection of subsets of these symbols called the tiles $\mathcal{T}$ (former virtual machines) and an integer capacity $C$. The size of a tile is the number of symbols it contains. We have to find the minimal number of pages (former physical servers) that can contain all tiles without exceeding a maximal number of $C$ symbols per page. The other fundamental and inherent rule of the problem is that a tile cannot be broken: all its symbols must be assigned to at least one page to be able to claim that the tile has been allocated to the page. The NP-hardness proof of this problem is developed in [5].
In Figure 1 (extracted from [4]), we can see an example of an input for the Pagination Problem. There are four different tiles and three of them partially merge. For example, tile $t_1$, which contains five symbols, merges with tile $t_2$, which contains three symbols, because their intersection is not empty: $t_1 \cap t_2$ contains two symbols.
It is worth mentioning again that we exclude the case where a tile would be entirely contained in another one, as this situation is of no interest (it can be easily detected and avoided).
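The merging rule can be illustrated with a few lines of Python. This is a minimal sketch and not part of the original study: the function names `page_load` and `fits` and the symbol names are hypothetical, chosen only so that a tile of five symbols and a tile of three symbols share two symbols.

```python
def page_load(tiles):
    """Number of distinct symbols a page must hold to contain all given tiles."""
    # Shared symbols are stored only once, hence the set union.
    return len(set().union(*tiles))


def fits(tiles, capacity):
    """True if all given tiles can share one page of the given capacity."""
    return page_load(tiles) <= capacity


# Hypothetical symbols: t1 and t2 overlap on {"d", "e"}.
t1 = {"a", "b", "c", "d", "e"}
t2 = {"d", "e", "f"}

print(page_load([t1, t2]))          # 6, not 5 + 3 = 8
print(fits([t1, t2], capacity=6))   # True
```

With plain Bin Packing the two tiles would need 8 units of capacity; with merging, 6 are enough.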
In [14], the authors claimed that "[. . .] The inter-VM sharing occurs largely in hierarchical fashion" and they proposed two hierarchical sharing models to represent this organisation: a model based on a tree and a more general model based on a cluster-tree. Two illustrative examples are shown in Figure 2 (both illustrations are from [14]).
In this paper, we consider that the tiles are organised according to the tree model. In the tree representation of Figure 2, each node contains one piece of information, i.e., one symbol. However, by allowing multiple symbols per node, we create the most generic inputs for our subproblem. That is why, from now on, we present the tree model with numbers in the nodes. We label the nodes using a breadth-first traversal, which starts with the root. Each number $i \in \{1, 2, \dots, m\}$ (with $m$ the number of nodes in the tree) represents the name of a node and $p_i \in \mathbb{R}$ is equal to the number of symbols contained in node $i$, also called the size of the node. For example, in Figure 3, the node containing two symbols is labelled 5 and its size $p_5$ is equal to 2.
There are as many leaves as there are tiles in the input. We denote by $n$ the number of leaves in the tree (the size of a tile is equal to the sum of the sizes of the corresponding nodes). We can also verify that $|\mathcal{A}| = \sum_{i=1}^{m} p_i$ and, here again, each symbol is contained in only one node.
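The tree model can be made concrete with a small sketch. The tree below is hypothetical (it is not the one of Figure 3), and the dictionary-based encoding and the name `tile_sizes` are assumptions made for illustration. Each leaf closes one tile, whose size is the sum of the node sizes on the path from the root to that leaf.

```python
def tile_sizes(children, size, root=1):
    """Map each leaf to the size of its tile (sum of node sizes from root to leaf)."""
    sizes = {}
    stack = [(root, size[root])]
    while stack:
        node, acc = stack.pop()
        kids = children.get(node, [])
        if not kids:                      # a leaf closes exactly one tile
            sizes[node] = acc
        for kid in kids:
            stack.append((kid, acc + size[kid]))
    return sizes


# Hypothetical tree, nodes labelled breadth-first from the root (node 1).
children = {1: [2, 3], 2: [4, 5]}
size = {1: 1, 2: 1, 3: 2, 4: 1, 5: 2}     # node 5 holds two symbols

tiles = tile_sizes(children, size)
total_symbols = sum(size.values())        # each symbol lives in exactly one node
print(tiles)
print(total_symbols)
```

The three leaves give three tiles, and the total number of symbols is the sum of the node sizes, as stated above.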
It is easy to observe that the well-known 2-PARTITION problem can be reduced to our problem $2 \mid \text{merging}, \text{tree} \mid C_{\max}$. Recall that the 2-PARTITION problem consists in determining whether a set of $n$ given integers $(e_1, e_2, \dots, e_n)$ can be partitioned into two disjoint subsets $E_1$ and $E_2$ such that $\sum_{i \in E_1} e_i = \sum_{i \in E_2} e_i$ (for further information, see [11]). Indeed, we can consider an input where we have a tree with one root and $n$ leaves (i.e., $n$ tiles). Every tile $i$ is associated with the root ($p_1 = 0$) and with its specific leaf (containing its own $p_{i+1} = e_i$ symbols, with $i = 1, 2, \dots, n$). These hypotheses have several consequences. First, the only symbol shared by all tiles is put in the root of the tree. Then, as the tiles do not share any other symbol, all the vertices apart from the root will have only one child.
This example illustrates how one can transform an input of the 2-PARTITION problem into an equivalent input of our problem and thus perform the polynomial reduction 2-PARTITION $\prec$ $2 \mid \text{merging}, \text{tree} \mid C_{\max}$. Such a reduction proves that $2 \mid \text{merging}, \text{tree} \mid C_{\max}$ is NP-hard (its decision version is NP-complete).
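The reduction can be sketched on a toy input. The helper names `partition_to_tree` and `best_capacity_star` are hypothetical, and the brute-force search is only there to check the equivalence on tiny inputs; it is not the algorithm studied in this paper. The sketch assumes an empty root (size 0) and one leaf per integer.

```python
from itertools import product


def partition_to_tree(values):
    """Build the star-shaped tree: an empty root (node 1) plus one leaf per integer."""
    children = {1: list(range(2, len(values) + 2))}
    size = {1: 0}
    for i, value in enumerate(values, start=1):
        size[i + 1] = value               # leaf i + 1 carries the i-th integer
    return children, size


def best_capacity_star(children, size):
    """Brute-force optimum for two pages on a star tree (exponential, tiny inputs only)."""
    leaves = children[1]
    best = None
    for assignment in product((0, 1), repeat=len(leaves)):
        loads = [0, 0]
        for leaf, page in zip(leaves, assignment):
            loads[page] += size[leaf]
        for page in (0, 1):
            if page in assignment:        # the root is stored once per used page
                loads[page] += size[1]
        capacity = max(loads)
        if best is None or capacity < best:
            best = capacity
    return best


values = [3, 1, 1, 2, 2, 1]               # hypothetical 2-PARTITION input, sum = 10
children, size = partition_to_tree(values)
print(2 * best_capacity_star(children, size) == sum(values))  # True: a perfect split exists
```

A perfect partition exists exactly when the optimal capacity equals half of the sum of the integers.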

Introduction
In order to use the dynamic programming (DP) principle, we have to define what a state of this algorithm is and then establish the recursive relationship we will use in it. A state has four integer components $[a, b, i, j]$:
- $a$: number of symbols assigned to page 1;
- $b$: number of symbols assigned to page 2;
- $i$: index of the last tile on page 1;
- $j$: index of the last tile on page 2.
The second step in the design of a DP algorithm is to devise the recursive relationship. In our case, it is simple: the relationship reflects the choice we have to make when we schedule a tile. This choice can be presented as a question: shall we assign the current tile to page 1 or to page 2? In the DP algorithm displayed in Algorithm 1, both choices are explored (see lines 6 and 7).

Pre-process
As the main idea in our algorithm is based on the tree representing how tiles share symbols, it is natural to process the tiles following an order derived from that tree. We begin with the tile ending with the left-most leaf ($k = 1$), then we continue with the tile ending with the leaf just to the right of the previous one ($k = 2$), and so on until all the tiles are handled (when we arrive at the right-most tile, with $k = n$). Of course, we can process the tiles in the opposite order (starting with the right-most tile and going toward the left-most one) with the same algorithm.
Before actually presenting the pseudocode of the DP algorithm, we need to introduce one more thing we use in it: a matrix M. A cell in the matrix represents the number of symbols we need to add to a page when we want to assign a tile to it. Of course, this number depends on the tile we are considering, but it also depends on the tiles already scheduled on this page.
Thanks to the hierarchy among the tiles, and if the tiles are processed in a correct order (as described above), it is easy to see that we only need to know the last tile assigned to a page in order to know how many symbols we have to add to this page to schedule the new tile.
Let us study a quick example. Consider one of our two pages and suppose that tile $j$ is the last tile assigned to it. The value $M(j, k)$ represents the number of symbols we have to add to this page as a consequence of scheduling tile $k$.
The matrix is filled in a pre-process, before the beginning of the dynamic programming algorithm; that is why this step is ignored in Algorithm 1. This is also the reason why this computation is not taken into account later in the time complexity of the DP algorithm.
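One possible way to fill the cost matrix is sketched below. It assumes that each tile is given as its root-to-leaf path and that tiles are indexed from left to right; the name `cost_matrix` and the convention that row 0 stands for an empty page are assumptions of this sketch, not taken from the paper. The entry for a last tile `j` and a new tile `k` is the total size of the nodes of tile `k` that tile `j` does not already bring to the page.

```python
def cost_matrix(paths, size):
    """M[j][k]: symbols to add when tile k follows tile j on a page (j = 0: empty page)."""
    n = len(paths)
    M = [[0] * (n + 1) for _ in range(n + 1)]
    for k in range(1, n + 1):
        M[0][k] = sum(size[v] for v in paths[k - 1])
        for j in range(1, k):             # only j < k is needed: M is triangular
            shared = set(paths[j - 1])
            M[j][k] = sum(size[v] for v in paths[k - 1] if v not in shared)
    return M


# Hypothetical tree: root 1 with children 2 and 3; node 2 has children 4 and 5.
size = {1: 1, 2: 1, 3: 2, 4: 1, 5: 2}
paths = [[1, 2, 4], [1, 2, 5], [1, 3]]    # tiles 1..3, leaves taken left to right

M = cost_matrix(paths, size)
for row in M:
    print(row)
```

In a left-to-right order, the part of the new tile shared with any earlier tile of the page is also shared with the most recent one, which is why remembering the last tile is enough.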

The algorithm
Algorithm 1: The dynamic programming algorithm.
Input: The set of tiles $\mathcal{T}$ and the set of symbols $\mathcal{A}$.
Output: The optimal value.
Complexity. Recall that $P = \sum_{i=1}^{m} p_i$. Thanks to the dominance rule, there is only one possible value for $b$, while there are $P + 1$ possible different values for $a$ and at most $n$ different values for $i$ and $j$. From this, we know that we generate at most $O(n^2 \cdot P)$ different states during one iteration. As there are $n$ iterations in total, this gives a time complexity of $O(n^3 \cdot P)$ in a worst-case analysis.
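The dynamic programme can be sketched in Python as follows. This is an illustrative reading of the description above, not the authors' pseudocode: the function name `two_page_dp`, the dictionary encoding of the dominance rule and the small cost matrix are assumptions. Row 0 of the hypothetical matrix `M` stands for an empty page, and `M[j][k]` is the number of symbols added when tile `k` follows tile `j`.

```python
def two_page_dp(M, n):
    """Smallest capacity allowing n tiles to be placed on two pages."""
    # Dominance rule: for each (a, i, j) keep only the smallest b.
    states = {(0, 0, 0): 0}
    for k in range(1, n + 1):
        nxt = {}
        for (a, i, j), b in states.items():
            candidates = (
                ((a + M[i][k], k, j), b),             # tile k goes to page 1
                ((a, i, k), b + M[j][k]),             # tile k goes to page 2
            )
            for key, new_b in candidates:
                if key not in nxt or new_b < nxt[key]:
                    nxt[key] = new_b
        states = nxt                                  # the previous set is discarded
    return min(max(a, b) for (a, _, _), b in states.items())


# Hypothetical cost matrix for three tiles (row 0: empty page).
M = [
    [0, 3, 4, 3],
    [0, 0, 2, 2],
    [0, 0, 0, 2],
    [0, 0, 0, 0],
]
print(two_page_dp(M, 3))
```

On this toy input, putting the first two tiles on one page and the third on the other gives loads 5 and 3, and no assignment does better than a capacity of 5.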

Principle
Observation. We already used this property earlier: both $a$ and $b$ have $P$ as maximum value, i.e., $a \le P$ and $b \le P$. Furthermore, the best (but maybe infeasible) relaxed solution we can obtain consists in splitting the set of symbols $\mathcal{A}$ into two subsets of exactly the same size and scheduling the first subset on page 1 and the other one on page 2. This implies that $\mathrm{OPT} \ge P/2$. The principle of the FPTAS we are going to present is to decrease the time complexity of the dynamic programming algorithm by reducing the number of states we keep at each iteration, while guaranteeing a certain quality of the solution found at the end.
Steps. For one iteration $k$, we can represent all the states generated in $\mathcal{X}_k$ in a 2-dimensional coordinate system. The $x$-axis (resp. $y$-axis) represents the different values that $a$ (resp. $b$) can take.
First, we divide both axes (each going from 0 to $P$) into subintervals of length $\delta$. This divides the space into squares. Then, we apply a modified algorithm based on the previous DP. For each iteration, the states of $\mathcal{X}_k$ are ranked according to their value $a$. For each interval, we keep the state with the smallest $a$-value as a representative for all states in the same interval. If two representatives are possible (if two states have the same $a$-value), we keep the one with the smallest $b$-value. Such a representative is denoted by $[a^{\#}, b^{\#}, i, j]$ and stored in the set $\mathcal{X}_k^{\#}$. This simplification technique is inspired by [8], in which the state space is divided into rectangular subspaces (based on upper and lower bounds) and a specific representative state for each subspace is kept at each iteration of the FPTAS. Another successful example can be found in [9]. Other geometrical forms, not rectangular, can be used for these subspaces (see for instance [10]) but they are generally less efficient, since they usually do not guarantee a strongly polynomial time complexity.
For the needs of the performance guarantee of the algorithm, we set $\delta = \varepsilon P / (2n)$. In Figures 4-8, we can see an illustration of the two steps described above for an arbitrary iteration $k$.
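The trimming step can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' Algorithm 2: the names `trim` and `two_page_fptas` are hypothetical, the interval length is assumed to be `eps * total / (2 * n)` (with `total` the total number of symbols and `n` the number of tiles), one representative is kept per square and per pair of last tiles, and the extra dominance between boxes used in the complexity analysis is not applied. Row 0 of the hypothetical cost matrix `M` stands for an empty page.

```python
import math


def trim(states, delta):
    """Keep one state per (square, i, j): smallest a, ties broken by smallest b."""
    kept = {}
    for a, b, i, j in states:
        key = (math.floor(a / delta), math.floor(b / delta), i, j)
        if key not in kept or (a, b) < kept[key][:2]:
            kept[key] = (a, b, i, j)
    return list(kept.values())


def two_page_fptas(M, n, total, eps):
    """Approximate two-page capacity; states are trimmed after every iteration."""
    delta = eps * total / (2 * n)         # assumed interval length
    states = [(0, 0, 0, 0)]
    for k in range(1, n + 1):
        generated = []
        for a, b, i, j in states:
            generated.append((a + M[i][k], b, k, j))   # tile k on page 1
            generated.append((a, b + M[j][k], i, k))   # tile k on page 2
        states = trim(generated, delta)
    return min(max(a, b) for a, b, _, _ in states)


# Hypothetical cost matrix for three tiles; 7 symbols in total, optimum 5.
M = [
    [0, 3, 4, 3],
    [0, 0, 2, 2],
    [0, 0, 0, 2],
    [0, 0, 0, 0],
]
for eps in (0.5, 0.9):
    print(eps, two_page_fptas(M, 3, total=7, eps=eps))
```

Every kept state corresponds to a real assignment of tiles, so the returned value is never below the optimum; under the assumptions above it should also stay within a factor $1 + \varepsilon$ of it.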

The algorithm
These modifications lead us to the modified algorithm shown in Algorithm 2. The next subsection of this paper is dedicated to the proof that this modified dynamic programming algorithm is an FPTAS.

The proof
Before proving that the modified algorithm is an FPTAS for the problem, we need to show a lemma that will be useful in the second part of the proof.
Algorithm 2: FPTAS pseudo-code: reducing the number of states kept at each iteration.
Input: The set of tiles $\mathcal{T}$ and the set of symbols $\mathcal{A}$.
Output: The optimal value.
Proof by induction. Base case. The lemma is verified for $k = 0$. Inductive step. We assume that the property is verified until step $k - 1$ (with $k \ge 1$). We want to prove that the lemma holds for step $k$.
Let us consider an arbitrary state $[a, b, i, j] \in \mathcal{X}_k$ produced in step $k$. This state can have been created in two different ways: either the $k$-th tile was scheduled on page 1, in which case $a$ is rewritten as in Eq. (3), or it was scheduled on page 2, in which case $b$ is rewritten as in Eq. (4).

Conclusion.
Since both the base case and the inductive step have been performed, by induction the lemma holds for all natural numbers.
Theorem 1. The problem admits an FPTAS.
Proof. In order to prove that the modified algorithm is an FPTAS, we need to show two results:
• First, we must prove that the algorithm respects the performance quality required to belong to this family of algorithms;
• Then, we have to prove that the time complexity is bounded by a polynomial in the input size and $1/\varepsilon$.
Performance quality. Consequence of Lemma 1: let $[a^{*}, b^{*}, i, j] \in \mathcal{X}_n$ be a state associated with an optimal solution.
The lemma claims that there exists a state $[a^{\#}, b^{\#}, i, j]$ in $\mathcal{X}_n^{\#}$ satisfying the corresponding bounds. The equations (20) and (21) then imply that $\max\{a^{\#}, b^{\#}\} \le \mathrm{OPT} + \varepsilon \cdot \mathrm{OPT}$ (25). The solution found by the modified algorithm therefore respects the quality condition.
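The chain of inequalities leading to bound (25) can be sketched as follows. This is a reconstruction under stated assumptions rather than the authors' exact derivation: it assumes that Lemma 1 bounds the drift of each coordinate by $k\delta$ after $k$ iterations, that the interval length is $\delta = \varepsilon P/(2n)$ with $P$ the total number of symbols and $n$ the number of tiles, and it uses the lower bound $\mathrm{OPT} \ge P/2$.

```latex
\begin{align*}
a^{\#} &\le a^{*} + n\delta, \qquad b^{\#} \le b^{*} + n\delta
  && \text{(assumed form of Lemma 1 at step } k = n\text{)} \\
\max\{a^{\#}, b^{\#}\} &\le \max\{a^{*}, b^{*}\} + n\delta
  = \mathrm{OPT} + n \cdot \frac{\varepsilon P}{2n} \\
&= \mathrm{OPT} + \varepsilon \cdot \frac{P}{2}
  \le \mathrm{OPT} + \varepsilon \cdot \mathrm{OPT}
  && \text{(since } \mathrm{OPT} \ge P/2\text{)}
\end{align*}
```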

Time complexity.
As we keep only one representative per box and as each interval length is equal to $\delta$, there are in fact $P/\delta$ different possible intervals for $a^{\#}$ and for $b^{\#}$, respectively. In every box, at most one state remains. Moreover, by using the dominance between boxes, the number of non-dominated states $[a^{\#}, b^{\#}, i, j]$ for a given couple $(i, j)$ can be reduced to $O(n/\varepsilon)$. Furthermore, there are $n$ tiles in total, which means that for one iteration, there are at most $n^2 \cdot O(n/\varepsilon)$ generated non-dominated states. As there are $n$ iterations, we generate at most $O(n^4/\varepsilon)$ non-dominated states. The time complexity of the modified algorithm can therefore be reduced to $O(n^4/\varepsilon)$.
Conclusion. Both conditions are respected, which implies that the modified algorithm is an FPTAS. Moreover, the time complexity is strongly polynomial.

Conclusion.
The problem admits an FPTAS with a strongly polynomial time complexity.
Both algorithms we designed to address $2 \mid \text{merging}, \text{tree} \mid C_{\max}$ having been presented, let us move on to the experimental study we conducted.

Experiments
Before going further, let us first recall a central element of our algorithms: the triangular matrix M (also called the cost matrix). Recall that a cell in M represents the number of symbols we have to add to a page in order to be able to assign tile $k$ to it, knowing that the last tile assigned to this page was tile $j$. Since this matrix is fundamental for our algorithms, we decided to measure the impact that the average of the values in the matrix might have on the performance of the algorithms. However, we have to mention that one input of the problem is directly impacted by these values: the larger the mean value in the matrix, the larger the weights of the symbols and therefore the larger $\sum_{i=1}^{m} p_i$ will be. The only control we have chosen to keep over the $p_i$ is the range of values in which they are allowed to vary. We have chosen three ranges: i1: $p_i \in [1; 50]$, i2: $p_i \in [51; 100]$ and finally i3: $p_i \in [101; 200]$. The number of tiles in an instance and the $\varepsilon$ used in the FPTAS are logical parameters to study. We have chosen to use five values for $\varepsilon$: 0.1, 0.3, 0.5, 0.7 and 0.9. The instances created have a number of tiles between 6 and 48. The last parameter we varied is the height of the tree representing the hierarchy among the tiles. We have results for three heights ($h = 6$, $h = 7$ and $h = 8$). We tried greater heights, but the computation times became too long to test all the instances we wanted. In summary, for each height, we created three subgroups according to the range of values allowed for $p_i$ (i1, i2 or i3). In each subgroup, we generated three groups of one hundred instances with a certain mean in the M matrix.

Impact of the number of tiles on the hardness of an instance
As the theoretical time complexities of both algorithms depend on the number of tiles $n$ in an input, we suppose there exists a close link between $n$ and the difficulty of an instance. Figure 9 displays three tables presenting simple statistics on executions of the dynamic programming algorithm. Each table works in the same way: it groups all inputs having the same cost mean value. Then, the instances are separated according to the number of tiles $n$ they contain. Finally, we ran the DP on each group defined by the cost mean value and $n$, and we counted the number of successful runs. For example, we see there is a 100% rate of success for inputs having 13 tiles and an average cost value of 22. Conversely, the success rate drops to 0% for inputs having more than 28 tiles, whatever the average cost value.
We deduce that if an instance has 26 tiles or fewer, then we are able to find a solution for it using our dynamic programming algorithm. However, if the input has 28 tiles or more, the combinatorial explosion becomes too severe (the memory of the computer used to perform the tests was saturated).
There remains the case of instances having 27 tiles, for which the data presented in Figure 9 do not allow us to draw a reliable conclusion. However, in another group (with larger processing times: $p_i \in [51, 100]$) for which a small number of dynamic programming runs were successful, the results seem to indicate that 27-tile instances are still solvable by the DP algorithm (see Tab. 1).
We do not present the totality of the results for $h = 7$, $p_i \in [51, 100]$ and a cost mean equal to 406 because we have only tried dynamic programming on a part of the instances (the first 70). As the other instances were not processed, we should not try to interpret the results as a whole.
In order to validate our results, we tried to show a positive correlation between the execution time and the number of tiles in an instance.For this, we used the Principal Component Analysis (PCA) method.We used an implementation of PCA provided by the R library FactoMineR and ran this algorithm on several result files.We studied [7,12,13] to help us draw our conclusions.
The data files selected for the PCA are the results of running the DP algorithm and the FPTAS (for three values of $\varepsilon$: 0.1, 0.5 and 0.9) on a group of 100 instances with characteristics $h = 7$, $p_i \in [1; 50]$, cost mean = 109. The number of tiles in this set of instances is between 13 and 38.
The first step when applying PCA is to check that no factorial axis is impacted by only a small number of individuals (and therefore that the cloud of individuals along the different factorial axes is roughly regular). Indeed, if an axis were predominantly impacted by a small number of individuals, we would have to start by interpreting the results in terms of individuals rather than in terms of variables. This is not an issue here: most of the inputs are close to the origin of the axes but not along a single axis (an example of a cloud of individuals is given in Fig. 10).
Each number that appears in the figures generated by the PCA represents an instance: we could not use the names of the instances directly as they were too long. The correspondence is available, but it is sufficient to know that the larger a number is, the larger the number of tiles in the instance. We do not show all the clouds of individuals we had to generate and check, as that would have made too many different graphs.
Let us now study the graph of variables obtained when we ran the PCA over the data of the runs of the DP algorithm. An example is available in Figure 11. Some graphs of variables of the FPTAS are available in the appendix (see Figs. A.1-A.3).
It can be seen that, systematically, the arrows representing the time and the number of tiles are separated by acute angles. Moreover, the lengths of both arrows are very similar.
We can conclude that the time and the number of tiles are positively and closely correlated.

Impact of the average value in the cost matrix
To see if the average in the cost matrix has a direct impact on the execution times of the algorithms, we grouped in a single file all the results for the instances containing twenty tiles.Then, for each value of the average in the matrix M, we computed the average execution time.The results for ℎ = 7 are presented in Table 2.
Figure 12 presents several histograms summarising the data of Table 2 in a more visual way.
This hypothesis is confirmed by the PCA results presented in Figure 13. In order to obtain them, we ran the PCA on all the execution times of the FPTAS, whatever the $\varepsilon$ value. We notice that the arrows representing the time and the average in the cost matrix are almost orthogonal. It means that the two variables seem to be only very slightly correlated. But as their lengths are similar and nearly equal to the radius of the circle, they both convey a lot of information with regard to the dimensions we chose for the graph of variables.

Impact of 𝜖
Regarding the value of $\varepsilon$, it seems that the smaller the $\varepsilon$, the longer the execution times. Indeed, the arrows representing these values are almost on the same straight line, of nearly the same length but of opposite directions. This means that the execution time and the value of $\varepsilon$ are negatively correlated: the bigger the $\varepsilon$, the smaller the execution times.
This observation is consistent with the $1/\varepsilon$ factor in the time complexity of the FPTAS: the execution time decreases as $\varepsilon$ increases.

Conclusion and perspectives
In this paper, we presented the work we conducted on a first scenario of the pagination problem. We established that this scenario is NP-hard in the ordinary sense, since a pseudo-polynomial dynamic programming algorithm is proposed to solve the problem optimally. In addition, we proposed an FPTAS with a strongly polynomial time complexity. This scheme is based on the simplification of the dynamic programming algorithm by removing a part of the generated states at every iteration of the algorithm, without deteriorating the solution quality too much. The exact algorithm is compared to the FPTAS in terms of effectiveness and speed. The numerical results are analysed with a PCA approach, which allows us to better understand the correlation between some parameters.
As perspectives, the study of other scenarios seems very interesting (the case of 2 symbols per tile is especially challenging). Moreover, we will try to find other reliable ways to devise different FPTASs, as is done in other references [2,3,9]. Finally, the study of differential approximation seems of great interest since the general pagination problem is difficult to approximate [1].

Figure 1. An example of an input for the Pagination Problem, extracted from [4].

Figure 2. Two examples of models, the tree model (left) and the cluster-tree model (right), extracted from [14].

Figure 3. Example of an input for $2 \mid \text{merging}, \text{tree} \mid C_{\max}$ where one node contains two symbols.

9 if several quadruplets have the same values for $a$, $i$, $j$ then
    keep only the one having the smallest $b$
  delete set $\mathcal{X}_{k-1}$
  return $\min_{[a, b, i, j] \in \mathcal{X}_n} \{\max\{a, b\}\}$

Figure 4. All the states in $\mathcal{X}_k$ displayed in a chequered 2-dimensional space.

Figure 5. We select the state with the smallest $a$ in the first "square".

Figure 6. When two states have the same $a$, we take the one with the smallest $b$.

Figure 7. We keep only one state for each delimited zone.

Figure 8. $\mathcal{X}_k^{\#}$ is composed of the kept state of each interval.

Figure 9. Simple statistics on the number of successful executions of the DP algorithm.

Table 1. Simple statistics on the percentage of successful DP executions as a function of the number of tiles in the instances. Columns: tile count; instance count; number of successful DP runs; % of successful DP runs.

Figure 10. Example of a cloud of individuals (cloud obtained by PCA on the DP results).

Figure 11. Graph of variables when PCA was fed data from DP executions.

Figure 12. Average execution times of the FPTAS on instances having 20 tiles.

Figure 13. Graph of variables when we ran the PCA over execution times of the FPTAS.

Figure A.3. Graph of variables when PCA was fed data from FPTAS executions where $\varepsilon = 0.9$.

Table 2. Execution times of the FPTAS as a function of the average value in the cost matrix.