RESTRICTED HAMMING–HUFFMAN TREES

We study a special case of Hamming–Huffman trees, in which both data compression and data error detection are tackled by the same structure. Given a hypercube Q_n of dimension n, we are interested in some aspects of its vertex neighborhoods. For a subset L of vertices of Q_n, the neighborhood of L is defined as the union of the neighborhoods of the vertices of L. The minimum neighborhood problem is that of determining the minimum neighborhood cardinality over all such sets L. This is a well-known problem that has already been solved. Our interest lies in determining optimal Hamming–Huffman trees, a problem that remains open and which is related to minimum neighborhoods in Q_n. In this work, we consider a restricted version of Hamming–Huffman trees, called [k]-HHTs, which admit symbol leaves in at most k different levels. We present an algorithm to build optimal [2]-HHTs. For uniform frequencies, we prove that an optimal HHT is always a [5]-HHT and that there exists an optimal HHT which is a [4]-HHT. Also, considering experimental results, we conjecture that there exists an optimal tree which is a [3]-HHT.


Introduction
In information theory, there is a common trade-off that arises in data transmission processes, in which two goals are usually tackled independently: data compression and preparation for error detection. Paradoxically, these two goals have conflicting natures: while data compression shrinks the message as much as possible, data preparation for error detection adds redundancy to messages so that a receiver can detect corrupted bits, and fix them when possible. Data compression can be achieved using different strategies, often depending on the type of data being compressed. One of the most traditional methods is that of Huffman [3], which uses ordered trees, known as Huffman trees, to encode the symbols of a given message.
A Huffman tree assigns each symbol found in the message to be compressed to a new binary string, such that the total amount of data associated with the message, under this new encoding scheme, is as small as possible. Huffman trees achieve this goal by observing the frequency of each symbol in the original message and assigning shorter codifications to higher-frequency symbols. A relevant aspect of Huffman trees is that this type of tree provably yields optimal codes.
In 1980, Hamming proposed the union of both compression and error detection features through a data structure called the Hamming-Huffman tree [2]. This data structure compresses data similarly to Huffman trees, with the additional feature of enabling the detection of any 1-bit transmission error.
In contrast to Huffman trees, building optimal Hamming-Huffman trees is still an open problem. An approximation algorithm running in O(n log³ n) time was presented in [9,10], with a low additive error with respect to the entropy. Hamming-Huffman trees inspired other codes, such as those in [11,12], in which the authors proposed a code called even codes. Even codes are obtained by a special type of Huffman tree in which symbols are allowed to be encoded only with an even number of ones. In this type of tree, nodes associated with an odd number of ones are either internal nodes or error leaves. The authors of that code provided an algorithm to build such a code in O(n³ log n) time. In [14], the algorithm was improved to a time complexity of O(n³). They also presented two approximation algorithms for even codes with time complexity O(n log n), the second achieving a code having cost 16.7% higher than that of Huffman trees. Related and constrained problems are studied in [15].
Due to its importance and relevance for practical applications, we aim to study exact algorithms to build optimal Hamming-Huffman trees. Since the problem is still wide open, our approach is to constrain the number of levels of the tree at which its leaves can appear. This approach was employed in [1] to study Hamming-Huffman trees in which the leaves lie at a single level.
In this paper, we define a more restricted version of the problem of building optimal Hamming-Huffman trees. We tackle the problem of building optimal Hamming-Huffman trees in which the leaves lie in exactly k distinct levels. If k ≤ 2, we provide a polynomial time algorithm to solve the problem. Otherwise, we provide an algorithm to evaluate a lower bound on the optimal cost of such trees when the symbols have a uniform probability of occurrence. In this case, we also prove that there always exists an optimal Hamming-Huffman tree whose symbol leaves lie on at most 4 consecutive levels.
The paper is organized as follows. In Section 2, basic definitions and notations are presented. In Section 3, the problem of building Hamming-Huffman trees in which all symbol leaves lie on the same level is tackled. In Section 4, the problem of building Hamming-Huffman trees in which the leaves are distributed in two distinct levels is discussed. In Section 5, we prove that, for symbols with a uniform probability of occurrence, there is always an optimal Hamming-Huffman tree such that its symbol leaves lie on four consecutive levels. Also, we present an algorithm to evaluate a lower bound on the cost of such trees. In Section 6, we present some experimental results. Concluding remarks are presented in the last section.

Preliminaries
Let G be a simple graph and v ∈ V(G). The (open) neighborhood of v, denoted by N_G(v), is defined as {u ∈ V(G) : (u, v) ∈ E(G)}. For a subset L ⊆ V(G), N_G(L) is the union of N_G(v) over all v ∈ L. When G is clear from the context, it may be omitted from the notation.
An n-cube, or hypercube of dimension n, is the graph Q_n having V(Q_n) as the set of all binary strings of size n (and, therefore, |V(Q_n)| = 2^n). Moreover, (u, v) ∈ E(Q_n) if the binary strings u and v differ in exactly one position. For u, v ∈ V(Q_n), the Hamming distance between u and v, denoted by H(u, v), is the number of positions in which the binary strings u and v differ.
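As a quick illustration (a minimal sketch; the integer encoding of vertices and the function names are ours, not from the paper), the Hamming distance and the neighborhood of a vertex set in Q_n can be computed with bit operations:

```python
def hamming(u, v):
    # Hamming distance: number of bit positions where u and v differ.
    return bin(u ^ v).count('1')

def neighborhood(L, n):
    # Open neighborhood of a vertex set L in Q_n: every vertex obtained
    # by flipping exactly one of the n bits of some vertex in L.
    return {v ^ (1 << i) for v in L for i in range(n)}

# Vertices of Q_3 are the integers 0..7, read as 3-bit strings.
assert hamming(0b000, 0b011) == 2
assert neighborhood({0b000}, 3) == {0b001, 0b010, 0b100}
```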
The parity of a binary string b is the parity of the number of 1's in b.
A subset L ⊆ V(G) is called an independent set of G if, for all u, v ∈ L, (u, v) ∉ E(G). We define the minimum neighborhood over independent sets of size ℓ of Q_n as N(ℓ, n) = min{|N(L)| : L ⊆ V(Q_n), |L| = ℓ and L is an independent set of Q_n}.
A strict binary tree is a rooted tree such that each node has either two or zero children. The level of a node is the number of edges on the path from this node up to the root of the tree. A full binary tree is a strict binary tree in which all the leaves are at the same level. The height of a tree is the maximum level over all its nodes.
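For small parameters, N(ℓ, n) can be checked directly from the definition by brute force (an illustrative sketch of our own, not the efficient method discussed in Section 3):

```python
from itertools import combinations

def min_neighborhood(ell, n):
    # N(ell, n): minimum |N(L)| over all independent sets L of Q_n with |L| = ell.
    def nb(v):
        return {v ^ (1 << i) for i in range(n)}
    best = None
    for L in combinations(range(2 ** n), ell):
        # L is independent iff no two of its vertices differ in exactly one bit.
        if any(bin(u ^ v).count('1') == 1 for u, v in combinations(L, 2)):
            continue
        size = len(set().union(*(nb(v) for v in L)))
        if best is None or size < best:
            best = size
    return best

assert min_neighborhood(1, 3) == 3
assert min_neighborhood(4, 3) == 4  # e.g. the four even-parity vertices of Q_3
```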
In the context of data compression techniques, an important data structure is the Huffman tree. A Huffman tree (HT) T is a rooted strict binary tree in which each edge (u, v), v being a left (resp. right) child of u, is labeled by 0 (resp. 1), and the set of leaves of T is Γ, the set of all distinct symbols of which a message M to be sent consists. Given T, each symbol a of M is sequentially encoded into a binary string c(a). Such an encoding is given by the sequence of 0's and 1's found on the edges of the directed path from the root of T to the leaf corresponding to a. In Figure 1a, for instance, the leaves are encoded, reading them from left to right, as 00, 010, 011, 10, and 11. Over all possible trees, the HT for M is a tree T whose cost, cost(T) = Σ_{a∈Γ} p(a)|c(a)|, is minimum, where p(a) stands for the probability of occurrence of a in the message and |c(a)| is the length of the string c(a). We say that an HT is uniform if all of its symbols have a uniform probability of occurrence, that is, each symbol has a probability of occurrence of 1/|Γ|. Figure 1a depicts a uniform HT T with cost(T) = 2.4 on 5 symbols.
The concept of Hamming-Huffman trees generalizes that of Huffman trees. A Hamming-Huffman tree (HHT) T is a strict binary tree holding the same properties of an HT, except that the set of leaves is partitioned into symbol and error leaves, such that the following properties hold:
- every node u of T such that H(c(u), c(a)) = 1, for some symbol leaf a ∈ Γ, is a leaf of T called an error leaf;
- every node of T is either an error leaf or an ancestor of a symbol leaf.
HHTs can be applied to detect errors that occur in the transmission of messages. Under the assumption that at most one bit can accidentally be flipped when data is transmitted, HHTs detect such errors during the decoding process: if an error leaf is hit, the data has been corrupted during transmission. Optimal HHTs are defined exactly as (optimal) HTs. Figure 1b depicts an optimal uniform HHT T with cost(T) = 3.8 on 5 symbols. In the presented figures, error leaves are colored black.
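The detection mechanism can be illustrated with the same-parity codes that recur throughout the paper (a sketch; the even-parity codebook below is our own choice): a single bit flip changes the parity of a codeword, so the corrupted word can never be mistaken for another codeword.

```python
n = 4
CODE = {v for v in range(2 ** n) if bin(v).count('1') % 2 == 0}  # even-parity words

def single_flip_detected(codeword):
    # Every word at Hamming distance 1 from a codeword has odd parity, so it
    # falls outside CODE -- in HHT terms, decoding reaches an error leaf.
    return all((codeword ^ (1 << i)) not in CODE for i in range(n))

assert all(single_flip_detected(c) for c in CODE)
```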
In this work, we weaken the concept of HHTs to enlarge the set of trees that qualify as HHTs. This is convenient for our algorithms and proofs, and does not affect the error detection capabilities that motivate the application.
If a tree T is an HHT as previously defined, we say that T is a strict HHT. The weaker definition of HHTs is the following. Given a strict binary tree T in which the set of leaves is partitioned into symbol and error leaves, T is an HHT if φ(T) is a strict HHT, where φ is a tree transformation defined as follows: if T has no sibling error leaves, then no modification to the tree is done; otherwise, if the sibling node of an error leaf f is an error leaf, remove both f and its sibling from T and make the node u, parent of f in T, into an error leaf. Let T′ be the resulting tree. Apply φ(T′) recursively to obtain the final transformation. Figure 2a presents an HHT T in this weaker sense, and Figure 2b shows the strict HHT resulting from φ(T). Note that φ(T) can be computed in O(ℓ) time.
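The collapsing transformation (written φ here; the tuple encoding and names are our own) can be sketched over trees represented as nested pairs, with 'S' for symbol leaves and 'E' for error leaves:

```python
def phi(t):
    # Bottom-up collapse: a parent whose two children end up as error
    # leaves is itself turned into an error leaf.
    if not isinstance(t, tuple):
        return t  # 'S' or 'E'
    left, right = phi(t[0]), phi(t[1])
    if left == 'E' and right == 'E':
        return 'E'
    return (left, right)

# A Figure 2a-like tree: symbols encoded as 000 and 111, all other words error leaves.
T = ((('S', 'E'), ('E', 'E')), (('E', 'E'), ('E', 'S')))
assert phi(T) == ((('S', 'E'), 'E'), ('E', ('E', 'S')))
```

Because children are collapsed before their parent is inspected, a pair of error leaves created by the collapse itself is handled in the same pass.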
The number of error leaves in HHTs is directly related to the encodings associated with the symbol leaves.In Figure 2a, symbol leaves are encoded as 000 and 111, resulting in 6 error leaves, whereas in Figure 2c, symbol leaves are encoded as 000, 011, 101, and 110, resulting in 4 corresponding error leaves.In this second HHT, two more symbols are being encoded using the same full binary tree as the one used in the first HHT.
Although HTs can be built efficiently in a greedy fashion [3], the construction of optimal HHTs has been open since Hamming defined them in the eighties [2]. In this work, we approach the problem by defining a constrained version of it, namely that of determining an optimal HHT T in which the symbol leaves are placed in exactly k distinct levels. A tree for which this property holds will be called a k-Hamming-Huffman tree, or k-HHT. Formally, we define the k-HHT problem as follows.
Problem: k-HHT
Input: A set of symbols Γ and, for all a ∈ Γ, the probability p(a) of occurrence of a in a message.

Output:
An HHT T in which all symbol leaves lie at exactly k levels of T and such that cost(T) is minimum.
In the following sections, the k-HHT problem is discussed.

Hamming-Huffman trees with leaves in one level
In this section, we tackle the 1-HHT problem. This problem can be reduced to that of deciding the minimum height of a full binary HHT for which the symbol leaves can be arranged in the last level.
First, note that there is an important relation between optimal 1-HHTs on ℓ symbols and minimum neighborhoods of independent sets with ℓ elements in Q_h. Consider the one-to-one mapping between the leaves of a full binary 1-HHT T having height h and the vertices of Q_h, in which a leaf v corresponds to c(v) ∈ V(Q_h). The problem of finding the minimum number of error leaves, over all possible trees T, is equivalent to that of finding, over all independent sets L of cardinality ℓ = |Γ| in Q_h, one that minimizes |N(L)|. This is so because, for a given 1-HHT T, the set of error leaves of T is precisely N(L) in Q_h, where L = {c(a) : a ∈ Γ}.
The efficient computation of N(ℓ, n) is possible with the aid of Theorem 3.2. Before presenting the theorem, we state the following auxiliary lemma.
Lemma 3.1 ([4,8]). For any given non-negative integers ℓ and n, ℓ < 2^n, the number ℓ has a unique representation, defined as the n-bounded canonical representation of ℓ. Also, in [4], Katona defined the function N(ℓ, n) in terms of the coefficients of this representation, where the coefficients are those from Lemma 3.1 with respect to the given ℓ.
The following theorem was proven in [7].
A consequence of this theorem, observed in [6], is that all the vertices belonging to an independent set L yielding |N(L)| = N(ℓ, n) can be assumed, without loss of generality, to have the same parity. This fact can be used directly to solve the 1-HHT problem, as shown in the following theorem. Let h(ℓ) = ⌈log₂ ℓ⌉ + 1.
Theorem 3.3. Let Γ be a set of ℓ symbols, each a ∈ Γ having probability p(a). The height and the cost of an optimal 1-HHT T are, respectively, h(ℓ) and h(ℓ) Σ_{a∈Γ} p(a).
Proof. To find an optimal 1-HHT T, it is necessary to determine the minimum height h of T such that the symbols and their corresponding error leaves all lie at this level, that is, ℓ + N(ℓ, h) ≤ 2^h. By [6], one may assume without loss of generality that the set of ℓ codifications consists of elements having the same parity.
In order to choose ℓ vertices of the same parity, one may use at most half of the leaves of that level.
That is, a full binary 1-HHT must have height at least ⌈log₂ ℓ⌉ + 1 to be able to contain ℓ leaves of the same parity; thus, h ≥ ⌈log₂ ℓ⌉ + 1. On the other hand, let L be a set of codifications of the same parity. First note that L is an independent set of Q_h and the corresponding error leaves have the opposite parity. Therefore, any set of ℓ symbol leaves of the same parity at level ⌈log₂ ℓ⌉ + 1 yields a valid 1-HHT. Consequently, h ≤ ⌈log₂ ℓ⌉ + 1, so h = ⌈log₂ ℓ⌉ + 1 and cost(T) = (⌈log₂ ℓ⌉ + 1) Σ_{a∈Γ} p(a).
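Theorem 3.3 can be exercised directly (a small sketch; the function names are ours): take the first ℓ even-parity strings of length h(ℓ) as codewords and verify their pairwise Hamming distances.

```python
import math
from itertools import combinations

def h(ell):
    # Minimum height of a full binary 1-HHT for ell symbols (Theorem 3.3).
    return math.ceil(math.log2(ell)) + 1

def one_level_code(ell):
    # Codewords for a 1-HHT: ell same-parity strings of length h(ell).
    n = h(ell)
    even = [v for v in range(2 ** n) if bin(v).count('1') % 2 == 0]
    return n, even[:ell]

n, code = one_level_code(5)
assert n == 4  # h(5) = ceil(log2 5) + 1
# Same-parity codewords are pairwise at Hamming distance >= 2, so every
# word at distance 1 from a codeword is free to be an error leaf.
assert all(bin(u ^ v).count('1') >= 2 for u, v in combinations(code, 2))
```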

Hamming-Huffman trees with leaves in two levels
In this section, we discuss optimal 2-HHTs. We show that, similarly to 1-HHTs, it is possible to build optimal 2-HHTs efficiently. We provide an algorithm for building optimal 2-HHTs that runs in O(ℓ log² ℓ) time, where ℓ = |Γ|.
A motivation to study this specific case lies in the fact that, for symbols with uniform probabilities of occurrence, there is always a Huffman tree with symbols in at most two different levels. This follows from Section 2.3.4.5 of Knuth's Vol. 1 [5]. It is not known whether this is also the case for Hamming-Huffman trees. Experiments were designed to investigate this hypothesis and the results are presented in Section 6. Furthermore, recall that the problem of building optimal general HHTs has been open since the eighties. Thus, the approach of studying more restrictive cases is worthwhile, since a solution for a particular case may have practical value or lead to a solution for general HHTs.
First, consider any specific value for h₁. For such a value, let ℓ₁ be the number of symbol leaves placed at level h₁. Note that 1 ≤ ℓ₁ ≤ min{ℓ − 1, 2^{h₁-1}}, since 2^{h₁} is the maximum number of nodes at level h₁, and half of them have the same parity.
Once ℓ₁ symbol leaves are chosen to be placed at level h₁, there will be error nodes corresponding to such symbol leaves at this same level, and the remaining nodes will be free nodes from which the tree can grow to deeper levels (in particular, to level h₂, where the remaining symbol leaves must lie). As seen in Figure 2, distinct sets of leaves lead to distinct sets of error leaves, the latter varying considerably in size. Clearly, to minimize the cost of the solution for fixed values of (h₁, ℓ₁), it suffices to minimize the value of h₂. To do that, it suffices to distribute the remaining ℓ₂ = ℓ − ℓ₁ symbol leaves as uniformly as possible over the subtrees rooted at the free nodes. Indeed, it is possible to arrange the symbol leaves at level h₂ of each subtree so that they all have the same parity, ensuring that the leaves at level h₂ pairwise have Hamming distance at least 2. Therefore, the aim is to choose the set of symbol leaves at level h₁ in such a way that the number of free nodes is maximized or, equivalently, the number of error leaves is minimized. In other words, the algorithm must select a set of symbol leaves at level h₁ which produces N(ℓ₁, h₁) corresponding error nodes. For this choice, the maximum number of free nodes, which will be denoted by f(ℓ₁, h₁), is given by
f(ℓ₁, h₁) = 2^{h₁} − ℓ₁ − N(ℓ₁, h₁). (4.1)
Figure 3. A Hamming-Huffman tree with leaves on two levels.

Each free node roots a subtree that must accommodate at most ⌈ℓ₂ / f(ℓ₁, h₁)⌉ symbols, and one of them receives exactly such an amount. Minimizing the common height of each subtree rooted at a free node is a 1-HHT problem. By using the result h(ℓ) of Theorem 3.3, we have that the minimum height h′(ℓ₁, h₁) required for each subtree to accommodate those symbols is given by
h′(ℓ₁, h₁) = h(⌈ℓ₂ / f(ℓ₁, h₁)⌉).
To determine cost(ℓ, ℓ₁, h₁), it is necessary to assign the set of symbols Γ to the set of chosen symbol leaves. To minimize such a cost, it clearly suffices to place at level h₁ the ℓ₁ symbols with the highest probabilities of occurrence. Assuming that Γ = {a₁, a₂, . . ., a_ℓ} is ordered decreasingly according to the probabilities of occurrence, we have that
cost(ℓ, ℓ₁, h₁) = h₁ Σ_{i=1}^{ℓ} p(a_i) + h′(ℓ₁, h₁) Σ_{i=ℓ₁+1}^{ℓ} p(a_i).
Figure 3 depicts this strategy. The nodes labeled with "s" represent symbol leaves, the black nodes represent the error leaves, and the dashed nodes represent the free nodes.
The optimal cost is the minimum cost obtained by varying h₁ and ℓ₁ over all possible values. Formally, the cost of an optimal 2-HHT T is given by
cost(T) = min{cost(ℓ, ℓ₁, h₁) : 1 ≤ h₁ < h(ℓ), 1 ≤ ℓ₁ ≤ min{ℓ − 1, 2^{h₁-1}}}.
Concerning the computational complexity of determining the optimal cost, for each 1 ≤ h₁ < h(ℓ), there are at most 2^{h₁-1} possible values for ℓ₁. Therefore, there are at most Σ_{h₁=1}^{h(ℓ)-1} 2^{h₁-1} < 2^{h(ℓ)} = O(ℓ) pairs (ℓ₁, h₁) to consider. Moreover, the computation of each cost(ℓ, ℓ₁, h₁) requires the evaluation of N(ℓ₁, h₁), which can be done in O(h₁²) = O(log² ℓ) time with the aid of a precomputed Pascal triangle. Besides that, the prefix sums Σ_{i=1}^{j} p(a_i), for all 1 ≤ j ≤ ℓ, can be precomputed in O(ℓ) time and used to obtain the summations in cost(ℓ, ℓ₁, h₁) in constant time. Therefore, the complexity of evaluating the cost of an optimal tree is O(ℓ log² ℓ).
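For the uniform case p(a) = 1/ℓ, the whole optimisation of this section fits in a short sketch (our own rendering; it recomputes N(ℓ₁, h₁) by brute force instead of the efficient closed formula, so it is only practical for small ℓ):

```python
import math
from itertools import combinations

def h(ell):
    # Minimum height of a full binary 1-HHT for ell symbols (Theorem 3.3).
    return math.ceil(math.log2(ell)) + 1

def N(ell, n):
    # Brute-force N(ell, n): minimum neighborhood size over independent
    # sets of size ell in Q_n.
    nb = lambda v: {v ^ (1 << i) for i in range(n)}
    best = math.inf
    for L in combinations(range(2 ** n), ell):
        if any(bin(u ^ v).count('1') == 1 for u, v in combinations(L, 2)):
            continue
        best = min(best, len(set().union(*(nb(v) for v in L))))
    return best

def uniform_2hht_cost(ell):
    # Enumerate (h1, l1): place l1 symbols at level h1, count free nodes
    # f = 2^h1 - l1 - N(l1, h1) as in (4.1), and put the remaining l2
    # symbols in 1-HHT subtrees of height h(ceil(l2 / f)).
    best = math.inf
    for h1 in range(1, h(ell)):
        for l1 in range(1, min(ell - 1, 2 ** (h1 - 1)) + 1):
            f = 2 ** h1 - l1 - N(l1, h1)
            if f <= 0:
                continue  # no free node left to hang the remaining symbols
            l2 = ell - l1
            h2 = h1 + h(math.ceil(l2 / f))
            best = min(best, (h1 * l1 + h2 * l2) / ell)
    return best

# Matches the cost of the optimal uniform HHT on 5 symbols (Figure 1b).
assert abs(uniform_2hht_cost(5) - 3.8) < 1e-9
```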
The results of this section can be summarized by the following theorem.
Theorem 4.1. Let Γ be a set of ℓ symbols, each a ∈ Γ having probability p(a). The cost of an optimal 2-HHT T is
cost(T) = min{cost(ℓ, ℓ₁, h₁) : 1 ≤ h₁ < h(ℓ), 1 ≤ ℓ₁ ≤ min{ℓ − 1, 2^{h₁-1}}},
where cost(ℓ, ℓ₁, h₁) = h₁ Σ_{i=1}^{ℓ} p(a_i) + h′(ℓ₁, h₁) Σ_{i=ℓ₁+1}^{ℓ} p(a_i).

Uniform Hamming-Huffman trees
In this section, we discuss the problem of building optimal uniform HHTs. In contrast to 2-HHTs, even for the more restrictive case of uniform probabilities, an efficient algorithm for building an optimal uniform HHT remains open. Let δ(T) be the difference between the last and the first levels of T having at least one symbol leaf. For instance, for the tree T of Figure 1a, δ(T) = 1, while δ(T) = 0 for any tree T of Figure 2. We prove that δ(T) ≤ 4 for every optimal uniform HHT T. Moreover, we show that there is always an optimal uniform HHT T in which δ(T) ≤ 3. In other words, all optimal uniform HHTs are [5]-HHTs, and there exists an optimal uniform HHT which is a [4]-HHT. Finally, we present a dynamic programming algorithm to evaluate a lower bound on the cost of such a tree.
Recall that (optimal) HTs for symbols with uniform frequencies have all leaves in at most two levels. It is unknown whether the same holds for HHTs. We consider the conjecture that there is always an optimal uniform HHT in which the symbol leaves are distributed in k ≤ 3 distinct levels, and we provide empirical evidence in its favor. Section 6 compares the lower bound of this section with the cost of optimal 2-HHTs, the latter computed as presented in Section 4.
Let T be a uniform HHT on ℓ symbols. Consider the following operation over a symbol leaf s of T:
- descend(s) (see Fig. 4): replace the leaf s by a full binary HHT T_s having height two, in such a way that the position of this leaf becomes the root of T_s. The tree T_s is such that one of its leaves is the symbol leaf s. Note that there are exactly two error leaves associated with s among the leaves of T_s, besides one free node s′, regardless of which leaf corresponds to s. Next, transform s′ into a symbol leaf associated with a symbol s″ that appears in the last level of T. Finally, transform the original node of s″ into an error leaf. Let T′ be the resulting tree. Apply φ(T′) to obtain the final transformation.
Let T′ be the tree resulting from applying descend(s). The following lemma proves that T′ is also an HHT.
Lemma 5.1. Let T be an HHT, s be one of its symbol leaves, and T′ be the tree obtained by the operation descend(s). The tree T′ is an HHT.
Proof. We shall prove that, in T′, all the leaves at Hamming distance one from s are error leaves. As the root of T_s comes from a symbol leaf in T, all nodes at Hamming distance one from it in T′ are error leaves. Moreover, as T_s is an HHT by construction, all the nodes in T′ at Hamming distance one from s and s″ are also error leaves in T′. Finally, the transformation carried out in the last step ensures that T′ does not contain two sibling leaves which are both error leaves. That is, every node of T′ is either an error leaf or an ancestor of a symbol leaf. Therefore, T′ is an HHT.
Let p = 1/ℓ be the probability of occurrence of the symbols of T and T′ be the tree obtained by the operation descend(s), for some symbol leaf s of T. Let d₁ be the level of s and d₂ be the last level of T. Note that the symbol s″ was moved from level d₂ to level d₁ + 2. Moreover, the symbol s was moved from level d₁ to level d₁ + 2. Therefore, the cost of T′ can be written as a function of the cost of T as
cost(T′) = cost(T) − p((d₂ − d₁) − 4). (5.1)
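Writing p = 1/ℓ for the uniform symbol probability, d₁ for the level of s, and d₂ for the last level of T, the cost change caused by descend(s) can be checked term by term:

```latex
\begin{aligned}
cost(T') - cost(T)
  &= \underbrace{p\,\bigl((d_1 + 2) - d_1\bigr)}_{s\text{ descends two levels}}
   + \underbrace{p\,\bigl((d_1 + 2) - d_2\bigr)}_{s''\text{ rises from the last level}} \\
  &= 2p - p\,(d_2 - d_1 - 2)
   = -\,p\bigl((d_2 - d_1) - 4\bigr).
\end{aligned}
```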
Theorem 5.2. If T is an optimal uniform HHT, then δ(T) ≤ 4.
Proof. We prove that, when δ(T) > 4, it is always possible to obtain an HHT T′ from T such that cost(T′) < cost(T), contradicting the optimality of T.
Let d₁ and d₂ be the first and the last levels of T containing symbol leaves, respectively. Let p be the probability of occurrence of each symbol of T. Apply descend(s) to some symbol leaf s at level d₁ to obtain T′. If δ(T) > 4, then d₂ − d₁ > 4 and, equivalently, p((d₂ − d₁) − 4) > 0. Therefore, by (5.1), we have that cost(T′) < cost(T). Moreover, note that each application of descend(s) removes a symbol leaf from level d₁ and a symbol leaf from level d₂. By successive applications of descend(s) to symbol leaves, it is possible to obtain an optimal uniform HHT T such that δ(T) ≤ 3.
We proceed in the remainder of this section by providing a lower bound on the cost of uniform HHTs. For this, we need to generalize the concept of Hamming-Huffman trees. A k-Hamming-Huffman forest, or k-HHF, is a forest F of HHTs such that the symbol leaves are distributed among exactly k distinct levels of F. Note that the trees of F may have height greater than k, since there might be some levels of F with no symbol leaves. The cost of a k-HHF is defined as the sum of the costs of its HHTs.
The strategy to derive the lower bound on the cost of a k-HHF with t trees for ℓ symbols is as follows. Consider h₁ to be the first level at which symbol leaves appear in F. Let ℓ₁ be the number of symbol leaves to be represented at level h₁. Clearly, the most desirable arrangement for choosing ℓ₁ nodes at level h₁ is one in which the corresponding error leaves are minimized, that is, in which the free nodes are maximized. This is so because the remaining ℓ − ℓ₁ symbol leaves must be allocated as descendants of the resulting free nodes at level h₁, and the more free nodes, the better. At this point, the strategy treats all those free nodes as independent trees of a (k−1)-HHF. However, some of them may actually belong to the same HHT of the k-HHF and, because of that, the symbol leaves allocated in a tree descending from one free node produce error leaves that may conflict with the allocation of symbol leaves descending from another free node. Since this possibility of conflict is not dealt with, the resulting k-HHF may not be feasible, which is why this strategy yields a lower bound on the cost of the k-HHF. The lower bound, denoted lb(k, t, ℓ), on the cost of an optimal k-HHF of t disjoint HHTs for ℓ symbols is evaluated as follows.
First note that if ℓ = 0, then lb(k, t, ℓ) = 0. Also, if k = 0 (resp. t = 0) and ℓ ≥ 1, there are not enough levels (resp. free nodes) to accommodate the remaining ℓ symbols. In other words, this scenario leads to an unfeasible solution and, therefore, lb(k, t, ℓ) = +∞. If k = 1 and ℓ, t ≥ 1, the resulting problem is equivalent to that of distributing ℓ symbols among t 1-HHTs with the same height. In particular, each one of these trees must have height at least h(⌈ℓ/t⌉) to be able to accommodate all the symbols. Thus, using the same reasoning as in Theorem 3.3 and recalling that the symbols have an equal probability of occurrence, lb(1, t, ℓ) = h(⌈ℓ/t⌉). For the general case, the algorithm minimizes the cost over all possible pairs (ℓ₁, h₁). Note that, despite the fact that there are only ℓ₁ symbols at level h₁, all the remaining ℓ − ℓ₁ symbols have a prefix of size h₁ in their codifications. Given that, the first part of the cost of the general case is h₁. For the remaining part of the cost, the algorithm uses precomputed values to solve the problem of distributing ℓ − ℓ₁ symbols among a (k−1)-HHF whose roots are the resulting free nodes, in a dynamic programming fashion. Formally, lb(k, t, ℓ) can be expressed as
lb(k, t, ℓ) = min{h₁ + ((ℓ − ℓ₁)/ℓ) · lb(k − 1, fn(t, ℓ₁, h₁), ℓ − ℓ₁) : h₁ ≥ 1, 1 ≤ ℓ₁ ≤ min{ℓ, t · 2^{h₁-1}}},
where the factor (ℓ − ℓ₁)/ℓ rescales the uniform probabilities of the subproblem, and fn(t, ℓ, h) denotes the maximum number of free nodes when ℓ symbol leaves are allocated at level h of an HHF consisting of t HHTs, h being the first level having leaves. The computation of fn will be discussed next.
For 2-HHTs, lb(2, 1, ℓ) is exactly the cost of a uniform 2-HHT obtained by the algorithm presented in Section 4. Moreover, considering general uniform HHTs, the cost of an optimal uniform HHT on ℓ symbols is at least
min{lb(k, 1, ℓ) : 1 ≤ k ≤ ℓ}. (5.2)
Figure 5 depicts the strategy adopted in the computation of lb. The computation of fn(t, ℓ, h) is also carried out by a dynamic programming algorithm. First, note that fn(t, ℓ, h) equals the maximum number of free nodes when ℓ symbols are distributed among the leaves of t full HHTs with height h. The strategy to yield the recurrence is as follows. First, suppose that ℓ₁ symbols are to be allocated into a single HHT. Then the remaining ℓ − ℓ₁ symbols have to be allocated among the leaves of t − 1 full HHTs with height h. Both allocations must be done in such a way that the number of free nodes is maximized. The former can be computed with the aid of the formula given in (4.1); the latter can be determined using recursion. Formally, we have
fn(t, ℓ, h) = max{f(ℓ₁, h) + fn(t − 1, ℓ − ℓ₁, h) : 0 ≤ ℓ₁ ≤ min{ℓ, 2^{h-1}}}.
Precomputing the values of fn requires a matrix with O(ℓ² log ℓ) elements, as t is limited by ℓ and h is limited by h(ℓ). Moreover, as the processing of each cell of such a matrix requires O(ℓ) steps, precomputing the values of fn takes O(ℓ³ log ℓ) time. Assuming that the values of fn are available in constant time, precomputing the values of lb depends on a matrix with O(ℓ²) elements, as k ≤ 4 by Theorem 5.2 and the remaining parameters are limited by ℓ. Furthermore, since processing each cell of such a matrix requires O(ℓ log ℓ) steps, as all pairs (ℓ₁, h₁) are considered, evaluating lb takes O(ℓ³ log ℓ) time. Therefore, the proposed lower bound can be computed in O(ℓ³ log ℓ) time and O(ℓ³) space.
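The recurrence for the free-node count can be prototyped directly (a sketch under our reading of the recurrence; N is recomputed by brute force, so it is only practical for very small parameters):

```python
import math
from functools import lru_cache
from itertools import combinations

def N(ell, n):
    # Brute-force minimum neighborhood of an independent set of size ell in Q_n.
    if ell == 0:
        return 0
    nb = lambda v: {v ^ (1 << i) for i in range(n)}
    best = math.inf
    for L in combinations(range(2 ** n), ell):
        if any(bin(u ^ v).count('1') == 1 for u, v in combinations(L, 2)):
            continue
        best = min(best, len(set().union(*(nb(v) for v in L))))
    return best

def free(l1, hh):
    # Formula (4.1): free nodes at level hh of one full tree holding l1 symbols.
    return 2 ** hh - l1 - N(l1, hh)

@lru_cache(maxsize=None)
def fn(t, ell, hh):
    # Maximum number of free nodes when ell symbols are spread over the
    # level-hh leaves of t full HHTs of height hh.
    if t == 0:
        return 0 if ell == 0 else -math.inf  # no tree left for the symbols
    cap = min(ell, 2 ** (hh - 1))  # same-parity constraint within one tree
    return max(free(l1, hh) + fn(t - 1, ell - l1, hh) for l1 in range(cap + 1))

# Two trees of height 2, two symbols: it is best to fill one tree and keep
# the other entirely free (4 free nodes at its level 2).
assert fn(2, 2, 2) == 4
```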

Experimental results
In this section, we describe some experimental results obtained in the context of the previous sections.
We have conducted three experiments. The first one is related to the algorithm described in Section 4: it compares uniform [2]-HHTs with the lower bound described in Section 5. The second experiment compares general [2]-HHTs with Huffman trees, aiming to highlight the tradeoffs of both strategies. In the third experiment, we implemented a backtracking algorithm that finds an optimal uniform Hamming-Huffman tree for 1 ≤ ℓ ≤ 38.
Implementations of such algorithms were executed on a notebook having a CPU Core i7, with 8 GB RAM, running Ubuntu 16.04 OS.The algorithms were implemented in C++.The results are presented next.
All the programs related to this section are available at [17].

Uniform [3]-HHT optimality hypothesis
As the first experiment, we tested the hypothesis that uniform [3]-HHTs are indeed optimal. For all 1 ≤ ℓ ≤ 4096, we compared the costs produced by the algorithms of Sections 3 and 4 with those produced by the algorithm of Section 5. Some values of this comparison are presented in Table 1. The first column represents the number of symbols. The second shows the cost of the corresponding [2]-HHT. The third represents the lower bound described in Section 5 and is divided into two parts: the minimum k value that minimized the cost of the resulting tree, and the cost of the tree. The last column of the table gives the relative difference between the costs presented in the last two columns. These costs and their relative differences are depicted in Figures 6 and 7, respectively (Figure 7 shows the difference, in percent, between the cost of optimal uniform [2]-HHTs and the cost of the lower bound of uniform k-HHTs for ℓ symbols). By observing the table and the figures, one can note that, for symbols with uniform probability of occurrence, the costs of [2]-HHTs are very close to those of the lower bound, for all tested values of ℓ. The difference between these costs was never more than 2.1%. Moreover, all the trees obtained by the lower bound have symbol leaves in at most three different levels.

[2]-HHTs efficiency
In the second experiment, we compared the costs and the error detection capabilities of [2]-HHTs and Huffman trees. The goal of this experiment is to present the tradeoffs of using [2]-HHTs instead of Huffman trees, analyzing their differences in compression and error detection rates.
Considering compression, we performed two tests. First, we compared the costs of uniform [2]-HHTs with the costs of uniform HTs. The second test compares the costs of [2]-HHTs and HTs for the Zipf distribution. The Zipf distribution is well known for its empirical correspondence with the frequencies of words in natural languages [16]: the i-th most frequent word of an alphabet occurs with frequency proportional to 1/i. We use this distribution to simulate real-world compression. Both comparisons were done for 10 ≤ ℓ ≤ 1111110 and the results are shown in Table 2, which is organized similarly to Table 1. In both cases, the difference between the costs of the trees decreased as the number of symbols being encoded grew. Considering uniform trees, this difference converged to around 5% and, for the Zipf distribution, it converged to around 25%.
Concerning error detection, we compared optimal [2]-HHTs, HTs, and even trees. Notice that Huffman trees have some error detection capability: when, at the end of the decoding process, the last node processed in the HT is not a leaf, some bits of the message must have been corrupted. For this experiment, we built [2]-HHTs and HTs considering the Zipf distribution. The results reported for even trees are those from [13], in which a similar testing strategy was used. We chose a value of ℓ, in the range 10 ≤ ℓ ≤ 500 000, such that the related optimal [2]-HHT maximizes the ratio between the number of symbol leaves and the number of error leaves. That is, the resulting tree minimizes the proportion of error leaves relative to symbol leaves, meaning that such a tree is the one with the least error detection capacity. For the given ℓ, we created an optimal [2]-HHT and an HT for ℓ symbols considering the Zipf distribution. For such trees, we tested random messages with m symbols, m ∈ {10, 25, 50, 100, 250, 500, 1000, 2500, 5000}. For each one of these messages, we introduced b random bit errors, for all 1 ≤ b ≤ min{m, 20}. For each value of b, we ran the test one million times, counting how many times the tree detected the error. The detection percentage of each tree is presented in Table 3. In this table, one may observe that the error detection capability of HTs seems to decrease as m increases. Also, comparing even trees with the optimal [2]-HHT, in both trees the error detection capability seems to grow with the message size m. Moreover, the optimal [2]-HHT seems to have a significantly greater detection capability: for instance, for a message with 500 symbols, the optimal [2]-HHT achieves an error detection rate that the even tree only reaches when the message has 5000 symbols.

Backtracking for optimal uniform HHTs
In the third experiment, we used backtracking to build an optimal uniform HHT for all 1 ≤ ℓ ≤ 38. We concluded that, for these values of ℓ, there is always an optimal Hamming-Huffman tree with symbol leaves in at most two levels. Besides that, in some cases, there is also an optimal tree with more than two levels; for instance, for ℓ = 38, there is also an optimal tree with three levels. Another interesting outcome of this experiment is that, for optimal uniform Hamming-Huffman trees on 5 symbols, the backtracking obtained the tree depicted in Figure 8, which has a structure different from the one depicted in Figure 1 and presented in the literature.

Conclusion
In this work, we have presented a restricted case of the problem of building optimal Hamming-Huffman trees, namely the problem of building k-Hamming-Huffman trees (k-HHTs), which are Hamming-Huffman trees in which the symbol leaves are distributed in exactly k distinct levels. For k ≤ 2, we presented a polynomial time algorithm to solve the problem. We showed that this case reduces to the problem of finding an independent set L of a given size ℓ in a hypercube Q_n such that the cardinality of the neighborhood of L is minimum over all such independent sets of size ℓ; the latter is a well-studied problem that has already been solved. For k ≥ 3, we presented an algorithm to evaluate a lower bound on the cost of such trees when the symbols have a uniform probability of occurrence. Moreover, we proved that, for uniform frequencies, an optimal HHT is always a [5]-HHT and that there exists an optimal HHT which is a [4]-HHT. Lastly, we performed experiments to investigate the optimality of uniform [2]-HHTs and to measure the compression and error detection capabilities of [2]-HHTs. Based on these experiments, we conjecture that there is always an optimal uniform HHT in which the leaves lie on at most three levels. We formalize this conjecture as follows.
Conjecture. Let Γ be a set of symbols having the same frequency. There exists an optimal Hamming-Huffman tree T associated with Γ such that T is a [3]-HHT.
Also, we conclude that [2]-HHTs are indeed a viable solution for compressing text data in real-world situations. In comparison with HTs, their cost is around 25% higher, but they provide an excellent error detection rate. For instance, for block messages of size 5000, our experiment showed an error detection rate of around 99.9994% for [2]-HHTs.

Figure 4 .
Figure 4. Operation descend(s) over a symbol leaf. The dotted node represents a free node to be used in the encoding of another symbol.

Table 1 .
Comparison between the cost of optimal [2]-HHTs and the lower bound on the cost of k-HHTs.

Table 3 .
Comparison between the error detection capabilities of optimal HTs, even trees and [2]-HHTs.