MAXIMUM ENTROPY ON THE MEAN APPROACH TO SOLVE GENERALIZED INVERSE PROBLEMS WITH AN APPLICATION IN COMPUTATIONAL THERMODYNAMICS

In this paper, we study entropy maximisation problems in order to reconstruct functions or measures subject to very general integral constraints. Our work has a twofold purpose. We first make a global synthesis of entropy maximisation problems in the case of a single reconstruction (measure or function) from the convex analysis point of view, as well as in the framework of the embedding into the Maximum Entropy on the Mean (MEM) setting. We further propose an extension of the entropy methods to the multidimensional case.

Mathematics Subject Classification. 90C25, 90C46, 62G07, 62P30.

Received July 20, 2020. Accepted January 14, 2021.


Introduction
In some problems arising in applied physics, a multidimensional function f taking values in R p must be reconstructed from a set of observations. In thermodynamics, information on the function of interest, namely the p components of the function f we wish to reconstruct, is only indirectly available. In general, the available information consists of the values of integrals that involve the unknown function f and some known weights (λ i ) i=1,...,p . For example, one obtains an interpolation problem when the integration measures consist of Dirac masses: in this case the value of a scalar product between f and λ is given at known locations, see expression (1.2). In the present work, we need to consider more general constraints. We therefore study a reconstruction problem in which the constraints are defined as integrals of the unknown function f and the weight function λ against suitable measures Φ, see expressions (1.1) and (1.3).
In the sequel we provide a general method for the reconstruction of a p-real valued function from partial knowledge submitted to the general constraints previously discussed. We refer the interested reader to Chapter 2 of [22] for the basic rules of thermodynamics and Chapter 5 of [22] for the description of functions which are ordinarily considered for the reconstruction of thermodynamic quantities.
To be more precise, we consider an R p valued function f (x) = (f 1 (x), . . . , f p (x)) defined for all x in the compact set U ⊂ R d (we assume the interior of U to be non-empty). We set our work in a probability space (U, B(U ), P U ), where B(U ) is the Borel σ-algebra and P U is the given reference measure. In this framework, we wish to reconstruct f over U such that the reconstructed quantity satisfies the N following integral constraints

∫_U ⟨λ(x), f (x)⟩ dΦ l (x) = z l , l = 1, . . . , N, (1.1)

where the Φ l are N positive (known) finite measures on (U, B(U )) and the λ i are known continuous weight functions.
The expression of integral constraints as in (1.1) allows one to express a wide range of problems. For example, one can consider some pairs (x l , z l ) ∈ U × R for l = 1, . . . , N , and wish to solve the following interpolation equations

⟨λ(x l ), f (x l )⟩ = z l , l = 1, . . . , N. (1.2)

Expression (1.2) can be obtained from (1.1) by choosing dΦ l (x) = δ x l (dx), the Dirac measure located at x l , for all l = 1, . . . , N . The integral constraints (1.1) then become interpolation constraints. When U is a subset of R, one can also involve the l first moments of the f i by taking dP U (x) = dx and dΦ l (x) = x l dP U (x). Namely, the integral constraints (1.1) become in this case

∫_U ⟨λ(x), f (x)⟩ x l dx = z l , l = 1, . . . , N.

In our work, the z l represent N ideal real-valued measurements. In the case of noisy observations, a relaxed version of problem (1.1) can be considered. The aim is then to reconstruct p real-valued functions on U such that

∫_U ⟨λ(x), f (x)⟩ dΦ l (x) ∈ K l , l = 1, . . . , N, (1.3)

where K l is an interval in R. In the sequel, K will denote the product of the N intervals K l . In the general case, problem (1.3) is ill-posed and has many solutions. In our work, we propose to choose among the solutions the function f that maximises the γ-entropy I γ of the function f , defined by

I γ (f ) = − ∫_U γ(f (x)) dP U (x), (1.4)

where γ is a strictly convex function from R p to R. In this framework, the reconstruction problem we consider can be rephrased as

(F p Φ,γ ): max I γ (f ) subject to the constraints (1.3).

The resolution of problem (F p Φ,γ ) is conducted in two steps. We first consider a dual problem on (signed) measures. The second step consists in solving a discrete approximation of the dual problem. This approach is summarised in Figures 1 and 2 (see on the next page). Figure 1 presents the approach in the case p = 1 and λ(x) = 1, ∀x ∈ U , which has already been treated in [10]. Figure 2 presents the case p ≥ 1, which is the extension of [10] treated in this work. The resolution we propose involves the embedding into a more complicated framework. Let us sketch the description of this framework. Let V be a Polish space and P V be the reference measure on V .
Unless otherwise specified, V is a compact space. The resolution we propose involves a transfer between U and V . A more precise description of this transfer will be given in Section 2 (unidimensional case) and Section 3 (multidimensional case).
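As a toy numerical illustration of the moment-type constraints discussed above (not part of the paper's method; the grid, the test function and the specialisation p = 1, λ = 1 are assumptions), the following sketch evaluates z l = ∫_0^1 f (x) x^l dx on U = [0, 1]:

```python
import numpy as np

# Hypothetical specialisation of (1.1): U = [0, 1], p = 1, lambda = 1 and
# dPhi_l(x) = x^l dx, so that constraint l reads z_l = int_0^1 f(x) x^l dx.
x = np.linspace(0.0, 1.0, 10001)            # discretisation grid on U
f = np.ones_like(x)                          # illustrative choice f(x) = 1

def trapezoid(vals, grid):
    """Composite trapezoidal rule for the integral of vals over grid."""
    return float(np.sum((vals[:-1] + vals[1:]) * np.diff(grid)) / 2.0)

def moment_constraints(f_vals, grid, N):
    """Vector (z_1, ..., z_N) of moment constraints on f."""
    return np.array([trapezoid(f_vals * grid ** l, grid) for l in range(1, N + 1)])

z = moment_constraints(f, x, 3)              # for f = 1 one expects z_l = 1/(l + 1)
```

With f = 1 the computed moments approach 1/2, 1/3, 1/4, showing how a single discretised f is mapped to the constraint vector z.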
We first recall how the Maximum Entropy (ME) method is put into action. Originally, the ME method aims at reconstructing a probability measure P from information on the expectation under P of some random variables. We give below a first example. Example 1.1. When V = R, one may want to reconstruct a probability measure P such that the quantity ∫_V t k dP (t) for some k ∈ N is equal to a given value m k .
More precisely, define the entropy of a probability measure P with respect to the measure P V as

S(P, P V ) = − ∫_V log (dP/dP V ) dP if P ≪ P V and log (dP/dP V ) ∈ L 1 (P ), and S(P, P V ) = −∞ otherwise, (1.5)

where P ≪ P V means that P is absolutely continuous with respect to P V . The ME method derives as solution the probability measure P ME which maximises the entropy among the probability measures that meet the required information.
In information theory and statistics, one usually considers the opposite of the entropy, that is, the so-called Kullback-Leibler divergence of P with respect to P V , which is defined by

D KL (P, P V ) = ∫_V log (dP/dP V ) dP if P ≪ P V and log (dP/dP V ) ∈ L 1 (P ), and D KL (P, P V ) = +∞ otherwise. (1.6)

Equivalently, the ME method derives as a solution the probability measure P ME which minimises the Kullback-Leibler divergence from the reference measure P V under the constraints. The reference measure P V can be interpreted as a prior measure. The Kullback-Leibler divergence defined in (1.6) is called the I-divergence in [6, 7]. The author of these works also calls I-projection the probability measure that maximises the entropy (1.5) on a convex set of probability measures. Further, in [8] an axiomatic justification for the use of the ME method is provided.
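For measures supported on a finite set, the Kullback-Leibler divergence (1.6) reduces to a finite sum. The sketch below (an illustration, not from the paper) computes it, with the usual conventions 0 log 0 = 0 and D KL = +∞ when absolute continuity fails:

```python
import numpy as np

def kl_divergence(p, q):
    """D_KL(P, P_V) = sum_i p_i log(p_i / q_i); +inf when P is not
    absolutely continuous with respect to Q (some q_i = 0 while p_i > 0)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    if np.any((q == 0) & (p > 0)):
        return np.inf
    mask = p > 0                       # convention 0 log 0 = 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

uniform = np.ones(4) / 4
biased = np.array([0.7, 0.1, 0.1, 0.1])
d = kl_divergence(biased, uniform)     # strictly positive since biased != uniform
```

The divergence vanishes only when the two measures coincide, which is what makes it usable as a selection criterion among feasible reconstructions.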
In a more general case, the entropy problem can target the reconstruction of a signed measure. In this case, the γ-divergence D γ , defined in expression (1.7), is considered instead of the Kullback-Leibler divergence D KL (1.6). The authors in [2, 3] have studied the minimisation of the γ-divergence D γ under linear constraints. Let F be a signed measure defined on V . The classical Lebesgue decomposition of F with respect to P V is F = F a + F s , with F a ≪ P V the absolutely continuous part and F s the singular part. That F s is singular with respect to P V means that it is concentrated on a set Ṽ such that P V (Ṽ ) = 0. We recall as well the Jordan decomposition of the measure F s ,

F s = F s,+ − F s,− ,

with F s,+ and F s,− two positive measures that are mutually singular. The γ-divergence D γ is then defined as

D γ (F, P V ) = ∫_V γ (dF a /dP V ) dP V + b ψ F s,+ (V ) − a ψ F s,− (V ), (1.7)

where the integrand γ is a convex function and F a , F s,+ and F s,− are as defined previously. The scalar quantities b ψ and a ψ , with a ψ < b ψ , are the endpoints of the domain of ψ, with ψ the convex conjugate of γ defined by

ψ(s) = sup_y { sy − γ(y) }.

Returning to Example 1.1, the reconstruction problem can be put in the frame of an optimisation problem as (M 1 ϕ,γ ) defined in Figure 1. In this example, the objective function is the Kullback-Leibler divergence D KL , that is, the criterion D γ when γ is the convex function defined on R + by y → y log(y) − y − 1. The scalar quantities a ψ and b ψ are respectively equal to −∞ and +∞, which leads to a reconstruction with no singular part. The moment constraint can be written as ∫_V ϕ(t) dF (t) ∈ K by taking K = {m k } and ϕ : t → t k . Finally, one has to add the constraint ∫_V dF (t) = 1 to ensure that the reconstructed measure is a probability measure.
Notice that the expression in (1.7) contains terms depending on the singular part F s of the measure F . Depending on the convex function γ used, those terms may be absent from the γ-divergence (1.7), see [5] and Example 1.1.
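A quick numerical sanity check (an illustration, not from the paper): for γ(y) = y log(y) − y − 1 on R +, the convex conjugate worked out in closed form is ψ(s) = e^s + 1, with domain all of R, so a ψ = −∞ and b ψ = +∞ and the singular terms of (1.7) are indeed ruled out. The sketch compares a grid-based supremum with this closed form:

```python
import numpy as np

def gamma(y):
    # gamma(y) = y log y - y - 1 on (0, +inf), extended by continuity to -1 at y = 0
    return np.where(y > 0, y * np.log(np.maximum(y, 1e-300)) - y - 1.0, -1.0)

def psi_numeric(s, y_grid):
    """Grid approximation of the convex conjugate psi(s) = sup_y { s*y - gamma(y) }."""
    return np.max(s * y_grid - gamma(y_grid))

y_grid = np.linspace(0.0, 10.0, 200001)
errors = [abs(psi_numeric(s, y_grid) - (np.exp(s) + 1.0)) for s in (-1.0, 0.0, 1.0)]
# the maximiser is y = e^s, well inside the grid for these values of s
```

The agreement confirms that the supremum is attained at y = e^s, where the first-order condition s = log y holds.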
More generally, the author in [18, 19] studies the characterisation of the optimal signed measure which minimises (1.7) under linear constraints. Integral functionals with normal convex integrands, i.e. integrals whose integrand is strictly convex with respect to one of its variables, are studied. See more comments on normal convex integrands in Chapter 14 of [28]. See also [20] for a systematic treatment of convex distance minimisation.
We now recall that many usual optimisation problems (M 1 ϕ,γ ) can be set in an entropy maximisation frame, as proposed in the early paper [23]. This general embedding into ME is called the Maximum Entropy on the Mean (MEM) method and has been developed in [11, 13]. The method is based on a suitable discretisation of the working set V . The reference measure P V is approximated by a point-measure supported by n pixels t 1 , . . . , t n , which are deterministic points in V such that

P n = (1/n) Σ_{i=1}^n δ_{t i} → P V .

By associating to each pixel t i a real random amplitude Y i , we define the random point-measure F n by

F n = (1/n) Σ_{i=1}^n Y i δ_{t i}. (1.8)

By construction, F n ≪ P n . Notice that we choose to present the simple case of real random amplitudes Y i in this introduction, but one can consider more complicated constructions. We will do so in Section 3.3, where the amplitudes Y i will be vectors in R p .
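The discretisation step can be illustrated numerically (under assumed choices: V = [0, 1], P V uniform, deterministic midpoint pixels t i = (i − 1/2)/n): integrals against the point-measure P n approach integrals against P V as n grows.

```python
import numpy as np

def Pn_integral(h, n):
    """Integral of h against P_n = (1/n) sum_i delta_{t_i}, midpoint pixels on [0, 1]."""
    t = (np.arange(n) + 0.5) / n          # deterministic pixels t_1, ..., t_n in V
    return float(np.mean(h(t)))

h = lambda t: t ** 2                       # test function with integral 1/3 under P_V
approx = [Pn_integral(h, n) for n in (10, 100, 1000)]
# the midpoint approximations tend to 1/3 as n grows
```

This weak-convergence behaviour is what allows the continuous measure problem to be replaced by its n-pixel counterpart.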
In the MEM problem, one wants to determine the "optimal" distribution Q to generate vectors of n real random amplitudes (Y 1 , . . . , Y n ). The optimal distribution Q must be such that the constraints considered in problem (M 1 ϕ,γ ), applied to the random point-measure F n , are met on average, that is,

E Q [ ∫_V ϕ(t) dF n (t) ] = E Q [ (1/n) Σ_{i=1}^n Y i ϕ(t i ) ] ∈ K. (1.9)

Returning to Example 1.1, the constraints applied to the point-measure approximation become

E Q [ (1/n) Σ_{i=1}^n Y i t i^k ] = m k .

We assume that the random amplitudes are independent. Let Π denote their prior distribution, so that the reference distribution for (Y 1 , . . . , Y n ) is the tensor measure Π ⊗n . In order to build the optimal distribution, we minimise the Kullback-Leibler divergence (1.6) with respect to the prior Π ⊗n under the constraints (1.9). When such an "optimal" distribution exists, we denote the solution by Q MEM n . Then let F MEM n be defined as

F MEM n = (1/n) Σ_{i=1}^n E_{Q MEM n}[Y i ] δ_{t i}. (1.11)

Notice that, unlike F n , the quantity F MEM n is no longer random. Let the log-Laplace transform of the probability measure Π be denoted by ψ Π , with D ψΠ the domain of ψ Π . We denote by γ Π the convex conjugate of the log-Laplace transform ψ Π . We hope that the reconstruction F MEM n is a good approximation (in a sense we will further specify) of the solution of the corresponding continuous problem (M 1 ϕ,γ ), in which the convex function γ is the function γ Π . The properties of the minimising sequence (F MEM n ) n have been studied in [13]. The authors in [14] deal with a multidimensional case, that is, estimating a vector of reconstructions from information on generalised moments of each component. In [15], Bayesian and MEM methods are proposed to solve inverse problems on measures. More details about the MEM method will be provided in Section 2 for the case p = 1 and in Section 3.3 for the case p ≥ 1.
In particular, we explain how to choose the reference probability measure Π so that the criterion D KL (·, Π ⊗n ) for the discrete problem (M 1,n ϕ,Π ) is a good alternative to the criterion D γ (·, P V ) of the continuous problem (M 1 ϕ,γ ). Back to the function reconstruction problem, an extension of the MEM method to solve generalised moment problems for function reconstruction as in (F 1 Φ,γ ) is proposed in [10] in the case p = 1 and λ(x) = 1, ∀x ∈ U . The method uses a transfer principle which links the function to reconstruct to a corresponding measure. The transfer relies on the use of suitable kernels. Such a transfer is particularly useful when some of the measures Φ in the constraint equation (1.3) are not absolutely continuous with respect to the reference measure P U .
Our work has a twofold purpose. We first make a summary of entropy maximisation problems for the reconstruction of a single measure and, by extension with the linear transfer, entropy maximisation problems for the reconstruction of a single function. We propose a global synthesis of the entropy maximisation methods for such reconstruction problems from the convex analysis point of view, as well as in the framework of the embedding into the MEM setting. We then propose an extension of the entropy methods to the multidimensional case. Such extension is the main contribution of this work. We study the MEM embedding for the function reconstruction problem proposed in [10] in the extended case of inverse problems, that is, when p ≥ 1 and the λ i are any known bounded continuous functions. We provide a general method of reconstruction, based on γ-entropy maximisation, for functions subject to generalised moment and interpolation constraints as in (1.3).
This paper is organized as follows.
In Section 2, we recall, in a global synthesis of entropy maximisation methods, some results for the specific case of a single function reconstruction problem (F 1 Φ,γ ) or a single measure reconstruction problem (M 1 ϕ,γ ). First, in Section 2.1, we describe how the transfer principle works, that is, how a function problem (F 1 Φ,γ ) can be linked to the measure problem (M 1 ϕ,γ ). Then we recall some results about the resolution of the γ-divergence minimisation problem (M 1 ϕ,γ ) in Section 2.2. In Section 2.3, we take a specific look at problem (M 1 ϕ,γ ) when the convex function γ is the function y → y log(y) − y − 1. This specific problem is the ME problem, which will be denoted by (M 1 ϕ,ME ). We then extend the class of studied optimisation problems by giving the construction of the MEM problem setting, and we provide some properties of the MEM reconstruction in Section 2.4. Finally, in Section 2.5, the setting of some usual optimisation problems into the entropy maximisation frame is recalled.
The main contributions of this paper are the results presented in Section 3. They consist in the study of entropy methods for the reconstruction of a multidimensional function subject to very general constraints, such as the integral inverse problem (1.3). We extend the approach of [10] to the case p ≥ 1 with any known bounded continuous functions λ i . We study the embedding of the function reconstruction problem (F p Φ,γ ) into the MEM problem (M p,n ϕ,Π ) framework. This study is divided into three independent parts. First, in Section 3.1, we study the problem (F p Φ,γ ) in a convex analysis framework. We express the optimal solution thanks to the Fenchel duality theorem. This first approach fails to give a suitable reconstruction when the constraint measures Φ are not absolutely continuous with respect to P U . To remedy this issue, we propose in Section 3.2 to transfer the function reconstruction problem (F p Φ,γ ) to a corresponding measure reconstruction problem (M p ϕ,γ ). The transfer is performed using suitable continuous kernels. Finally, in Section 3.3, we set the problem (M p ϕ,γ ) obtained by the transfer into a MEM problem framework and we study the reconstructions given by problem (M p,n ϕ,Π ). Applications are presented in Section 4. We first consider some simple examples of single function reconstructions and then a two-function case study inspired by computational thermodynamics.

The specific γ-entropy maximisation problem for a single reconstruction
We give in this section some details about the γ-entropy maximisation problem in the case of a single reconstruction, that is, when we are interested in a single function or a single measure reconstruction. Those problems have already been studied, see for example [10] for problem (F 1 Φ,γ ), [3] for problem (M 1 ϕ,γ ) and [13] for problem (M 1,n ϕ,Π ). Let us recall some results of these authors. In Section 2.1 we are interested in the link between a function reconstruction problem (F 1 Φ,γ ) and a measure reconstruction problem (M 1 ϕ,γ ). The function reconstruction problem is set as

(F 1 Φ,γ ): maximise the γ-entropy of f subject to ∫_U f (x) dΦ(x) ∈ K,

and the measure reconstruction problem as

(M 1 ϕ,γ ): min D γ (F, P V ) subject to ∫_V ϕ(t) dF (t) ∈ K.

The idea is to set a transformation from measures on V to functions on U . Such transformation is the linear transfer that we further describe in Section 2.1.
We remind the reader that we consider a number N of constraints. Therefore Φ and ϕ take values in R N . In addition, we will assume that ϕ is continuous.
In Section 2.2 we study the γ-divergence minimisation problem under constraints, that is problem (M 1 ϕ,γ ). We recall the results of [3] for the existence of an optimal solution to problem (M 1 ϕ,γ ). We take a closer look in Section 2.3 at the Maximum Entropy (ME) problem. This problem corresponds to problem (M 1 ϕ,γ ) in the special setting when γ is the function y → y log(y)−y −1 and the γ-divergence coincides with the Kullback-Leibler divergence. We recall results on the existence of the optimum and its expression when such minimiser exists.
In Section 2.4, we recall the setting of the nth MEM problem (M 1,n ϕ,Π ):

min D KL (Q, Π ⊗n ) subject to E Q [ ∫_V ϕ(t) dF n (t) ] ∈ K.

We give, when it exists, an expression for the minimiser Q ME n of the nth problem. A function g ME n , related to the expectation of the random amplitudes under Q ME n , can be defined. We will see that, under some assumptions, there exists a particular v ∈ R N such that the sequence (g ME n ) n converges to the function g ME ∞ (t) = ψ Π '(⟨v, ϕ(t)⟩). This limit will be made explicit.
Finally, in Section 2.5, we give examples of the embedding of some classical optimisation problems into the MEM framework.

Transfer principle
We briefly recall the idea of the transfer principle developed in [10], in order to put to work the γ-entropy methods in the case of a function reconstruction problem. Let us denote by (U, B(U ), P U ) the probability space where U ⊂ R d is compact (with non-empty interior), B(U ) is the associated Borel σ-algebra and P U is the reference measure. In the case of the reconstruction problem for a single function, one wants to reconstruct over U a function f taking values in R such that f satisfies the integral constraints

∫_U f (x) dΦ(x) ∈ K. (2.1)

Let γ be a given convex function taking values in R and D γ ⊂ R its domain. Then the γ-entropy of a function f defined on U and taking values in D γ is defined by

I γ (f ) = − ∫_U γ(f (x)) dP U (x).

In order to choose among the functions that satisfy (2.1), we propose as a selection criterion to maximise the γ-entropy. This means that we consider the optimisation problem (F 1 Φ,γ ). The method proposed by [10] is to transfer the function reconstruction problem to a measure reconstruction problem thanks to some continuous kernel K. Such kernel links the function to reconstruct to a corresponding signed measure. Recall that V is a Polish space and P V the reference probability measure on V . The idea is that if one can reconstruct over V a signed measure F such that

∫_V ϕ K (t) dF (t) ∈ K, (2.2)

then a regularised function f K linked to the measure F can be reconstructed. To do so, we proceed as follows. We denote by K a continuous kernel defined over U × V and taking values in R. The kernel K is such that the measure Φ of the integral constraint (2.1) is linked to a regularised function ϕ K involved in the integral constraint (2.2). The relation linking Φ to ϕ K is given by

ϕ K (t) = ∫_U K(x, t) dΦ(x).

Therefore, for any continuous kernel K, one can reconstruct the regularised function f K associated to F by defining

f K (x) = ∫_V K(x, t) dF (t).

Hence, as a consequence of the Fubini theorem, if the measure F satisfies (2.2), the regularised function f K defined above satisfies (2.1).
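The Fubini argument behind the transfer is easy to see numerically: for discrete measures, the identity ∫_U f K dΦ = ∫_V ϕ K dF is just the associativity of matrix products. The sketch below (with an assumed Gaussian kernel and arbitrary discrete measures) checks that the two sides agree:

```python
import numpy as np

rng = np.random.default_rng(0)
xs = np.linspace(0.0, 1.0, 5)                # support points of the measure Phi on U
ts = np.linspace(0.0, 1.0, 7)                # support points of the signed measure F on V
a = rng.normal(size=5)                        # weights of Phi (one fixed constraint)
b = rng.normal(size=7)                        # weights of the signed measure F

K = np.exp(-((xs[:, None] - ts[None, :]) ** 2) / 0.1)   # continuous kernel K(x, t)

f_K = K @ b                                   # regularised function f_K(x) = int K(x,t) dF(t)
phi_K = a @ K                                 # regularised constraint phi_K(t) = int K(x,t) dPhi(x)

lhs = a @ f_K                                 # int_U f_K(x) dPhi(x)
rhs = phi_K @ b                               # int_V phi_K(t) dF(t)
# lhs == rhs up to rounding: (a K) b = a (K b)
```

Any continuous kernel works here; the choice of K governs the regularity of f K, not the validity of the transfer identity.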

γ-divergence minimisation problem
In this section, we recall the results provided by [3] for the γ-divergence minimisation problem for signed measure reconstruction. We set our work on V . Let F be a signed measure defined on V . The classical Lebesgue decomposition of F with respect to P V is F = F a + F s , with F a ≪ P V the absolutely continuous part and F s the singular part. That F s is singular with respect to P V means that it is concentrated on a set Ṽ such that P V (Ṽ ) = 0. Notice that F a and F s are still signed measures. Recall the Jordan decomposition of the measure F s ,

F s = F s,+ − F s,− ,

with F s,+ and F s,− mutually singular. Let γ be essentially strictly convex. We denote by ψ its convex conjugate, that is,

ψ(s) = sup_y { sy − γ(y) }.

Considering a ψ and b ψ , with a ψ < b ψ , the endpoints of the domain of ψ, we define the γ-divergence D γ by

D γ (F, P V ) = ∫_V γ (dF a /dP V ) dP V + b ψ F s,+ (V ) − a ψ F s,− (V ).

The problem we consider is the following:

(M 1 ϕ,γ ): min D γ (F, P V ) subject to ∫_V ϕ(t) dF (t) ∈ K.

The authors in [3] study the existence conditions of an optimal solution using convex analysis tools. Their first result is to consider a dual problem of (M 1 ϕ,γ ), which relies on a Lagrange multiplier v ∈ R N . In addition, they give conditions for problem (M 1 ϕ,γ ) and its dual to have solutions. These results are recalled in Theorem 2.1 below.
The next theorem proposes a more precise characterisation of the solution of problem (M 1 ϕ,γ ) under the same conditions specified in Theorem 2.1.

(1) The absolutely continuous part with respect to P V of the solution F o of (M 1 ϕ,γ ) is given by

dF o,a /dP V (t) = ψ'(⟨v, ϕ(t)⟩),

where v is the solution of the dual problem.

(2) If, in addition, for all v ∈ R N and for all t ∈ V , ⟨v, ϕ(t)⟩ is in the interior of D ψ , the singular part vanishes.
It can be noted that when D ψ = R, Theorem 2.2 always gives solutions that are absolutely continuous with respect to P V . The condition to have D ψ = R is to consider a function γ such that the ratio |γ(y)/y| tends to ∞ on the edges of D γ , see Lemma 2.1 of [3].
We will see in Section 2.4 that the approach proposed by the embedding in the MEM framework boils down to the same results as those provided by Theorem 2.2.

Maximum entropy problem
In this section we take a closer look at the problem of maximising the entropy of a probability measure under generalised moment constraints, that is, the ME problem

(M 1 ϕ,ME ): min D KL (P, P V ) subject to ∫_V ϕ(t) dP (t) ∈ K,

with ϕ defined on V and taking values in R N . We remind the reader that in this section the first component of the function ϕ is the constant 1 and that K is the product {1} × K 1 × · · · × K N −1 . Notice that for the results recalled in this section only, V does not need to be compact and ϕ need not be continuous. The definition of the Kullback-Leibler divergence D KL is recalled below:

D KL (P, P V ) = ∫_V log (dP/dP V ) dP if P ≪ P V and log (dP/dP V ) ∈ L 1 (P ), and +∞ otherwise.

The problem proposed in Section 2.2 is a more general setting of the original ME problem. The reconstruction provided by the ME method satisfies the following properties, shown by Shore and Johnson in [29].
(1) Uniqueness: If the solution of the ME problem exists, it is unique.
(2) Coordinate independence: The reconstruction is independent of coordinate system choice.
(3) System independence: If the probability space (V, B(V ), P V ) consists in the product of m probability spaces, the reconstruction over the whole probability space is the tensor product of the reconstructions on each probability space. (4) Subset independence: If the probability space (V, B(V ), P V ) consists in the union of m probability spaces, the reconstruction over the global space leads to the same measure as the reconstruction problem conditioned on each probability space. In other words, it does not matter whether one treats the information on a subset V j of the whole set V as a conditional constraint or in the full system.
We recall in this section some results of [9, 13] used to solve problem (M 1 ϕ,ME ). Results are stated without proof in our setting.
We first give a definition of the generalised solution of problem (M 1 ϕ,ME ). This definition requires the use of a minimising sequence of D KL (·, P V ), defined as follows.

Definition ([13]). Let us consider a sequence of probability measures (P n ) n∈N defined on (V, B(V )) such that (P n ) converges, is a minimising sequence of D KL (·, P V ), and is such that, for all n, the probability measure P n satisfies ∫_V ϕ(t) dP n (t) ∈ K. Then we call generalised maximal entropy solution the limit measure P MEG . If P MEG also meets the constraint, then it is called the maximal entropy solution of problem (M 1 ϕ,ME ), denoted by P ME .
We will now recall some results on the existence of a generalised solution for problem (M 1 ϕ,ME ) and the shape of the solution when it exists.
Let us first define P K , the subset of probability measures that satisfy the constraints of the ME problem (M 1 ϕ,ME ), that is,

P K = { P probability measure on (V, B(V )) : ∫_V ϕ(t) dP (t) ∈ K }.

With the previous definition of P MEG , we recall a result of [9] on the existence, in our framework, of the generalised solution of problem (M 1 ϕ,ME ).
Let us now introduce several definitions involved in the characterisation of the solution of problem (M 1 ϕ,ME ). For all v ∈ R N , we define the quantity

Z P V ,ϕ (v) = ∫_V exp(⟨v, ϕ(t)⟩) dP V (t),

where ⟨·, ·⟩ is the usual scalar product in R N . We denote the domain of Z P V ,ϕ by D P V ,ϕ , that is, the subset of vectors v ∈ R N such that Z P V ,ϕ (v) is finite:

D P V ,ϕ = { v ∈ R N : Z P V ,ϕ (v) < ∞ }. (2.5)

The following definition describes the so-called exponential family with respect to the probability measure P V . The interested reader may be referred to [1] for more details about exponential models.

Definition 2.6. The ϕ-Hellinger arc of P V is the family of measures P v defined by

dP v /dP V (t) = exp(⟨v, ϕ(t)⟩) / Z P V ,ϕ (v), v ∈ D P V ,ϕ . (2.6)

The family of measures P v defined as in (2.6) for all v ∈ D P V ,ϕ may also be called the exponential model with respect to P V .
We recall below Theorem 4 from [7], which characterises the generalised reconstruction P MEG . This theorem describes the reconstruction P MEG as an element of the ϕ-Hellinger arc of P V . More importantly, the reconstruction problem (M 1 ϕ,ME ), which is infinite-dimensional, is transformed into the finite-dimensional problem (2.7) over the vectors v in D P V ,ϕ , see expression (2.5).
Theorem 2.7 ([7], Thm. 4). The reconstruction P MEG belongs to the ϕ-Hellinger arc of P V if and only if there exists a measure P in P K such that P ≪ P V . The optimal parameter v is then characterised by the finite-dimensional problem (2.7).

The previous theorem does not ensure that the reconstruction P MEG satisfies the constraints of problem (M 1 ϕ,ME ). Of course, the reconstruction P MEG is more interesting when P MEG belongs to P K . We then have the following corollary for a reconstruction that satisfies the constraints. Such a reconstruction is then denoted by P ME .

Corollary 2.8 ([13]). If there exists a measure P ∈ P K such that P ≪ P V and if D P V ,ϕ is an open set, then P MEG is the reconstruction P ME in P K and P ME belongs to the ϕ-Hellinger arc of P V .
As an illustration, we propose the following simple example of a probability measure reconstruction which maximises the entropy with respect to the standard Gaussian distribution N (0, 1). The added constraint fixes the value of the first order moment. Given a first order moment equal to m, the reconstruction we obtain is, as one can expect, the Gaussian distribution centred at m with unit variance.
Example 2.9. The working probability space is (R, B(R), N ), where N is the standard Gaussian distribution N (0, 1). We wish to reconstruct the probability measure with given first order moment which minimises the Kullback-Leibler divergence. Our problem is rewritten

min D KL (P, N ) subject to ∫_R t dP (t) = m.

Using Theorem 2.7 recalled previously, the problem becomes

max_v { vm − log Z N ,ϕ (v) } = max_v { vm − v²/2 }, (2.8)

with ϕ(t) = t. The classical optimality criterion applied to the maximisation problem (2.8) gives v = m. Then the Radon-Nikodym derivative, see [24, 26], of the reconstructed probability measure P ME with respect to N is equal to

dP ME /dN (t) = exp(mt − m²/2).

P ME is therefore the Gaussian distribution N (m, 1).
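Example 2.9 can be checked numerically (an illustration; the grid and the target m are assumptions): exponentially tilting the standard Gaussian by e^{vt} with v = m produces a measure whose first moment is m, i.e. N (m, 1).

```python
import numpy as np

t = np.linspace(-12.0, 12.0, 48001)
dt = t[1] - t[0]
gauss = np.exp(-t ** 2 / 2) / np.sqrt(2 * np.pi)   # density of N(0, 1)

def tilted_mean(v):
    """First moment of the Hellinger-arc measure dP_v proportional to exp(v t) dN(t)."""
    w = gauss * np.exp(v * t)
    w /= np.sum(w) * dt                            # normalise by Z_N(v)
    return float(np.sum(t * w) * dt)

m = 1.5
mean_at_m = tilted_mean(m)                          # should equal m, since P_m = N(m, 1)
```

The tilted density is proportional to exp(vt − t²/2), which after normalisation is exactly the N (v, 1) density, consistent with the closed-form solution v = m.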

Maximum Entropy on the Mean
We recall in this section how the MEM method works. Let us first recall the problem studied. We assume here that V is compact. V is discretised with a suitable deterministic sequence t 1 , . . . , t n such that the point probability measure P n = (1/n) Σ_{i=1}^n δ_{t i} approximates the probability measure P V well. We denote by Q a distribution that generates a vector (Y 1 , . . . , Y n ) of n real random amplitudes and by F n the random point measure defined by

F n = (1/n) Σ_{i=1}^n Y i δ_{t i}.

One wants to determine the "optimal" distribution Q generating Y 1 , . . . , Y n such that the point-measure F n meets the constraints on average, that is,

E Q [ ∫_V ϕ(t) dF n (t) ] ∈ K,

with ϕ a continuous function taking values in R N . We set Π a given reference distribution on R, and we study the Kullback-Leibler divergence between the joint distribution Q and the tensor distribution Π ⊗n under these constraints. This is summed up in a more concise way by the following problem:

(M 1,n ϕ,Π ): min D KL (Q, Π ⊗n ) subject to E Q [ ∫_V ϕ(t) dF n (t) ] ∈ K.

We assume the support of Π to be ]a; b[. We denote by ψ Π the logarithm of the moment generating function of Π:

ψ Π (s) = log ∫ exp(sy) dΠ(y).

We recall below some results of [13] for the resolution of the nth MEM problem and the convergence of the obtained solution. First, Lemma 3.1 in Section III.3 of [13], recalled below, gives sufficient conditions for the existence of a solution to the MEM problem (M 1,n ϕ,Π ).
Then, for n sufficiently large, the solution Q ME n to the MEM problem (M 1,n ϕ,Π ) exists and can be written explicitly. For the convergence result of the MEM reconstruction, we define the function g ME n from the expectations of the random amplitudes under Q ME n : for t ∈ V , g ME n (t) averages these expectations over a set of indices M n (t), where |M n (t)| is the number of elements in M n (t). The next theorem requires the strong assumption denoted by (H6). The notation ∂ in assumption (H6) refers to the boundary of the set.
Then g ME n converges to the limit function g ME ∞ .

Remark 2.13. Let the real random sequence (X n ) be defined by X n = (1/n) Σ_{i=1}^n Y i ϕ(t i ) and let us denote by Q n the law of X n . Under assumptions (H1) and (H2), and provided that ψ Π is sufficiently regular, one can characterise the asymptotic behaviour of Q n . As n tends to infinity, Q n tends to concentrate on the events that belong to the compact set K. That is, for x ∈ K, the probability Q n charges a neighbourhood of x at the exponential rate exp(−n I(x)), where I is the corresponding rate function. Here, the approximation is related to the classical large deviations property, see [12]. This remark is given more formally in Section III.4, Corollary 3.5 of [13] as the large deviation property of the sequence (Q n ).
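A fully explicit instance of the MEM machinery (an illustration under assumed choices: prior Π = N (0, 1), a single constraint function ϕ(t) = t, midpoint pixels in V = [0, 1]): here ψ Π (s) = s²/2, exponential tilting gives amplitude means E[Y i ] = v ϕ(t i ), and the multiplier v solving the mean constraint is available in closed form.

```python
import numpy as np

n = 1000
t = (np.arange(n) + 0.5) / n         # pixels in V = [0, 1]
phi = t                               # single constraint function, N = 1
z = 0.25                              # target: E[ (1/n) sum_i Y_i phi(t_i) ] = z

# Gaussian prior N(0, 1): psi(s) = s^2/2, psi'(s) = s, so the tilted mean of
# amplitude Y_i is v * phi(t_i), and the constraint (1/n) sum_i v*phi(t_i)^2 = z
# is linear in the multiplier v.
v = z / np.mean(phi ** 2)
g = v * phi                           # reconstructed amplitudes g_n(t_i) = psi'(v phi(t_i))

achieved = float(np.mean(g * phi))    # equals z by construction
```

The reconstruction has the exponential-model shape ψ Π '(⟨v, ϕ(t)⟩), here simply proportional to ϕ, and the constraint is met exactly once v is solved for.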
Thereafter, in Section 3, we aim to reconstruct the p components of a vectorial measure F subject to integral constraints of the form (1.3). We will follow the MEM construction given in the present section. For the multidimensional case, the random measure F n will be vectorial and the sequence (Y i ) i=1,...,n will be a sequence of vectorial amplitudes in R p .

Connection with classical minimisation problems
One can notice the link between the γ-entropy maximisation problem (M 1 ϕ,γ ) and the MEM problem (M 1,n ϕ,Π ). Indeed, if one chooses the convex function γ involved in problem (M 1 ϕ,γ ) to be the convex conjugate of ψ Π , the value of problem (M 1 ϕ,γ ) is obtained by maximising the function H ∞ Π,ϕ defined in Theorem 2.12. In this section we detail some classical minimisation problems set in the MEM embedding. We set our work under the assumptions of Theorem 2.12.
Poisson distribution and Kullback-Leibler minimisation

Let the reference distribution Π be a Poisson distribution of mean θ (where θ > 0). Its log-Laplace transform is ψ Π (s) = θ(e^s − 1). The associated convex criterion to minimise for problem (M 1 ϕ,γ ) becomes

∫_V [ (dF/dP V ) log( (dF/dP V )/θ ) − dF/dP V + θ ] dP V .

Such criterion gives the Kullback-Leibler divergence when θ = 1.

Gaussian distribution and least squares minimisation
Let the reference distribution Π be a Gaussian distribution N (m, σ²). Its log-Laplace transform is ψ Π (s) = ms + σ²s²/2. Its convex conjugate is the function γ(y) = (y − m)²/(2σ²). The associated convex criterion to minimise for problem (M 1 ϕ,γ ) becomes

∫_V ( dF/dP V (t) − m )² / (2σ²) dP V (t),

which gives the minimisation problem consisting in finding the least squares deviation of the Radon-Nikodym derivative dF/dP V from the constant m.
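The least squares criterion has a closed-form discrete counterpart: minimising Σ i (y i − m)² subject to linear constraints A y = z is solved by the minimum-norm correction y = m1 + Aᵀ(AAᵀ)⁻¹(z − Am1). The sketch below (illustrative dimensions and random data) implements it:

```python
import numpy as np

rng = np.random.default_rng(1)
n, N = 50, 3
A = rng.normal(size=(N, n))           # N linear constraint functionals on the pixels
z = rng.normal(size=N)                # constraint levels
m = 0.7                               # reference level (prior mean)

# minimise ||y - m 1||^2 subject to A y = z:
# y = m 1 + A^T (A A^T)^{-1} (z - A m 1)
y0 = np.full(n, m)
y = y0 + A.T @ np.linalg.solve(A @ A.T, z - A @ y0)
```

The correction y − m1 lies in the row space of A, which is exactly the Lagrangian characterisation of the least squares reconstruction.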

Exponential distribution and Burg entropy minimisation
Let the reference distribution Π be an exponential distribution E(θ) with mean θ (where θ > 0). The support of Π is [0; +∞[. The log-Laplace transform ψ of the exponential distribution E(θ) is

ψ Π (s) = − log(1 − θs), s < 1/θ.

The associated convex criterion to minimise for problem (M 1 ϕ,γ ) becomes

∫_V [ (dF/dP V )(t)/θ − 1 − log( (dF/dP V )(t)/θ ) ] dP V (t).

Such criterion gives the Burg entropy of F when θ = 1, which is the reverse Kullback-Leibler divergence of P V with respect to F.


The γ-entropy maximisation problem for the reconstruction of a multidimensional function

In this section we propose to study the γ-entropy maximisation method for the reconstruction of p real-valued functions with domain U (with U compact and non-empty) when they are subject to very general constraints such as (1.3). This study is performed in three independent parts. First, in Section 3.1, we study in the convex analysis framework the problem (F p Φ,γ ) recalled below:

(F p Φ,γ ): max I γ (f ) subject to ∫_U ⟨λ(x), f (x)⟩ dΦ l (x) ∈ K l , l = 1, . . . , N.

We detail the construction of a dual problem of finite dimension. We are able to express an optimal solution thanks to the Fenchel duality theorem when the constraint measures Φ l are absolutely continuous with respect to P U . However, we show that this first approach does not give a suitable reconstruction when the constraint measures Φ l are not absolutely continuous with respect to P U . To remedy this issue, we propose in Section 3.2 to transfer the function reconstruction problem (F p Φ,γ ) to a corresponding reconstruction problem (M p ϕ,γ ) on signed measures. The linear transfer is performed by means of suitable continuous kernels. Finally, in Section 3.3, we set the problem (M p ϕ,γ ) obtained with the linear transfer into a MEM problem framework. We detail the construction of a sequence of random point-measures for the multidimensional framework. We study the reconstructions given by problem (M p,n ϕ,Π ).

The γ-entropy maximisation problem for the multidimensional case in the convex analysis framework
In this section we study how to reconstruct a multidimensional function subject to an inverse problem by a γ-entropy maximisation approach. We propose to study this approach within the framework of convex analysis. Some general definitions and properties used in the sequel are recalled in Appendix A.
We frame our work in a probability space (U, B(U), P_U) where U is compact (non-empty) and P_U is the reference probability measure. We aim at the reconstruction of p real-valued functions, retaining as solution the minimiser f_o of the opposite of the γ-entropy, provided that f_o satisfies the constraints stated in (3.2). We denote by I_γ(·) the opposite of the γ-entropy, that is,

I_γ(f) = ∫_U γ(f(x)) dP_U(x).

To put it more concisely, we study problem (F^p_{Φ,γ}). We denote by ψ the convex conjugate of γ, which is defined by

ψ(τ) = sup_{y ∈ R^p} { τᵀy − γ(y) }.

The domain of γ (respectively of ψ) is denoted by D_γ (respectively D_ψ). We will further make the following assumption, denoted by H_1.
H_1: γ (respectively its convex conjugate ψ) is a differentiable, closed, essentially strictly convex function on the interior of its domain D_γ (respectively on D_ψ). The minimum of γ(y) is 0 and is attained at some y_0 = (y^1_0, …, y^p_0) such that y_0 ∈ int D_γ. The convex function γ is such that the ratio γ(y)/‖y‖ tends to infinity on the edges of D_γ.

As in the case of the single function reconstruction (recalled in Sect. 2), we will need to define the multidimensional analogue of the γ-divergence of signed measures. Such a γ-divergence features terms that depend on the singular parts. As in the one-dimensional case, the singular part vanishes when γ(y)/‖y‖ tends to infinity on the edges of D_γ; the assumption on the ratio γ(y)/‖y‖ is thus made for the sake of simplicity.

Let E be a convex set of measurable R^p-valued functions; the minimum of I_γ(·) on the convex set E will be denoted by I_γ(E) = inf_{f ∈ E} I_γ(f). The first result is a characterisation of the minimum of I_γ(·) over E with respect to a specific convex functional. Let f_1 and f_2 be two functions. Provided γ(f_1) and γ(f_2) are finite P_U-a.s., we define the γ-Bregman distance (see [4]) of the functions f_1 and f_2 on U as

B_γ(f_1, f_2) = ∫_U [ γ(f_1(x)) − γ(f_2(x)) − ∇γ(f_2(x))ᵀ(f_1(x) − f_2(x)) ] dP_U(x).

One can remark that B_γ(f_1, f_2) ≥ 0 for all f_1, f_2 with finite I_γ values, by the convexity of γ. The next theorem characterises the minimum of I_γ(·) over a convex set E of functions with respect to the Bregman distance.
Theorem 3.1. Let E be a convex set of functions with I_γ(E) < ∞. Then there exists f_o such that, for all f ∈ E,

I_γ(f) ≥ I_γ(E) + B_γ(f, f_o). (3.4)

In addition, f_o is unique P_U-a.s., and any sequence of functions f_n ∈ E for which I_γ(f_n) → I_γ(E) converges to f_o in P_U-probability.
Proof. We adapt the proof of [9] to the multidimensional case proposed in this section. The proof relies on an identity that holds for all functions f ∈ E with finite I_γ(·) value, and on the fact that an I_γ-minimising sequence of functions is, in some weak sense, a Cauchy sequence. In the following, ‖·‖ denotes the Euclidean norm on R^p.
First notice that for all α ∈ ]0; 1[ and all f, f_1 ∈ E such that I_γ(f) and I_γ(f_1) are finite, the following equality holds:

αB_γ(f, f_2) + (1 − α)B_γ(f_1, f_2) = αI_γ(f) + (1 − α)I_γ(f_1) − I_γ(f_2), (3.5)

where f_2 denotes the function f_2 = αf + (1 − α)f_1. Indeed, by developing and rearranging the gradient terms, we have

α∇γ(f_2)ᵀ(f − f_2) + (1 − α)∇γ(f_2)ᵀ(f_1 − f_2) = ∇γ(f_2)ᵀ(αf + (1 − α)f_1 − f_2) = 0.

Therefore only the I_γ(·) parts remain in the sum of αB_γ(f, f_2) and (1 − α)B_γ(f_1, f_2), and equality (3.5) holds.
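The identity (3.5) can be checked numerically on a finite probability space. Below is a minimal sketch, using the illustrative assumption γ(y) = ‖y‖² (any smooth strictly convex γ would do):

```python
import numpy as np

rng = np.random.default_rng(0)

# Discrete probability space: n points with uniform weights playing the role of P_U.
n, p = 200, 3
w = np.full(n, 1.0 / n)

def gamma(y):                      # strictly convex potential (illustrative choice)
    return np.sum(y * y, axis=-1)

def grad_gamma(y):
    return 2.0 * y

def I_gamma(f):                    # I_gamma(f) = int gamma(f) dP_U
    return np.sum(w * gamma(f))

def bregman(f1, f2):               # Bregman distance B_gamma(f1, f2)
    integrand = gamma(f1) - gamma(f2) - np.sum(grad_gamma(f2) * (f1 - f2), axis=-1)
    return np.sum(w * integrand)

f, f1 = rng.normal(size=(n, p)), rng.normal(size=(n, p))
alpha = 0.3
f2 = alpha * f + (1 - alpha) * f1  # the convex combination of identity (3.5)

lhs = alpha * bregman(f, f2) + (1 - alpha) * bregman(f1, f2)
rhs = alpha * I_gamma(f) + (1 - alpha) * I_gamma(f1) - I_gamma(f2)
assert abs(lhs - rhs) < 1e-9
```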
Let (f_n) ⊂ E be a minimising sequence of I_γ such that I_γ(f_n) < ∞ for all n. Then (f_n) is a Cauchy sequence in probability, that is,

lim_{n,m→∞} P_U(‖f_n − f_m‖ > ε) = 0 for all ε > 0;

see Lemma C.1 for the proof. Then there exists a subsequence (f_{n_k}) of (f_n) which converges P_U-a.s. to a function f_o, and this f_o satisfies the inequality in (3.4). Indeed, replacing f_1 by f_{n_k} in (3.5), it becomes

αB_γ(f, αf + (1 − α)f_{n_k}) ≤ α(I_γ(f) − I_γ(E)) + (1 − α)(I_γ(f_{n_k}) − I_γ(E)), (3.6)

the inequality coming from the positivity of the Bregman distance and from the fact that αf + (1 − α)f_{n_k} ∈ E by the convexity of E, so that I_γ(αf + (1 − α)f_{n_k}) ≥ I_γ(E). As (f_{n_k}) is an I_γ-minimising sequence, the last term in (3.6) tends to 0 when k goes to infinity. Finally, taking a sequence (α_k) converging to 0 more slowly than (f_{n_k}), Fatou's lemma yields

B_γ(f, f_o) ≤ I_γ(f) − I_γ(E).

This proves the existence of f_o satisfying inequality (3.4).
Given N real-valued positive measures Φ_1, …, Φ_N, their Lebesgue decompositions are given by Φ_l = φ_l P_U + Σ_l for l = 1, …, N, where the Radon-Nikodym derivatives with respect to P_U are denoted by φ_l. The measures Σ_l are singular with respect to P_U, which means that they are concentrated on a set Ũ such that P_U(Ũ) = 0. We denote by E_{φ,K} the subset of functions satisfying the constraints (3.7). The next theorem describes the optimal function (f_{1,o}, …, f_{p,o}) which solves the I_γ(·) minimisation problem on E_{φ,K}. The result is obtained thanks to Fenchel's duality theorem for convex functions and is a generalisation in higher dimension of Theorem II.2 from [10].
Theorem 3.2. Suppose there exists an R^p-valued function f ∈ E_{φ,K} satisfying the qualification condition (3.8). Let L be the subspace of R^N associated with the given (φ_1, …, φ_N). The minimum of I_γ(·) over the set E_{φ,K} can be expressed by the finite-dimensional dual problem (3.9). Then, for v_o ∈ R^N which maximises (3.9), the minimiser of I_γ(·) over E_{φ,K} is given by (3.10).

Proof. The outline of the proof is as in [10], with a multidimensional approach:

(1) The expression of the minimum in (3.9) is obtained thanks to Fenchel's duality theorem.
(2) For v_o the vector of R^N which maximises (3.9), one must verify that τ(·, v_o) belongs to the interior of D_ψ.
(3) The candidate function f_o must satisfy the Bregman inequality (3.4) of Theorem 3.1.
Let h be defined as in (3.11). In order to apply the Fenchel duality theorem, we need that dom k ∩ dom h ≠ ∅; equivalently, this means that there exists z_0 ∈ dom k such that z_0 ∈ K ∩ L. From the assumptions of Theorem 3.2, there exists f ∈ D_γ ∩ E_{φ,K}. By applying Lemma 3.3, there exist a closed convex set D̃ included in D_γ and a function f̃ which belongs to D̃ and satisfies (3.12). By definition z_0 ∈ L; as I_γ(f̃) < ∞, z_0 ∈ dom k; and as f̃ ∈ E_{φ,K}, z_0 ∈ K. Therefore dom k ∩ dom h ≠ ∅, and the Fenchel-Moreau duality theorem can then be applied, see [27, 28]. With the superscript * denoting the convex conjugate, the Fenchel-Moreau duality theorem gives the equality

inf_z { k(z) + h(z) } = sup_v { −k*(v) − h*(−v) }.

Computing the convex conjugates of k and h explicitly then yields equality (3.9). From the assumptions on ψ, τ_i(x, v_o) belongs to the interior of D_ψ, and one can therefore differentiate ψ at τ(x, v_o). Inequality (3.15) follows by using (3.14) and noticing that, for any τ ∈ D_ψ,

γ(∇ψ(τ)) = τᵀ∇ψ(τ) − ψ(τ). (3.16)

Let us denote by D_∇ the set of gradients of γ.

To apply the Fenchel duality theorem in the previous Theorem 3.2, one needs to prove that the domains of the two convex functions k and h defined in the proof have a non-empty intersection. In order to prove so, the following lemma is required. Lemma 3.3 features a function f̃ which belongs to a closed convex set D̃ contained in the interior of the domain of γ. Given a function f with values in the interior of D_γ, we prove the existence of an f̃ whose constraint values on U are arbitrarily close to those of f.

Lemma 3.3. Let f be a function with values in the interior of D_γ. Then there exist f̃ and D̃ such that f̃ is a function defined on U, f̃ ∈ D̃, with D̃ a closed convex set of R^p such that D̃ ⊂ int D_γ.
Proof. Let L be the subset of vectors of R^N defined as in Theorem 3.2. Let (D_n) be a sequence of closed convex sets such that, for all n, D_n ⊂ int D_{n+1}, meaning that (D_n) is strictly growing to its limit D_γ. Let T_n = {x ∈ U : f(x) ∈ D_n}, and let L_n be the corresponding subset of vectors in R^N. This implies that Σ_{l=1}^N v_{n,l} φ_l(x) = 0 P_U-a.s. on T_n. Taking a convergent subsequence (v_{n_k}) with limit v such that ‖v‖ ≠ 0, we have Σ_{l=1}^N v_l φ_l(x) = 0 P_U-a.s. on U, which contradicts ‖v‖ ≠ 0. For δ > 0, define C_δ accordingly. The affine hull of C_δ equals L and 0 ∈ int C_δ in the relative topology of L.
We denote by f_n(x) the projection of f(x) onto D_n. The corresponding deviation on U then approaches 0 as n tends to infinity. For δ > 0, there exists then n_1(δ) such that, for all n ≥ n_1(δ), the constraint vector belongs to C_δ. Therefore, there exists h defined on U such that ‖h‖ ≤ δ on T_{n_0}, h = 0 on T^c_{n_0}, and such that the constraint values are matched. For such an h, we set f̃ = f_n + h; then f̃ belongs to D̃ with D̃ = D_n ∩ D^δ_{n_0}.

The next proposition describes under which conditions the infimum of Theorem 3.2 is reached at an optimal function in E_{φ,K}.

Proof. Recall that v_o belonging to the interior of dom m implies that k is differentiable at v_o, with gradient d whose components d_l are defined for all l = 1, …, N accordingly. By Corollary 23.5.3 of [27], the subgradient of h(−v) at v = v_o, denoted by ∂h, is included in (−K). As the relative interiors of dom h and dom k have a non-empty intersection, Theorem 23.8 of [27] on the subgradient of a sum of convex functions implies that ∂g = {d} + ∂h. As g reaches its maximum at v_o, ∂g contains 0, which implies that d ∈ K.
The following corollary is deduced from Theorem 3.2 when dealing with measures Φ that are not absolutely continuous with respect to P_U.
Let us define the analogue of E_{φ,K} for measures Φ that are not absolutely continuous with respect to P_U: E_{Φ,K} is the set of functions satisfying the corresponding constraints. As a result, we see that when considering measures Φ with a singular part, the optimal function defined in Theorem 3.2 does not in fact meet the constraints (3.2). The corollary stresses that the problem is ill-posed when dealing with measures Φ which are not absolutely continuous with respect to the reference measure P_U.
Corollary 3.5. Suppose there exists a function f ∈ E_{Φ,K} satisfying the qualification condition of Theorem 3.2. Let L and m be as defined in Theorem 3.2, and let Φ = φP_U + Σ. The minimum of I_γ(·) over E_{Φ,K} can be expressed by (3.18), for some K̃ different from K.
Therefore the optimal solution f o defined in (3.10) no longer meets the constraints in (3.2).
Proof. Write the constraint as ∫_U … dΦ(x) = c with Φ = φP_U + Σ. We then apply Theorem 3.2 to get expression (3.18). Corollary 3.5 points out the necessity of a different approach for solving the problem described in (3.2), particularly when dealing with Φ not absolutely continuous with respect to the reference measure on U, as is the case when solving interpolation problems.

Linear transfer principle for the multidimensional case
Following the equivalence of problem solutions introduced in [10], the inverse problem on functions described in (3.2) can be treated as an inverse problem on measures. Sets and generic elements will be denoted by V and t when discussing the measure reconstruction problem. The measures considered hereafter are always finite real-valued measures, and the set of all finite measures on a set V is denoted by M(V). The aim is then to reconstruct p real-valued measures F^i ∈ M(V) such that

Σ_{i=1}^p ∫_V ϕ^i_l(t) dF^i(t) ∈ K_l, for l = 1, …, N, (3.19)

with ϕ^i_l given real-valued functions on V and K_l ⊂ R, for all i = 1, …, p and l = 1, …, N. Let us first state the following assumption.

H_2: V is a compact metric space, P_V is a probability measure having full support, all ϕ^i_l are continuous and, for each i = 1, …, p, (ϕ^i_l)_{l=1,…,N} are linearly independent.

Given the assumption H_1, a solution to problem (3.19) can then be chosen by taking as optimal solution (F^{1,o}, …, F^{p,o}) the p-real-valued measure which minimises the γ-divergence with respect to the reference measure P_V, provided that F^o meets the constraints (3.19). The opposite of the γ-entropy, I_γ, and the γ-divergence are linked by the following relation ([3], Thm. 2.7): for measures F^i = F^i_a absolutely continuous with respect to P_V,

D_γ((F^1, …, F^p), P_V) = I_γ((dF^1_a/dP_V, …, dF^p_a/dP_V)).

Let us denote by S_{ϕ,K} the set of p-real-valued measures with F^1, …, F^p ∈ M(V) meeting the constraints described in (3.19). Theorem 3.6 below describes the γ-divergence minimiser (F^{1,o}, …, F^{p,o}) belonging to S_{ϕ,K}; it is the analogue of Theorem 3.2 for the measure reconstruction problem.

Theorem 3.6. Under assumption H_2, suppose there exist p measures absolutely continuous with respect to P_V such that (F^1, …, F^p)ᵀ ∈ S_{ϕ,K} and such that they satisfy the qualification condition (3.20). Then there exist p real-valued measures F^{1,o}, …, F^{p,o}, absolutely continuous with respect to P_V, such that (F^{1,o}, …, F^{p,o}) minimises D_γ(·, P_V) over S_{ϕ,K}. Their Radon-Nikodym derivatives f^{i,o} with respect to P_V are given by the expression of Theorem 3.2.

Proof. Direct from Theorem 3.2.
Having recovered the p measures F^{i,o} described in Theorem 3.6, a regularized reconstruction f_o is possible via the linear transfer principle. As a matter of fact, one can linearly transfer the constraints (3.2) to the constraints (3.19) by using suitable kernels K. Let K^i(·,·) be measurable bounded real-valued functions on U × V, for i = 1, …, p, such that condition (3.21) holds. Then the Fubini theorem links the two sets of constraints: the regularized function defined component-wise by

f^i_K(x) = ∫_V K^i(x, t) dF^{i,o}(t), x ∈ U,

with (F^{1,o}, …, F^{p,o}) defined as in Theorem 3.6, satisfies the constraints (3.2). The advantage of the kernel transfer method is that, if some measures Φ are not absolutely continuous with respect to the reference measure (as is the case when considering Dirac measures, for example), the linear transfer principle provides continuous functions ϕ for problem (3.19) by choosing the kernels K to be continuous.
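A minimal numerical illustration of the linear transfer (the kernel, grid and amplitudes below are arbitrary assumptions): for a Dirac constraint measure Φ_l = δ_{x_l}, the transferred moment function ϕ_l(t) = K(x_l, t) is continuous even though Φ_l is singular, and the Fubini identity ∫_U f_K dΦ_l = ∫_V ϕ_l dF holds exactly on a discretised measure:

```python
import numpy as np

# F is a discretised signed measure supported on a grid (t_j);
# the transferred function is f_K(x) = sum_j K(x, t_j) F_j.
t = np.linspace(0.0, 1.0, 50)                 # support points of F
F = np.sin(2 * np.pi * t) / 50                # arbitrary signed amplitudes
x_l = 0.37                                    # location of the Dirac constraint

K = lambda x, s: np.exp(-(x - s) ** 2 / 0.02) # a smooth kernel (illustrative)

f_K = lambda x: np.sum(K(x, t) * F)           # transferred function on U
phi_l = K(x_l, t)                             # transferred continuous moment function

# Fubini: int f_K dPhi_l = f_K(x_l) equals int phi_l dF = sum_j phi_l(t_j) F_j
assert abs(f_K(x_l) - np.sum(phi_l * F)) < 1e-12
```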
The choice of K is influenced by the prior knowledge of the expected properties of the regularized solution f_K; see the applications in Section 4 for some examples.

The embedding into the MEM framework for the multidimensional case
In this section we study the MEM method in the multidimensional case. We first detail the construction of the random point-measure involved in the reconstruction problem.
As in Section 2, define a sequence of discrete probability measures (P_n)_n by P_n = (1/n) Σ_{j=1}^n δ_{t_j}, with (t_j)_{j=1,…,n} a deterministic sequence of points in V such that the weak limit of the probability measures P_n is the reference probability measure P_V. For each i = 1, …, p, a real-valued random variable Y^i_j is associated with t_j. The random variable Y^i_j can be seen as a random amplitude for a signed measure F^i at location t_j. Then let F^i_n, for i = 1, …, p, be p random measures, absolutely continuous with respect to P_n, defined by

F^i_n = (1/n) Σ_{j=1}^n Y^i_j δ_{t_j}.

For all j = 1, …, n, the p-real-valued vector of random variables Y_j = (Y^1_j, …, Y^p_j)ᵀ is sampled from a reference distribution Π, and we denote by Π^{⊗n} the joint distribution of the n independent, identically distributed vectors of dimension p. Replacing F^i by the discretised measure F^i_n, the measure constraints (3.19) can be rewritten as

(1/n) Σ_{j=1}^n Σ_{i=1}^p ϕ^i_l(t_j) Y^i_j ∈ K_l, l = 1, …, N. (3.23)

The MEM method then consists in finding the optimal joint distribution Q^{MEM}_n which minimises the divergence from the reference distribution Π^{⊗n} and under which the discrete constraints (3.23) are met on average. The MEM problem can be written as

min_{Q ∈ Q^n_{ϕ,K}} K(Q, Π^{⊗n}), (M^{p,n}_{ϕ,Π})

where Q^n_{ϕ,K} denotes the set of distributions Q generating n × p random amplitudes Y^i_j such that, under Q, the constraints are satisfied on average, that is,

E_Q[ (1/n) Σ_{j=1}^n Σ_{i=1}^p ϕ^i_l(t_j) Y^i_j ] ∈ K_l, l = 1, …, N,

where Y_j is the jth sample (Y^1_j, …, Y^p_j) of amplitudes for the p random measures F^1_n, …, F^p_n. Let us state the following assumption.

H_3: the function γ considered in the γ-divergence problem (F^p_{Φ,γ}) is the function γ_Π whose conjugate function ψ_Π has domain equal to R^p and corresponds to the logarithm of the moment generating function of Π:

ψ_Π(τ_1, …, τ_p) = log ∫_{R^p} exp(τᵀy) dΠ(y). (3.25)

Notice that the components of ∇ψ_Π are then

∂ψ_Π/∂τ_i (τ_1, …, τ_p) = ∫ y_i exp(τᵀy − ψ_Π(τ_1, …, τ_p)) dΠ(y). (3.26)

Provided there exist points y_j = (y^1_j, …, y^p_j)ᵀ in the interior of the domain of Π, for j = 1, …, n, such that

(1/n) Σ_{j=1}^n Σ_{i=1}^p ϕ^i_l(t_j) y^i_j ∈ K_l, l = 1, …, N, (3.27)

by the standard theory of the ME method the minimiser Q^{MEM}_n of K(Q, Π^{⊗n}) exists and belongs to the exponential family through Π^{⊗n} spanned by the statistics (1/n) Σ_{j=1}^n Σ_{i=1}^p ϕ^i_l(t_j) y^i_j for l = 1, …, N. Its expression is given by an exponential change of measure with respect to Π^{⊗n}, where v^o_n is the maximiser of the discrete dual problem (3.29). The next theorem describes the convergence of the sequence (F^{MEM}_n)_n to the solution of the γ-divergence minimisation problem on signed measures (M^p_{ϕ,γ}).
Theorem 3.7. Under assumptions H_2 and H_3, suppose there exist p measures (F^1, …, F^p)ᵀ ∈ S_{ϕ,K} satisfying the qualification condition of Theorem 3.6. Then the sequence (F^{MEM}_n) converges weakly to (F^{1,o}, …, F^{p,o}), the minimiser of D_γ(S_{ϕ,K}, P_V), whose Radon-Nikodym derivatives f^{i,o} with respect to P_V are given in Theorem 3.6.

To link the problem studied above with the function reconstruction problem (F^p_{Φ,γ}), one can consider the analogue of constraint (3.23) for the function to reconstruct. As problem (F^p_{Φ,γ}) and problem (M^p_{ϕ,γ}) can be linked by the linear transfer studied in Section 3.2, there exists a regularized function f^i_K, associated with the reconstructed measure F^i and the chosen kernel K^i, defined by

f^i_K(x) = ∫_V K^i(x, t) dF^i(t), x ∈ U.

The function of interest f^i_K can then be approximated by a random function f^i_n defined for all x ∈ U by

f^i_n(x) = (1/n) Σ_{j=1}^n K^i(x, t_j) Y^i_j.

Then, if the distribution Q^{MEM}_n is a solution of problem (M^{p,n}_{ϕ,Π}), the regularized solution f^{MEM}_{n,K} defined above meets an approximation of the constraints of problem (1.3), obtained by replacing the measures F^i with their discretisations F^i_n in the transferred constraints.
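For p = 1 and a standard Gaussian reference Π, the MEM solution of the discretised problem is explicit: since γ_Π(y) = y²/2, it is the minimum-norm choice of amplitudes satisfying the constraints (3.23). A sketch (the moment functions and constraint values below are illustrative assumptions):

```python
import numpy as np

# MEM with standard Gaussian prior Pi = N(0,1) on each amplitude Y_j (p = 1).
# gamma_Pi(y) = y^2/2, so the MEM amplitudes minimise sum_j y_j^2 / 2 subject to
# the discretised equality constraints (1/n) sum_j phi_l(t_j) y_j = z_l.
n, N = 100, 3
t = np.linspace(0.0, 1.0, n)                 # discretisation points in V
A = np.stack([np.ones(n), t, t**2])          # assumed moment functions phi_l(t) = t^(l-1)
z = np.array([0.0, 0.1, 0.05])               # assumed constraint values

# Minimum-norm (Gaussian MEM) amplitudes: y = n A^T (A A^T)^{-1} z
y = n * A.T @ np.linalg.solve(A @ A.T, z)

# The discretised constraints are met exactly
assert np.allclose(A @ y / n, z)
```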

Reconstruction of a univariate convex function
The first example considered is the reconstruction of a real-valued convex function of one variable, f: U → R, solution of an interpolation problem with U = [−1; 1]. We give N = 3 interpolation constraints. The function has a minimal value y_0 which is reached at a point x_0 ∈ ]−1; 1[. The pair (x_0, y_0) constitutes the first interpolation constraint. The two other points are denoted by (x_1, y_1) and (x_2, y_2), where x_1 and x_2 belong respectively to the intervals ]−1; x_0[ and ]x_0; 1[, on which the reconstructed function will be respectively decreasing and increasing. The set of interpolation values is denoted by z = (y_1 − y_0, y_2 − y_0). For a reason explained below, the interpolation constraints are expressed as increments from the minimum value.
In this example, we consider the log-Laplace transform associated with the Poisson distribution for the convex criterion to minimise. The objective function (3.29) associated with the MEM problem becomes expression (4.1). The objective function (4.1) has an analytic minimum when the number of discretisation points n is equal to 1. Otherwise, one can use a polynomial approximation of the exponential function; solving the MEM problem then reduces to finding the root of a polynomial.
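The conjugacy underlying this choice can be verified numerically. Assuming a unit-rate Poisson reference (an illustrative normalisation), the log-Laplace transform is ψ(τ) = e^τ − 1, and its convex conjugate is the entropy-like criterion y log y − y + 1:

```python
import numpy as np

# Log-Laplace transform of the Poisson(1) distribution: psi(tau) = exp(tau) - 1.
# Its convex conjugate is gamma(y) = y*log(y) - y + 1 for y > 0.
def gamma_numeric(y):
    # sup_tau (tau*y - psi(tau)), approximated on a grid
    taus = np.linspace(-60.0, 30.0, 900_001)
    return np.max(taus * y - (np.exp(taus) - 1.0))

def gamma_closed(y):
    return y * np.log(y) - y + 1.0

for y in (0.2, 1.0, 5.0):
    assert abs(gamma_numeric(y) - gamma_closed(y)) < 1e-3
```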
In [17] the authors proved that a suitable one-sided power kernel leads, by the linear transfer, to an increasing convex function with m − 1 derivatives which vanishes at x_0. In our framework, the kernel used for the linear transfer leads to a reconstructed function which reaches its minimum value 0 at x_0, is decreasing on the interval [−1; x_0] and increasing on the interval [x_0; 1]. The reconstructed function we obtain is displayed in Figure 3.
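A sketch of the transfer towards a convex reconstruction (the one-sided kernel, the grid and the positive amplitudes below are illustrative assumptions, with m = 2 so the output is piecewise linear):

```python
import numpy as np

# One-sided kernel K(x, t) = max(x - t, 0): with a positive measure F supported
# on [x0, 1], the transferred function f(x) = sum_j K(x, t_j) F_j is increasing,
# convex, and vanishes at x0; mirroring on [-1, x0] gives the decreasing branch.
x0 = 0.2
t = np.linspace(x0, 1.0, 40)              # support of the discretised measure
K = lambda x, s: np.maximum(x - s, 0.0)

F = np.full(t.size, 0.05)                 # arbitrary positive amplitudes
f = lambda x: np.sum(K(x, t) * F)

xs = np.linspace(x0, 1.0, 200)
vals = np.array([f(x) for x in xs])
d = np.diff(vals)
# f vanishes at x0, is nondecreasing, and has nondecreasing increments (convexity)
assert vals[0] == 0.0
assert np.all(d >= -1e-12) and np.all(np.diff(d) >= -1e-12)
```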

Reconstruction of a bivariate polynomial function
The second example considered is the reconstruction of a polynomial function of two variables, f: U → R, solution of an interpolation problem with U = [0; 1] × [0; 1]. We choose an increasing number N of interpolation points drawn by a Latin Hypercube sampler in [0; 1] × [0; 1], the domain of f. We denote by z the values of the function f to interpolate at the design points.
In this example, we consider the log-Laplace transform associated with the standard Gaussian distribution N(0, 1) for the convex criterion to minimise. The objective function (3.29) associated with the MEM problem becomes

H_n(v) = ⟨v, z⟩ − (1/(2n)) Σ_{j=1}^n ⟨v, ϕ(t_j)⟩². (4.2)
We choose n = 100, and the n discretising points are sampled from a Latin Hypercube sampler in V = [0; 1] × [0; 1]. The objective function (4.2) has an analytic solution provided that the number of discretising points n is much larger than the number of constraints N. It is also required that the N components of the moment function ϕ be linearly independent; this condition is already part of assumption H_2 used for Theorem 3.6. The optimal v is the solution of the linear problem A v = z, where A is the N × N matrix with entries A_{l,l'} = (1/n) Σ_{j=1}^n ϕ_l(t_j) ϕ_{l'}(t_j). We consider the symmetric Gaussian kernel, widely used as a covariance kernel in kriging problems, which leads to infinitely differentiable solutions; it is used here for every pair of points x and t. The best parameter θ is chosen by cross-validation, namely we choose the value of θ which minimises the leave-one-out error

Σ_k ( z_k − f^{(k)}_{K_θ}(x_k) )²,

with z_k the kth value in the interpolation problem, located at point x_k, and f^{(k)}_{K_θ} the reconstructed function obtained by removing (x_k, z_k) from the data set. This leads to a two-step optimisation problem, as the matrix A depends on the value of θ and so does the optimal multiplier v. Therefore, in the following, the matrix A is denoted with a subscript, A_θ.
In order to determine the optimal parameter θ, we solve iteratively the following two-stage problem. Step 1: given θ_m, solve A^{(k)}_{θ_m} v = z^{(k)} for all k = 1, …, N.
The superscript (k) indicates that the kth experiment has been removed: the kth row of A_{θ_m} has been removed to form the matrix A^{(k)}_{θ_m}, and the kth value of z has been removed to form the observation vector z^{(k)}. The notation f^{(k)}_{m+1,K_θ} corresponds to the reconstructed function at the mth iteration which solves the interpolation problem from which the kth experiment has been removed.
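The leave-one-out selection of θ can be sketched as follows (the data, the candidate grid of θ values, and the direct kernel-interpolation solve are illustrative assumptions standing in for the A_θ systems of the text):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(size=(8, 2))                 # design points in [0,1]^2 (assumed)
z = np.sin(3 * X[:, 0]) + X[:, 1] ** 2       # assumed interpolation values

def gram(X1, X2, theta):
    # symmetric Gaussian kernel K_theta(x, t) = exp(-||x - t||^2 / theta)
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / theta)

def loo_score(theta):
    # leave-one-out squared error: refit without point k, predict at x_k
    err = 0.0
    for k in range(len(z)):
        idx = np.delete(np.arange(len(z)), k)
        A = gram(X[idx], X[idx], theta)
        v = np.linalg.solve(A, z[idx])       # interpolation weights without point k
        pred = gram(X[k:k + 1], X[idx], theta) @ v
        err += (pred[0] - z[k]) ** 2
    return err

thetas = [0.05, 0.1, 0.3, 1.0]
best = min(thetas, key=loo_score)            # theta minimising the LOO error
```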
The reconstructed function we obtain is displayed in Figure 4.

Applications in the case p = 1
We consider hereafter some toy models inspired by computational thermodynamics. We first explain how the so-called phase diagram is computed; then two toy models are solved using the method described in Section 3.

Phase diagram description in Computational Thermodynamics
The application considered here derives from the assessment problem in thermodynamics and the CALPHAD (CALculation of PHAse Diagrams) method [16, 21, 22, 30]. The CALPHAD method consists in the parametric reconstruction, from partial information, of thermodynamic quantities, namely Gibbs energy functions and their derivatives. The data at hand for the reconstruction are thermodynamic quantities and phase diagram data. The thermodynamic quantities consist of linear transformations (e.g. first or second derivatives) of the function to be reconstructed.
The phase diagram is a map of the chemical species' spatial arrangements. The internal energy of such arrangements varies with the state variables, which are the chemical composition, the temperature and the pressure. Establishing the phase diagram of a chemical system means partitioning the domain of permissible state variables into several areas, each area featuring one or several stable phases. A stable phase is the phase with the lowest energy. Figure 5 displays an example of a phase diagram. In order to determine the different areas of the diagram, one must compute the minimising convex hull of the p energy functions involved in the system; this computation is performed over the whole temperature range. For the composition range where the minimising convex hull coincides with a single energy function f_i, the corresponding area is the stability area of phase i. Such an area is labelled by i in the phase diagram.

Figure 5. Connections between the phase diagram (f) of a system of A and B with two phases (the phase α at low temperature and the phase L at high temperature) and the Gibbs energy (a)-(e) of each phase at different temperature values. Panels (a)-(e) display the Gibbs energy of phases α and L at a given temperature with respect to the relative composition of chemical element B over the sum of chemical elements A and B. For temperatures T_1 and T_2, phase L has the lowest energy, whereas for temperatures T_4 and T_5 phase α has the lowest energy. At temperature T_3, there exists a common tangent to the two energy functions; therefore, between compositions C_1 and C_2, a combination of phases α and L is the most stable. Figure from [25].
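The minimising convex hull computation can be sketched at a fixed temperature (the two quadratic energy curves below are illustrative assumptions): where the lower convex envelope follows a single curve that phase is stable, and where it is a common-tangent segment both phases coexist.

```python
import numpy as np

x = np.linspace(0.01, 0.99, 400)       # composition grid
G1 = 0.20 + (x - 0.25) ** 2            # illustrative Gibbs energy of phase 1
G2 = 0.22 + (x - 0.75) ** 2            # illustrative Gibbs energy of phase 2
G = np.minimum(G1, G2)                 # pointwise minimum of the energies

# Lower convex envelope of the points (x_i, G_i) by a monotone Graham scan
hull = [0]
for i in range(1, x.size):
    while len(hull) >= 2:
        a, b = hull[-2], hull[-1]
        # drop b if it lies on or above the chord from a to i (non-convex kink)
        if (G[b] - G[a]) * (x[i] - x[a]) >= (G[i] - G[a]) * (x[b] - x[a]):
            hull.pop()
        else:
            break
    hull.append(i)

# The common-tangent (two-phase) region is the widest gap between hull points
gaps = np.diff(x[hull])
j = int(np.argmax(gaps))
c1, c2 = x[hull][j], x[hull][j + 1]    # approximate coexistence compositions
assert c1 < 0.5 < c2                   # the miscibility gap straddles the crossing
```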
The phase diagram in Figure 5 is that of a binary system of A and B. Such a phase diagram is called a binary phase diagram as it features two chemical elements, A and B. In this example, the state variables which can vary are the temperature and the relative composition of B with respect to the total composition (that is, of A and B), ranging from solely element A at the left edge to solely element B at the right edge. The pressure is fixed.
The phase diagram data consist of locations in the map of stable phases, namely either temperatures or compositions where a change in the set of stable phases occurs, or information on the stable phases at a given composition and temperature.
The novelty of the work presented here is that the reconstruction is performed in a non-parametric frame, contrary to the usual reconstruction framework in thermodynamics.

Figure 6. Phase diagram corresponding to an ideal solution. There are two phases: phase 1 is the stable phase at low temperature and phase 2 at high temperature. The behaviour of phases 1 and 2 with respect to temperature is known at compositions 0 and 1, over the temperature range where each is respectively the stable phase.
In the sequel, the reconstruction of thermodynamic quantities is expressed as the inverse problem (F^p_{Φ,γ}). The constraints that the solution (f_1, …, f_p) must satisfy are those recalled in (3.2).

Phase diagram with an ideal solution
In this section we study the case of a phase diagram with an ideal solution. It consists in reconstructing two functions, viz. the Gibbs energy functions associated with the two phases, outside of the domain where they are known.
The inverse problem associated with the functions to reconstruct is built in accordance with the phase diagram. The phase diagram of an ideal solution is usually displayed as in Figure 6. It features two phases and three areas. The first area, at the bottom of the diagram, is associated with phase 1, the stable phase at low temperature. The second area, at the top of the diagram, is associated with phase 2, the stable phase at high temperature. The last area, between the first two, features a mix of phase 1 and phase 2.
Such a problem may not be well-posed, as there can exist infinitely many pairs of functions leading to the same phase diagram. In the following, the univariate case is treated, which corresponds to a given temperature value. The functions then depend solely on the relative composition of element B with respect to the total composition.
In the univariate case, the pair of functions f_1(x), f_2(x) is to be reconstructed for all x ∈ [0; 1]. In this case, the phase diagram constraints reduce to the following definition. Let x_1, x_2 ∈ [0; 1] with x_1 < x_2; the interval [x_1; x_2] consists of the compositions for which phase 1 and phase 2 coexist. Therefore the weight functions λ_1 and λ_2 are defined as mixture weights on [x_1; x_2], while (λ_1, λ_2) equals (1, 0) for x < x_1 and (0, 1) for x > x_2.
Observations for x < x_1 give values of f_1 only, and observations for x > x_2 give values of f_2 only. In the interval [x_1; x_2], one can observe

λ_1(x) f_1(x_1) + λ_2(x) f_2(x_2), (4.3)

which does not provide direct information on the behaviour of f_1 nor of f_2 on this interval. The two functions to be reconstructed ought to be convex and twice differentiable. Therefore the same assumptions are made as in the case treated in Section 4.1.1, and the same method is applied. Figure 7 displays the results obtained for the reconstruction of the target functions.
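The coexistence observation (4.3) can be illustrated numerically, assuming lever-rule weights λ_1(x) = (x_2 − x)/(x_2 − x_1) and λ_2(x) = (x − x_1)/(x_2 − x_1) (an assumption made for this sketch; endpoint values are arbitrary). The observation is affine in x on [x_1; x_2] and only involves f_1(x_1) and f_2(x_2), so it carries no pointwise information on f_1 or f_2 inside the interval:

```python
import numpy as np

x1, x2 = 0.3, 0.7
f1_x1, f2_x2 = -1.0, -0.8                    # illustrative endpoint energies

def obs(x):
    # observation (4.3) with assumed lever-rule weights
    lam1 = (x2 - x) / (x2 - x1)
    lam2 = (x - x1) / (x2 - x1)
    return lam1 * f1_x1 + lam2 * f2_x2

xs = np.linspace(x1, x2, 50)
vals = obs(xs)
# affine in x: second differences vanish; endpoints match the pure-phase values
assert np.allclose(np.diff(vals, 2), 0.0)
assert np.isclose(vals[0], f1_x1) and np.isclose(vals[-1], f2_x2)
```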
Function f_1 is well reconstructed in this example, but there exists a set of possible reconstructions for function f_2. An extra constraint is required to obtain a unique solution for f_2.
(2) We call subgradient at x ∈ R^p of a convex function γ, written ∂γ(x), the set of all vectors s in R^p which satisfy γ(y) ≥ γ(x) + ⟨s, y − x⟩ for all y ∈ R^p.
(3) A function γ is essentially strictly convex if γ is strictly convex on every convex subset of the domain of its subgradient. (4) A function γ is essentially smooth if the interior of the domain D_γ is not empty, γ is differentiable on the interior of D_γ, and its gradient norm ‖∇γ‖ tends to infinity when approaching the edge of D_γ.

(1) If γ is closed convex, then its biconjugate, that is the conjugate of its convex conjugate ψ, is γ itself.
(2) A function γ being essentially strictly convex is equivalent to its convex conjugate being essentially smooth.
(3) Let γ be a convex function with non-empty domain. If γ: R^p → R is such that γ(y)/‖y‖ → +∞ as y approaches ∂D_γ, then its convex conjugate has full domain, that is, D_ψ = R^p.

Appendix B. Bregman distance bounds
This section generalises some results of [9] on the bounds of the Bregman distance between two functions f and f_1 to the case of R^p-valued functions.

Proof. Let f_1 and f_2 be two functions in E with finite I_γ values. Denote by b_γ the integrand of B_γ, that is, b_γ(f_1(x), f_2(x)) = γ(f_1(x)) − γ(f_2(x)) − ∇γ(f_2(x))ᵀ(f_1(x) − f_2(x)).
Let C ∈ B(U) be a set on which ‖f_1(x)‖ and ‖f_2(x)‖ are bounded.

Lemma B.2. Let (U, B(U), P_U) be a probability space and E a convex set of functions with values in R^p. Let γ be an essentially strictly convex, twice differentiable function on R^p whose Hessian matrix is positive definite on R^p. Recall the Bregman distance for f_1, f_2 having finite I_γ values. For all K > 0, there exists β > 0 such that, for all f_1, f_2 and all C ∈ B(U) with C ⊂ {x : ‖f_2(x)‖ ≤ K}, the Bregman distance restricted to C is bounded from below accordingly.

Proof. Let u_K and v be two vectors of R^p such that ‖u_K‖ = K and v belongs to the p-ball of centre u_K and radius K, denoted by B(u_K, K). Then, using Taylor's theorem for the decomposition of γ(v) centred at u_K, with R_γ(v, u_K) the remainder term, we obtain a quadratic lower bound whose constant ε_{γ,K} is the smallest eigenvalue of the Hessian matrix of γ at any point of B(u_K, K). Given the assumptions on γ, ε_{γ,K} is strictly positive. Therefore, for all x ∈ C for which ‖f_2(x)‖ ≤ K,

B_γ(f_1, f_2) = ∫_U b_γ(f_1(z), f_2(z)) dP_U(z) ≥ ∫_{C ∩ {x: ‖f_1(x)‖ ≤ L}} b_γ(f_1(z), f_2(z)) dP_U(z) ≥ ∫_{C ∩ {x: ‖f_1(x)‖ ≤ L}} (ε_{γ,K}/(8L²)) dP_U(z).

Lemma C.1. Let (f_n) ⊂ E be an I_γ-minimising sequence; then (f_n) is a Cauchy sequence in probability P_U, meaning that

lim_{n,m→∞} P_U(‖f_n − f_m‖ > ε) = 0 for all ε > 0,

which proves that (f_n) is a Cauchy sequence in probability P_U.