Z-EQUILIBRIUM IN RANDOM BI-MATRIX GAMES: DEFINITION AND COMPUTATION

This paper deals with bi-matrix games with random payoffs. Using probability tools, we propose a solution based on the concept of Z-equilibrium. Then, we give sufficient conditions for its existence. Further, the problem of computing this solution is transformed into the determination of Pareto optimal solutions of a deterministic bi-criteria minimization problem. Finally, we provide illustrative numerical examples.


Introduction
Bi-matrix games have played an important role in the development of game theory. Many real-world conflict situations, such as the prisoner's dilemma, are analyzed as bi-matrix games. A bi-matrix game is characterized by two matrices $A$ and $B$ representing the payoffs of player I and player II, respectively. A nice feature of bi-matrix games is that a Nash equilibrium [19] always exists in pure or mixed strategies and can be computed by solving a quadratic optimization problem.
Nash equilibrium is the most prominent solution concept in game theory. However, it may not exist in pure strategies and may not be Pareto optimal, that is, it may be dominated. In this paper, we consider the concept of Z-equilibrium introduced by Zhukovskii [29]. In contrast to Nash equilibrium, it always exists in pure strategies in finite games and it is always Pareto optimal. This equilibrium is said to be "active" in the sense that, for any deviation of a player from her/his Z-equilibrium strategy, the other player has a specific punishing strategy that decreases or maintains the deviating player's payoff. In a Nash equilibrium, by contrast, the reaction of a player to any deviation of the other player is the same. Moreover, it is interesting to note that the set of Z-equilibria is a subset of the $\alpha$-core of Aumann [3] in two-player games. Furthermore, if we consider a stronger version of the $\alpha$-core where a coalition can block a solution if it can guarantee a greater or equal payoff for all its members with a strictly greater payoff for at least one member, then the Z-equilibrium generalizes the $\alpha$-core in $n$-person games with $n > 2$.
A complete study of Z-equilibrium in continuous and deterministic games is given in Zhukovskii [29] (see a formal definition in Sect. 2) and Zhukovskii and Tchickry [30]. Ferhat and Radjef [11] have generalized Z-equilibrium to multiple criteria games in mixed strategies. In Bouchama et al. [5], an equivalence between the solution of a constraint satisfaction problem and the Z-equilibrium of its associated game is established.
However, in many real game situations, it is very difficult to determine the exact values of the payoffs. Therefore, several approaches to Z-equilibrium under imprecise or uncertain payoffs are considered in Larbani and Lebbah [16], Larbani and Achemine [15], Achemine et al. [1], Nessah et al. [20] and Achemine et al. [2].
The first work on Z-equilibrium in games involving uncertainty is due to Larbani and Lebbah [16]. They considered games with uncertain payoffs of the form $f_i(x, \theta)$, where $\theta$ is an unknown parameter that varies in some set $\Theta \subset \mathbb{R}^p$. They introduced a concept called ZS-equilibrium. Further, Larbani and Achemine [15] introduced and investigated the notion of ZP-equilibrium. This concept is generalized to fuzzy games in Achemine et al. [1]. Nessah et al. [20] considered games with uncertain parameters where the players can form coalitions; the concept they introduced is called coalitional ZP-equilibrium. Recently, Achemine et al. [2] investigated Z-equilibrium in the class of bi-matrix games with uncertain payoffs in the sense of Liu [17]. Liu's uncertainty theory is different from probability theory; it is based on a credibility measure that measures fuzzy events, which are subjective in nature.
In many practical situations, the players' payoffs are better modeled using random variables. Wholesale electricity markets are good examples [8,9,18,27]. One way to handle such games is to use the expected payoff criterion [9,10,13,21,27,28]. However, the expected payoff criterion is not suitable when the random payoff has a large variance. In this case, it is more interesting to consider payoffs that can be obtained with a certain confidence level. Such situations are modeled using chance-constrained games. The first works on chance-constrained games concern zero-sum games. These games were developed by Blau [4], Cassidy et al. [6], Charnes et al. [7], and Song [26]. Nash equilibrium in bi-matrix games and in $n$-person finite games has been investigated using chance-constrained programming in Singh and Lisser [24] and Singh et al. [25], respectively.
Z-equilibrium has not been investigated in games with random payoffs. The contribution of this paper is to initiate the study of Z-equilibrium in such games. As a first step, we consider a bi-matrix game in which the payoffs are random variables. The main difficulty in the study of these games is the comparison between payoff values associated with different strategies of the players. Using probability tools, we introduce for such games a concept of equilibrium based on Z-equilibrium and we establish sufficient conditions for its existence. Furthermore, we show that the computation of this equilibrium can be transformed into the computation of a Pareto optimal solution of a bi-criteria optimization problem. We use chance-constrained programming to formulate Z-equilibrium. However, our approach differs from the existing ones. Following the satisficing principle of Simon [23], we first ask the players to provide satisfaction levels in terms of payoffs, then we formulate new payoffs as probabilities of achieving those levels. In existing works, payoffs are formulated as values that are achieved with given confidence levels.
The rest of the paper is organized as follows. The next section is devoted to the description of the game and the introduction of the proposed solution, called RZ-equilibrium (random Z-equilibrium). In Section 3, we present sufficient conditions for the existence of this equilibrium. In Section 4, we show that the computation of the RZ-equilibrium can be formulated as a bi-criteria optimization problem that can be solved using methods of multiple criteria optimization. A numerical example is given in Section 5. Section 6 discusses related work. Section 7 concludes the paper.

Z-equilibrium in a deterministic game
To help the unfamiliar reader understand the Z-equilibrium [29], we recall its definition for a deterministic strategic two-person game.
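Consider the two-person strategic game $G = \langle I, X_1, X_2, f_1, f_2 \rangle$, where $I = \{1, 2\}$ is the set of players; $X_i \subset \mathbb{R}^{n_i}$, $n_i \in \mathbb{N}^*$, is the set of strategies of the $i$-th player, $i = 1, 2$; and $f_i : X_1 \times X_2 \to \mathbb{R}$ is the payoff function of the $i$-th player. The aim of each player is to maximize her/his payoff function.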
Notation. In the following, for all $(a_1, a_2), (b_1, b_2) \in \mathbb{R}^2$, the vector inequality $(a_1, a_2) \ge (b_1, b_2)$ is understood componentwise.

Definition 2.1. A strategy profile $x^* = (x_1^*, x_2^*) \in X_1 \times X_2$ is said to be a Z-equilibrium for the game $G$ if and only if the following two conditions hold.
(1) For all $x_1 \in X_1$, there exists $x_2 \in X_2$ such that $f_1(x_1, x_2) \le f_1(x^*)$, and for all $x_2 \in X_2$, there exists $x_1 \in X_1$ such that $f_2(x_1, x_2) \le f_2(x^*)$.
(2) There is no $x \in X_1 \times X_2$ such that $f_i(x) \ge f_i(x^*)$ for $i = 1, 2$, with at least one strict inequality.

Remark 2.2.
- Condition 1 of Definition 2.1 means that for any deviation $x_i$ of the $i$-th player ($i \in \{1, 2\}$) from her/his equilibrium strategy, the other player can punish her/him by choosing a specific strategy $x_{-i}$ that prevents her/him from being better off. This condition guarantees the stability of Z-equilibrium.
- Condition 2 of Definition 2.1 means that $x^*$ is Pareto optimal for the players, that is, $x^*$ is not dominated in payoff space. It is interesting to note that the set of Z-equilibria is a subset of the $\alpha$-core of Aumann [3] in two-player games. Further, if we consider a stronger version of the $\alpha$-core where a coalition can block a solution if it can guarantee a greater or equal payoff for all its members with a strictly greater payoff for at least one of them, then the Z-equilibrium generalizes the $\alpha$-core in $n$-person games with $n \ge 2$.
- A Nash equilibrium that is Pareto optimal is a Z-equilibrium. Indeed, each player can use her/his Nash equilibrium strategy to punish the other player for deviating from the equilibrium.
Note that the profile $(a_2, b_2)$ is a Nash equilibrium. It is easy to see that $(a_1, b_1)$ is a Z-equilibrium. That is, Z-equilibrium captures the cooperative profile in the prisoner's dilemma game. Note that experimental evidence has shown that in 50% of the cases, players choose the cooperative profile $(a_1, b_1)$ (Z-equilibrium) rather than the Nash equilibrium in a two-person prisoner's dilemma game (Sally [22]). When the game is repeated, following Z-equilibrium, player I (resp. player II) can punish player II (resp. player I) by selecting $a_2$ (resp. $b_2$) to stabilize the game at $(a_1, b_1)$.
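The two conditions of a Z-equilibrium can be checked by enumeration when only pure strategies are considered. The following is a minimal sketch under stated assumptions: the payoff values are hypothetical prisoner's dilemma numbers (index 0 is the cooperative strategy, index 1 the non-cooperative one), and the function names are illustrative, not taken from the paper.

```python
import numpy as np

def pure_z_equilibria(A, B):
    """Pure profiles (i, j) that are Z-equilibria of the bi-matrix game (A, B)."""
    A, B = np.asarray(A, float), np.asarray(B, float)
    m, n = A.shape
    # Condition 1: for every row i there is a column j with A[i, j] <= A[i*, j*],
    # i.e. max_i min_j A[i, j] <= A[i*, j*]; symmetrically for player II.
    guard_1 = A.min(axis=1).max()
    guard_2 = B.min(axis=0).max()
    result = []
    for i in range(m):
        for j in range(n):
            stable = A[i, j] >= guard_1 and B[i, j] >= guard_2
            # Condition 2: no profile is at least as good for both players
            # and strictly better for one of them.
            dominated = np.any((A >= A[i, j]) & (B >= B[i, j])
                               & ((A > A[i, j]) | (B > B[i, j])))
            if stable and not dominated:
                result.append((i, j))
    return result

def pure_nash_equilibria(A, B):
    """Pure profiles where no player gains by a unilateral deviation."""
    A, B = np.asarray(A, float), np.asarray(B, float)
    m, n = A.shape
    return [(i, j) for i in range(m) for j in range(n)
            if A[i, j] >= A[:, j].max() and B[i, j] >= B[i, :].max()]

# Hypothetical prisoner's dilemma payoffs (assumption, not the paper's matrix).
A_pd = [[3, 0], [5, 1]]   # player I
B_pd = [[3, 5], [0, 1]]   # player II
print(pure_z_equilibria(A_pd, B_pd), pure_nash_equilibria(A_pd, B_pd))
```

With these assumed payoffs, the cooperative profile is the only pure Z-equilibrium, while the non-cooperative profile is the only pure Nash equilibrium.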
Example 2.4.Consider the following bi-matrix game. ) where Pareto optimal for the players, that is, this pair is not dominated in payoff space.
As the game has no Nash equilibrium, the players can adopt Z-equilibrium as a solution for its desirable properties, especially if the game is repeated.
The following theorem guarantees the existence of Z-equilibrium [29].
Theorem 2.5. Assume that (i) the sets of strategies $X_1$ and $X_2$ are nonempty and compact; (ii) the functions $f_1$ and $f_2$ are continuous on $X_1 \times X_2$. Then, $G$ has at least one Z-equilibrium.

The bi-matrix game with random payoffs
In classical bi-matrix games, the payoffs of the two players are real numbers that are precisely known. However, real-life decision problems often involve randomness. Neglecting randomness in modeling when it exists may lead to poor quality decisions. Therefore, in this section, we focus on bi-matrix games where the payoffs are random variables.
We consider a bi-matrix game in which the players have exactly defined their pure strategies but are uncertain about the induced payoffs. In the following, we assume that this uncertainty is modeled by random variables.
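The random payoff matrices of the row player I and the column player II are $\tilde{A} = [\tilde{a}_{ij}]$ and $\tilde{B} = [\tilde{b}_{ij}]$, $i = 1, \dots, m$, $j = 1, \dots, n$, respectively. We denote the sets of mixed strategies of players I and II, which represent weights assigned to their pure strategies, by
$$X = \Big\{ x = (x_1, \dots, x_m)^T \in \mathbb{R}^m : \sum_{i=1}^{m} x_i = 1,\ x_i \ge 0,\ i = 1, \dots, m \Big\}$$
and
$$Y = \Big\{ y = (y_1, \dots, y_n)^T \in \mathbb{R}^n : \sum_{j=1}^{n} y_j = 1,\ y_j \ge 0,\ j = 1, \dots, n \Big\},$$
respectively, where $T$ represents the transpose operator. They can also be interpreted as probabilities that players choose their particular pure strategies. Then, a mixed strategy game with random payoffs is given as follows:
$$\langle X, Y, x^T \tilde{A} y, x^T \tilde{B} y \rangle. \qquad (2.1)$$
The payoffs induced when players I and II choose the mixed strategies $x \in X$ and $y \in Y$ are $x^T \tilde{A} y$ and $x^T \tilde{B} y$, respectively.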
In this game, it is assumed that the players are rational and that each of them knows the set of strategies of the other player. It is also assumed that each player knows the distribution of every random entry in $\tilde{A}$ and $\tilde{B}$. The aim of each player is to maximize her/his payoff.
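As a solution for the game (2.1), we propose a concept based on the notion of Z-equilibrium [29], which takes into account the random aspect of the game. For this purpose, we formulate the payoff of each player using a chance constraint. Following Simon's satisficing principle [23], for predetermined satisfaction levels $\beta_1$ (resp. $\beta_2$) $\in \mathbb{R}$, we assume that player I (resp. player II) aims to maximize the probability of the random event $\{\omega : x^T \tilde{A}(\omega) y \ge \beta_1\}$ (resp. $\{\omega : x^T \tilde{B}(\omega) y \ge \beta_2\}$).

Definition 2.6. For predetermined satisfaction levels $(\beta_1, \beta_2) \in \mathbb{R} \times \mathbb{R}$, a pair $(x^*, y^*) \in X \times Y$ is called an RZ-equilibrium (random Z-equilibrium) of the game (2.1) at $(\beta_1, \beta_2)$ levels if the following two conditions hold.
(1) For all $x \in X$, there exists $y \in Y$ such that $P(x^T \tilde{A} y \ge \beta_1) \le P(x^{*T} \tilde{A} y^* \ge \beta_1)$, and for all $y \in Y$, there exists $x \in X$ such that $P(x^T \tilde{B} y \ge \beta_2) \le P(x^{*T} \tilde{B} y^* \ge \beta_2)$.
(2) There is no strategy profile $(x, y) \in X \times Y$ such that $P(x^T \tilde{A} y \ge \beta_1) \ge P(x^{*T} \tilde{A} y^* \ge \beta_1)$ and $P(x^T \tilde{B} y \ge \beta_2) \ge P(x^{*T} \tilde{B} y^* \ge \beta_2)$, with at least one strict inequality.

Note that in the existing literature on finite random games, Singh et al. [25] and Singh and Lisser [24], the payoff used in the equilibrium definition is the maximum value achieved with a given confidence level, while we use the probability of achieving a given satisfaction level as payoff. We discuss this aspect in more detail in Section 6.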

Existence of the RZ-equilibrium
In this section, the existence of RZ-equilibrium is investigated. Using probability tools, sufficient conditions for the existence of RZ-equilibrium are established in two important cases: (i) the entries of the payoff matrices are normally distributed random variables; (ii) the entries of the payoff matrices are Cauchy distributed random variables.

Payoffs following normal distribution
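In the following theorem, we present sufficient conditions for the existence of an RZ-equilibrium in the game (2.1) when the payoffs follow a normal distribution. When the payoff $x^T \tilde{A} y$ is normally distributed with mean $\mu_1(x, y)$ and standard deviation $\sigma_1(x, y) > 0$, for given levels $(\beta_1, \beta_2) \in \mathbb{R} \times \mathbb{R}$, we have the following chain of equalities
$$P\big(x^T \tilde{A} y \ge \beta_1\big) = P\Big(Z_1 \ge \frac{\beta_1 - \mu_1(x, y)}{\sigma_1(x, y)}\Big) = 1 - P\Big(Z_1 < \frac{\beta_1 - \mu_1(x, y)}{\sigma_1(x, y)}\Big),$$
where
$$Z_1 = \frac{x^T \tilde{A} y - \mu_1(x, y)}{\sigma_1(x, y)}.$$
As $Z_1$ is a zero mean unit variance Gaussian variable, that is, $Z_1 \sim N(0, 1)$, then
$$P\big(x^T \tilde{A} y \ge \beta_1\big) = 1 - \Phi\Big(\frac{\beta_1 - \mu_1(x, y)}{\sigma_1(x, y)}\Big) = \Phi\Big(\frac{\mu_1(x, y) - \beta_1}{\sigma_1(x, y)}\Big),$$
where $\Phi$ is the standardized normal distribution function.
The function $(x, y) \mapsto \Phi\big((\mu_1(x, y) - \beta_1)/\sigma_1(x, y)\big)$ is continuous on $X \times Y$. Similarly, we prove that $(x, y) \mapsto \Phi\big((\mu_2(x, y) - \beta_2)/\sigma_2(x, y)\big)$ is also continuous on $X \times Y$, where $\mu_2(x, y)$ and $\sigma_2(x, y) > 0$ are the mean and the standard deviation of $x^T \tilde{B} y$. The existence of an RZ-equilibrium of the game (2.1) at $(\beta_1, \beta_2)$ levels is equivalent to the existence of a Z-equilibrium in the game
$$\Big\langle X, Y, \Phi\Big(\frac{\mu_1(x, y) - \beta_1}{\sigma_1(x, y)}\Big), \Phi\Big(\frac{\mu_2(x, y) - \beta_2}{\sigma_2(x, y)}\Big) \Big\rangle.$$
Since all the conditions of Zhukovskii's theorem [29] (see Thm. 2.5 in Sect. 2) are satisfied in this game, we conclude that at least one RZ-equilibrium at levels $(\beta_1, \beta_2)$ exists.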
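The closed form of the probability payoff in the normal case can be checked numerically. The sketch below assumes that the entries of the random matrix are independent normal variables, so that the variance of $x^T \tilde{A} y$ is $\sum_{i,j} x_i^2 y_j^2 \sigma_{ij}^2$; this independence assumption, the function names and the sample values are illustrative and are not taken from the paper.

```python
import numpy as np
from statistics import NormalDist

def normal_payoff_probability(x, y, mean, std, level):
    """P(x^T A~ y >= level) for independent entries a~_ij ~ N(mean_ij, std_ij^2)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    mean, std = np.asarray(mean, float), np.asarray(std, float)
    mu = x @ mean @ y                               # mean of x^T A~ y
    sigma = np.sqrt((x**2) @ (std**2) @ (y**2))     # std under independence
    return NormalDist().cdf((mu - level) / sigma)   # Phi((mu - level) / sigma)

def monte_carlo_probability(x, y, mean, std, level, n_samples=200_000, seed=0):
    """Simulation estimate of the same probability, used as a sanity check."""
    rng = np.random.default_rng(seed)
    mean, std = np.asarray(mean, float), np.asarray(std, float)
    samples = rng.normal(mean, std, size=(n_samples,) + mean.shape)
    values = np.einsum("i,kij,j->k", np.asarray(x, float), samples,
                       np.asarray(y, float))
    return float((values >= level).mean())

# Hypothetical data: means and standard deviations of a 2x2 random matrix.
mean_a = [[4.0, 2.0], [1.0, 3.0]]
std_a = [[1.0, 2.0], [1.5, 0.5]]
x0, y0 = [0.5, 0.5], [0.5, 0.5]
print(normal_payoff_probability(x0, y0, mean_a, std_a, 2.0))
print(monte_carlo_probability(x0, y0, mean_a, std_a, 2.0))
```

The two printed values should agree up to simulation error, which illustrates that the random payoff comparison reduces to a deterministic function of $(x, y)$.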

Computation of RZ-equilibria
In this section, we show that the problem of determining an RZ-equilibrium can be transformed into the problem of computing a Pareto optimal solution of a bi-criteria minimization problem. Then, from this problem, we derive an algorithm for the computation of RZ-equilibria. To this end, we recall that a bi-criteria optimization (minimization/maximization) problem is an optimization problem that involves two objective functions. The general formulation of a bi-criteria optimization problem is
$$\min_{z \in D}\ (\text{or } \max_{z \in D})\ \big(g_1(z), g_2(z)\big),$$
where $D$ is the feasible set of decision vectors, $D \subset \mathbb{R}^k$, $k \in \mathbb{N}^*$; $g_1, g_2 : D \to \mathbb{R}$ are the objective functions. In the rest of this paper, we use the notation $\langle D, g_1, g_2 \rangle$ for the bi-criteria optimization problem.
The concept of solution in bi-criteria optimization is based on Pareto optimality.
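As an illustration of Pareto optimality for a minimization problem, the following sketch filters a finite list of criterion pairs and keeps the non-dominated ones. The function name and the sample points are hypothetical.

```python
def pareto_minimal(points):
    """Indices of the pairs that no other pair dominates (both criteria minimized)."""
    keep = []
    for i, (a1, a2) in enumerate(points):
        # (b1, b2) dominates (a1, a2) if it is no worse in both criteria
        # and strictly better in at least one of them.
        dominated = any(b1 <= a1 and b2 <= a2 and (b1 < a1 or b2 < a2)
                        for (b1, b2) in points)
        if not dominated:
            keep.append(i)
    return keep

# Hypothetical criterion values (g1(z), g2(z)) for four candidate decisions.
candidates = [(1, 4), (2, 2), (3, 3), (4, 1)]
print(pareto_minimal(candidates))
```

Here the pair (3, 3) is dominated by (2, 2), so only the other three candidates are Pareto optimal.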

Payoffs following normal distribution
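Theorem 4.2. Under the assumptions of Theorem 3.1, $(x^*, y^*)$ is an RZ-equilibrium of the game (2.1) at $(\beta_1, \beta_2)$ levels if and only if it is a Pareto optimal solution of the bi-criteria minimization problem
$$\langle S', F_1, F_2 \rangle, \qquad (4.1)$$
where
$$F_i(x, y) = \frac{\beta_i - \mu_i(x, y)}{\sigma_i(x, y)}, \quad i = 1, 2,$$
and
$$S' = \Big\{ (x, y) \in X \times Y : F_1(x, y) \le \min_{x' \in X} \max_{y' \in Y} F_1(x', y'),\ F_2(x, y) \le \min_{y' \in Y} \max_{x' \in X} F_2(x', y') \Big\}.$$

Proof. By Definition 2.6 and the proof of Theorem 3.1, an RZ-equilibrium is a Pareto optimal solution of the bi-criteria maximization problem $\langle S, \Phi(-F_1), \Phi(-F_2) \rangle$, where
$$S = \Big\{ (x, y) \in X \times Y : \Phi(-F_1(x, y)) \ge \max_{x' \in X} \min_{y' \in Y} \Phi(-F_1(x', y')),\ \Phi(-F_2(x, y)) \ge \max_{y' \in Y} \min_{x' \in X} \Phi(-F_2(x', y')) \Big\}.$$
First, we prove that $S = S'$. Considering that $\Phi$ is continuous and strictly increasing on $\mathbb{R}$, and that $\mu_1$ and $\sigma_1$ are continuous on the compact set $X \times Y$ with $\sigma_1 > 0$, we obtain the equalities
$$\max_{x' \in X} \min_{y' \in Y} \Phi(-F_1(x', y')) = \Phi\Big(\max_{x' \in X} \min_{y' \in Y} \big(-F_1(x', y')\big)\Big) = \Phi\Big(-\min_{x' \in X} \max_{y' \in Y} F_1(x', y')\Big).$$
Then,
$$\Phi(-F_1(x, y)) \ge \max_{x' \in X} \min_{y' \in Y} \Phi(-F_1(x', y')) \iff F_1(x, y) \le \min_{x' \in X} \max_{y' \in Y} F_1(x', y').$$
In the same way, we have
$$\Phi(-F_2(x, y)) \ge \max_{y' \in Y} \min_{x' \in X} \Phi(-F_2(x', y')) \iff F_2(x, y) \le \min_{y' \in Y} \max_{x' \in X} F_2(x', y').$$
Next, we prove that a Pareto optimal solution of the bi-criteria maximization problem $\langle S, \Phi(-F_1), \Phi(-F_2) \rangle$ is a Pareto optimal solution of the bi-criteria minimization problem $\langle S', F_1, F_2 \rangle$, and conversely. The statement that, for all $(x, y) \in S$, the vector inequality
$$\big(P(x^T \tilde{A} y \ge \beta_1), P(x^T \tilde{B} y \ge \beta_2)\big) \ge \big(P(x^{*T} \tilde{A} y^* \ge \beta_1), P(x^{*T} \tilde{B} y^* \ge \beta_2)\big),$$
with at least one strict component, is impossible is equivalent to the statement that, for all $(x, y) \in S$, the vector inequality
$$\big(\Phi(-F_1(x, y)), \Phi(-F_2(x, y))\big) \ge \big(\Phi(-F_1(x^*, y^*)), \Phi(-F_2(x^*, y^*))\big),$$
with at least one strict component, is impossible. Then, using the properties of the standardized normal distribution function $\Phi$, it is also equivalent to the statement that, for all $(x, y) \in S'$, the system of inequalities
$$F_1(x, y) \le F_1(x^*, y^*), \quad F_2(x, y) \le F_2(x^*, y^*),$$
with at least one strict inequality, is impossible, which concludes the proof.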

Algorithm
To find Pareto optimal solutions of problem (4.1), we use the scalarization approach by choosing a pair of weights $\lambda_1$ and $\lambda_2$ such that $\lambda_1 + \lambda_2 = 1$. Thus, to compute RZ-equilibria, we just need to solve a deterministic optimization problem. We present the computation of RZ-equilibria in the form of an algorithm as follows.
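The scalarization idea can be sketched for a small case. The code below is not the authors' algorithm: it assumes a 2×2 game with independent normally distributed entries, uses the criteria $F_i = (\beta_i - \mu_i)/\sigma_i$, replaces a proper solver by a grid search over mixed strategies $x = (p, 1-p)$, $y = (q, 1-q)$, and uses hypothetical names and data throughout.

```python
import numpy as np

def rz_equilibrium_grid(mean_a, std_a, mean_b, std_b, beta, weights, steps=101):
    """Grid-search approximation of an RZ-equilibrium of a 2x2 game (sketch)."""
    grid = np.linspace(0.0, 1.0, steps)
    X = np.stack([grid, 1.0 - grid], axis=1)   # row k is the strategy (p_k, 1 - p_k)
    Y = X.copy()

    def criterion(mean, std, level):
        mean, std = np.asarray(mean, float), np.asarray(std, float)
        mu = X @ mean @ Y.T                              # mu[k, l] = x_k^T mean y_l
        sigma = np.sqrt((X**2) @ (std**2) @ (Y.T**2))    # independence assumed
        return (level - mu) / sigma                      # criterion to minimize

    F1 = criterion(mean_a, std_a, beta[0])
    F2 = criterion(mean_b, std_b, beta[1])
    # Punishment thresholds: min over x of max over y for player I,
    # min over y of max over x for player II.
    t1 = F1.max(axis=1).min()
    t2 = F2.max(axis=0).min()
    feasible = (F1 <= t1 + 1e-12) & (F2 <= t2 + 1e-12)
    # Weighted-sum scalarization restricted to the feasible profiles.
    score = np.where(feasible, weights[0] * F1 + weights[1] * F2, np.inf)
    k, l = np.unravel_index(np.argmin(score), score.shape)
    return X[k], Y[l], (float(F1[k, l]), float(F2[k, l]))

# Hypothetical data: both players strongly prefer the first pure profile.
means = np.array([[10.0, 0.0], [0.0, 0.0]])
stds = np.ones((2, 2))
x_star, y_star, values = rz_equilibrium_grid(means, stds, means, stds,
                                             beta=(5.0, 5.0), weights=(0.5, 0.5))
print(x_star, y_star, values)
```

With strictly positive weights, a minimizer of the weighted sum over the feasible profiles is Pareto optimal among them, which is why a single scalar problem is enough to produce one equilibrium candidate on the grid.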

Payoffs following Cauchy distribution
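Theorem 4.3. Under the assumptions of Theorem 3.2, $(x^*, y^*)$ is an RZ-equilibrium of the game (2.1) if and only if it is a Pareto optimal solution of the bi-criteria minimization problem $\langle S', F_1, F_2 \rangle$, where
$$F_i(x, y) = \frac{\beta_i - \mu_i(x, y)}{\sigma_i(x, y)}, \quad i = 1, 2,$$
$$S' = \Big\{ (x, y) \in X \times Y : F_1(x, y) \le \min_{x' \in X} \max_{y' \in Y} F_1(x', y'),\ F_2(x, y) \le \min_{y' \in Y} \max_{x' \in X} F_2(x', y') \Big\},$$
and $\mu_i(x, y)$ and $\sigma_i(x, y)$ are the location and scale parameters of the Cauchy distributed payoffs $x^T \tilde{A} y$ ($i = 1$) and $x^T \tilde{B} y$ ($i = 2$).
Using the Cauchy distribution properties, the proof of this result is analogous to that of Theorem 4.2.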
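For reference, the Cauchy property used here can be written out as follows. This is a sketch in assumed notation: $\mu_1(x, y)$ and $\sigma_1(x, y) > 0$ denote the location and scale parameters of $x^T \tilde{A} y$, and $\beta_1$ the satisfaction level of player I. It only uses the standard Cauchy distribution function.

```latex
\[
P\left(x^{T}\tilde{A}y \ge \beta_1\right)
  = 1 - \left(\frac{1}{2} + \frac{1}{\pi}\arctan\frac{\beta_1 - \mu_1(x,y)}{\sigma_1(x,y)}\right)
  = \frac{1}{2} + \frac{1}{\pi}\arctan\frac{\mu_1(x,y) - \beta_1}{\sigma_1(x,y)} .
\]
```

Since $\arctan$ is continuous and strictly increasing, maximizing this probability amounts to minimizing $(\beta_1 - \mu_1(x, y))/\sigma_1(x, y)$, exactly as $\Phi$ is used in the normal case.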
Example 5.2.In order to show the applicability of the proposed approach in the case where the payoffs follow Cauchy distribution, let us assume that in the game (2.1) the payoffs matrices of player I and player II are, respectively ) )︂ .
Assume that the conditions of Theorem 4.2 are satisfied for the given satisfaction levels  1 and  2 .

Related work
Z-equilibrium in games with uncertain payoffs of the form $f_i(x, \theta)$, where $x$ is a strategy profile and $\theta$ is a parameter with unknown behavior, has been investigated in Larbani and Lebbah [16]. In these games, the payoffs are not completely uncertain; the uncertainty is carried by a parameter only. In the present work, the payoffs do not involve any parameter. In this sense, it is more general. Recently, Achemine et al. [2] have investigated Z-equilibrium in bi-matrix games with uncertain payoffs in the sense of Liu [17]. The present work fundamentally differs from that work as it deals with games with random payoffs (uncertainty of probability type). The difference between probability theory and Liu's uncertainty theory is that the latter is based on a credibility measure that is introduced to measure the credibility of a fuzzy event, while the former deals with random phenomena. Fuzzy set theory mainly deals with subjective uncertainty, i.e., imprecision in human judgment and evaluation of events and phenomena. Therefore, the scopes of application of the two papers differ considerably as they model two different types of uncertainty. Thus, compared to Achemine et al. [2], the present paper is a new theoretical and applied contribution.
Further, the literature on games with random payoffs is mainly concentrated on the investigation of Nash equilibrium, its existence and computation. We compare our work to two prominent works in this area of research: Singh and Lisser [24], which deals with bi-matrix games with random payoffs, and Singh et al. [25], which investigates $n$-person finite games.
The main differences between the present paper and the two papers mentioned above are that (i) the two papers deal with the existence and computation of Nash equilibrium, while ours deals with Z-equilibrium, and (ii) in the two papers the payoffs are defined by fixing a probability (confidence) level, then finding the maximum payoff that can be obtained for this level; formally, for bi-matrix games they use the formulas
$$u_1^{\alpha_1}(x, y) = \sup\big\{u_1 : P(x^T \tilde{A} y \ge u_1) \ge \alpha_1\big\}, \quad u_2^{\alpha_2}(x, y) = \sup\big\{u_2 : P(x^T \tilde{B} y \ge u_2) \ge \alpha_2\big\}, \qquad (6.1)$$
and, for $n$-person finite games, the analogous formula (6.2), where $\alpha_i$ is a given confidence level. In our paper, we use the formula
$$P(x^T \tilde{A} y \ge \beta_1), \quad P(x^T \tilde{B} y \ge \beta_2), \qquad (6.3)$$
where $\beta_i$ is a given satisfaction level in terms of payoff. As (6.1) and (6.2) are based on the same idea, we will compare (6.1) to (6.3) because they deal with the same type of games, bi-matrix games. In (6.1), the payoff used to define Nash equilibrium is the supremum of the set of $u_i$ values such that $P(x^T \tilde{A} y \ge u_i) \ge \alpha_i$. Then an RZ-equilibrium in the sense of Singh and Lisser payoffs can be defined as follows.

Definition 6.1. For predetermined confidence levels $\alpha_1, \alpha_2$, a pair $(x^*, y^*)$ is called an RZ-equilibrium (random Z-equilibrium) of the game (2.1) at $(\alpha_1, \alpha_2)$ levels if it is a Z-equilibrium of the game $\langle X, Y, u_1^{\alpha_1}, u_2^{\alpha_2} \rangle$.

That is, using Singh and Lisser's definition of payoffs, the players must first provide confidence levels (values of the probability parameters $\alpha_i$, $i = 1, 2$); then, if a pair $(x^*, y^*)$ of mixed strategies is a Z-equilibrium, the payoffs $u_i^{\alpha_i}(x^*, y^*)$, $i = 1, 2$, satisfy the confidence levels $\alpha_i$, $i = 1, 2$, that is, $P\big(x^{*T} \tilde{A} y^* \ge u_1^{\alpha_1}(x^*, y^*)\big) \ge \alpha_1$ and $P\big(x^{*T} \tilde{B} y^* \ge u_2^{\alpha_2}(x^*, y^*)\big) \ge \alpha_2$. In our work, we proceed the other way around: we first ask the players to provide their satisfaction levels $\beta_i$, $i = 1, 2$, in terms of payoffs and then define the RZ-equilibrium in terms of probabilities of events where payoffs satisfy those levels. Then, if a pair $(x^*, y^*)$ of mixed strategies is an RZ-equilibrium, the payoffs are expressed in terms of the probabilities $P(x^{*T} \tilde{A} y^* \ge \beta_1)$ and $P(x^{*T} \tilde{B} y^* \ge \beta_2)$. Our approach, which is based on considering probabilities as payoffs in defining RZ-equilibrium, makes theoretical and practical sense. In fact, by asking the players to provide a satisfaction level, we use Simon's "satisficing principle" [23], which means that in non-trivial real-life decision problems, decision-makers look for satisficing alternatives rather than maximizing ones. As uncertain payoff games are highly complex because they involve strategic uncertainty and payoff uncertainty, this principle is highly relevant and appropriate. Further, it also makes sense to express the attainment of the given satisfaction level in terms of probability as the payoffs are of probability uncertainty type. Consequently, as we are in a game context, it makes sense to define Z-equilibrium in terms of the probability of attainment of satisfaction levels.

(2) For $\beta_1 = 5$ and $\beta_2 = 4$, the RZ-equilibrium is $(x^*, y^*) = ((0, 1), (0, 1))$. Clearly, the probabilities decrease as $\beta_i$, $i = 1, 2$, increase. Thus, starting from the maximal levels of $\beta_1$ and $\beta_2$, players can get an RZ-equilibrium with the desired probabilities by increasing or decreasing $\beta_i$, $i = 1, 2$.
Finally, an advantage of the payoffs in Definition 2.6 is that they are simpler as they are expressed by probabilities only, while the payoffs in Definition 6.1 are expressed by probabilities and the "sup" operation.Thus, the existence conditions and computation of RZ-equilibrium would be more difficult via the latter definition than via the former.
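The two ways of defining payoffs can be compared numerically in the simplest case. The sketch below assumes that the scalar payoff of a fixed strategy pair is normally distributed with hypothetical mean `mu` and standard deviation `sigma`; the function names are illustrative. In this case the supremum in the confidence-level payoff has the closed form $\mu + \sigma\,\Phi^{-1}(1 - \alpha)$.

```python
from statistics import NormalDist

def confidence_level_payoff(mu, sigma, alpha):
    """sup{u : P(V >= u) >= alpha} for V ~ N(mu, sigma^2), with 0 < alpha < 1."""
    return mu + sigma * NormalDist().inv_cdf(1.0 - alpha)

def satisfaction_level_payoff(mu, sigma, beta):
    """P(V >= beta) for V ~ N(mu, sigma^2)."""
    return NormalDist().cdf((mu - beta) / sigma)

# Hypothetical payoff distribution of one fixed pair of mixed strategies.
mu, sigma = 2.5, 0.7
u = confidence_level_payoff(mu, sigma, alpha=0.9)    # value reached with probability 0.9
p = satisfaction_level_payoff(mu, sigma, beta=u)     # probability of reaching that value
print(u, p)
```

Feeding the confidence-level payoff back as a satisfaction level recovers the confidence level, which shows that, for a fixed strategy pair, the two payoffs are two readings of the same distribution function; the difference lies in which quantity the players are asked to fix in advance.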
Next, we provide an example with RZ-equilibrium and Nash equilibrium.

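- the payoff $x^T \tilde{A} y$ of the row player follows a Cauchy distribution with location parameter $\mu_1(x, y) = x^T \mu_A y$ and scale parameter $\sigma_1(x, y) = x^T \sigma_A y$;
- the payoff $x^T \tilde{B} y$ of the column player follows a Cauchy distribution with location parameter $\mu_2(x, y) = x^T \mu_B y$ and scale parameter $\sigma_2(x, y) = x^T \sigma_B y$,
where $\mu_A$, $\sigma_A$ (resp. $\mu_B$, $\sigma_B$) denote the matrices of location and scale parameters of the entries of $\tilde{A}$ (resp. $\tilde{B}$).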
is continuous, we deduce that these two functions are continuous on $X \times Y$. Proceeding as in the proof of Theorem 3.1, we deduce the existence of at least one RZ-equilibrium of the game (2.1) at $(\beta_1, \beta_2)$ levels.