OPTIMAL REINSURANCE AND INVESTMENT STRATEGIES FOR AN INSURER UNDER MONOTONE MEAN-VARIANCE CRITERION

This paper considers the optimal investment-reinsurance problem under the monotone mean-variance preference. The monotone mean-variance preference is a monotone version of the classical mean-variance preference. First, we reformulate the original problem as a zero-sum stochastic differential game. Second, the optimal strategy and the optimal value function for the monotone mean-variance problem are derived by dynamic programming and the Hamilton-Jacobi-Bellman-Isaacs equation. Third, the efficient frontier is obtained and it is proved that the optimal strategy is an efficient strategy. Finally, the continuous-time monotone capital asset pricing model is derived.

Mathematics Subject Classification. 49L20, 93E20, 91B30.

Received January 7, 2020. Accepted July 28, 2021.


Introduction
Investment and reinsurance are increasingly crucial issues for insurance companies. For this reason, many mathematical models have been proposed to help derive optimal strategies. With these strategies, insurance companies can make profits and manage their risk exposures. In particular, because of its tractability and intuitiveness, the mean-variance optimization problem has received great attention in recent years. The essence of the mean-variance problem is to minimize the variance of the prospect while keeping the expected prospect fixed; that is, the mean-variance problem is a constrained optimization problem as follows:
\[ \min_{u} \mathrm{Var}_P[f] \quad \text{subject to} \quad E_P[f] = \xi, \]
with $\xi \geq \xi_0$, where $P$ is a given probability measure and $f$ is an uncertain prospect influenced by the strategy $u$. $\xi_0$ is the riskless prospect, that is, the expectation of a prospect whose holder avoids any possible risk.
If the expectation $E_P[f] = \xi$ varies, the corresponding optimal strategy and the minimized variance change accordingly. Hence, the points $(\mathrm{Var}_P[f], E_P[f])$ trace out half of a parabola in the plane, which is called the efficient frontier.
The mean-variance problem can also be written in penalty form through the functional $U_\theta(f) = E_P[f] - \frac{\theta}{2}\mathrm{Var}_P[f]$, where $\theta > 0$ is the index of risk aversion. This functional induces a preference relation $\preceq_{mv}$, that is, $g \preceq_{mv} f$ if and only if $U_\theta(g) \leq U_\theta(f)$. We call $\preceq_{mv}$ the mean-variance preference, because the functional $U_\theta(\cdot)$ coincides with the penalty formulation of the mean-variance problem. Unfortunately, the mean-variance preference has a major drawback: it fails to be monotone. An asset with a strictly higher return may receive a lower score under the mean-variance preference. Hence, investors who follow the mean-variance preference may prefer less to more, which violates one of the most fundamental principles of economic rationality. In particular, the monotonicity of preference is a crucial assumption in financial theory, without which the arbitrage argument cannot be established (see [6] and [21]). In fact, the non-monotonicity can be bypassed only under a very strict assumption on the probability distribution of the prospect, namely that the prospect $f$ is bounded above by $E_P[f] + \frac{1}{\theta}$ almost surely.

In most of the literature on continuous-time portfolio selection, the insurer's wealth process is assumed to be governed by
\[ dX(t) = \left[ r(t)X(t) + u(t)^{T}B(t) + a(\kappa - \kappa_r) \right]dt + u(t)^{T}\sigma(t)\,dW(t), \]
with initial value $X(0) = x_0$, where $u(\cdot)$ is an $\mathbb{R}^{n+1}$-valued adapted strategy process representing the reinsurance and investment strategies. The objective of the insurer is to maximize the expectation of its terminal wealth $X(T)$. It is easy to see that $X(t)$ is an Ornstein-Uhlenbeck process, and the terminal wealth $X(T)$ may not be bounded above by $E_P[X(T)] + \frac{1}{\theta}$. In this case, using the mean-variance preference functional $U_\theta$ as the utility score is rather irrational.
In order to overcome the lack of monotonicity, [18] introduced an amended version of the mean-variance preference named monotone mean-variance preference. It is based on the variational preferences of [17]. The monotone mean-variance preference is the minimal monotone modification of the mean-variance preference. It not only fills the gap of non-monotonicity, but also maintains the basic intuition and tractability of the mean-variance preference.
Specifically, the monotone mean-variance preference $\preceq_{mmv}$ is defined via the following utility score:
\[ V_\theta(f) = \min_{Q} \left\{ E_Q[f] + \frac{1}{2\theta} C(Q\|P) \right\}, \]
where $Q$ ranges over all absolutely continuous probability measures with square integrable density with respect to $P$, and $C(Q\|P)$ is the relative Gini concentration index. Readers interested in more details of the monotone mean-variance preference are referred to [18]. In this paper, we focus on maximizing the insurance company's monotone mean-variance preference utility rather than the classical mean-variance preference utility.
To the best of our knowledge, there are few research results on the optimal monotone mean-variance problem. Trybuła and Zawisza [24] studied a continuous-time portfolio choice problem in which the coefficients of the stock prices are functions of a stochastic process and the discounted terminal wealth is considered. They obtained the optimal portfolio and the value function when the coefficients are specified, under the assumption that $Q$ is a probability measure equivalent to $P$. For a large class of portfolio choice problems, [23] further proved that, when the risky assets are continuous semimartingales, the optimal portfolios and value functions of the classical mean-variance preference and the monotone mean-variance preference coincide. For a general semimartingale model, [25] gave several results on the relationship between the classical mean-variance preference and the monotone mean-variance preference. In this paper, we consider not only financial assets, but also insurance and reinsurance. This is the first time that the monotone mean-variance objective is used for the optimal reinsurance problem. Moreover, in this paper, $Q$ need not be equivalent to $P$; it is only required to be absolutely continuous. We start from a family of absolutely continuous probability measures and prove that the objective function attains its minimum at an equivalent probability measure $Q^*$. The explicit optimal value function and the optimal strategy are both obtained.
This paper is organized as follows. In Section 2, we introduce the wealth process and the monotone mean-variance optimization problem. To simplify our problem, we connect our optimization problem with a two-player zero-sum game, and the optimal strategy of the insurer lies in the Nash equilibrium. In Section 3, we give the classical formulation of the stochastic differential game, using the theory of absolutely continuous probability measures in [11] and [16]. The Hamilton-Jacobi-Bellman-Isaacs (HJBI) equation for this game is also given. Section 4 contains the main results of the paper. In Subsection 4.1, the value function is given explicitly by solving the HJBI equation. The optimal strategy is also given. In Subsection 4.2, we present the efficient frontier of the monotone mean-variance problem, which is proved to coincide with the efficient frontier of the classical mean-variance problem. In Subsection 4.3, when only the financial market is considered, a monotone CAPM based on the monotone mean-variance preference is obtained.

Model setting
In this section, we first introduce our wealth dynamic and then give a monotone mean-variance criterion based on [18].

The wealth process
Let $(\Omega, \mathcal{F}, P)$ be a complete probability space. Let $S(t) = (S_1(t), S_2(t), \ldots, S_n(t))$ denote the prices of $n$ stocks. The price of the $i$-th stock at time $t$ is
\[ dS_i(t) = S_i(t)\,dP_i(t), \]
with initial value $S_i(0) = s_i$. $P_i(t)$ is the return of the $i$-th stock:
\[ dP_i(t) = b_i(t)\,dt + \sum_{j=1}^{d} \sigma_{ij}(t)\,dW_j(t), \]
where $b_i(t)$ and $\sigma_{ij}(t)$ are both deterministic functions mapping from $[0, T]$ to $\mathbb{R}$. $W_i(t)$, $i = 0, 1, \ldots, d$, are i.i.d. $\mathbb{R}$-valued standard Brownian motions under the probability $P$, which describe the risk of the financial market. Suppose that the financial market is arbitrage free, that is, $n \leq d$. If $n < d$, the financial market is incomplete.

To simplify the notation, we write
\[ b_S(t) = (b_1(t), \ldots, b_n(t))^{T} \in \mathbb{R}^{n}, \quad \sigma_S(t) = (\sigma_{ij}(t)) \in \mathbb{R}^{n \times d}, \quad P(t) = (P_1(t), \ldots, P_n(t))^{T} \in \mathbb{R}^{n}, \quad W_S(t) = (W_1(t), \ldots, W_d(t))^{T} \in \mathbb{R}^{d}. \]
Thus the vector form of the return is given by
\[ dP(t) = b_S(t)\,dt + \sigma_S(t)\,dW_S(t). \]
Besides, the investor can also put money in the risk-free bank account. Suppose that the interest rate is a deterministic function $r(t)$, and that the risk-free asset at time $t$ satisfies the ordinary differential equation
\[ dS_0(t) = r(t)S_0(t)\,dt, \]
with initial value $S_0(0) = 1$.
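Assumption 2.1. We assume that $r(\cdot)$, $b_S(\cdot)$ and $\sigma_S(\cdot)$ are all bounded deterministic functions and $\sigma_S(\cdot)$ satisfies the following nondegeneracy condition
\[ \sigma_S(t)\sigma_S(t)^{T} \geq \delta_S I, \quad \forall t \in [0, T], \]
for some constant $\delta_S > 0$ and the identity matrix $I \in \mathbb{R}^{n \times n}$.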
Next, we introduce the insurance risk model. Let $C(t)$ be the claim process of the insurer, which is governed by the drifted Brownian motion
\[ dC(t) = a\,dt - \sigma_0\,dW_0(t), \]
where $a$ and $\sigma_0$ are two constants. $W_0(t)$ is a standard Brownian motion modeling the insurance risk. Suppose that $W_0(t)$ is independent of $W_j(t)$ for any $1 \leq j \leq d$. We note that the above model of the claim process is a diffusion approximation of the classical Cramér-Lundberg model. There has been much work on diffusion approximations, such as [7,9,10]. The parameters in the above model have the following interpretation:
\[ a = \lambda E[U_i], \quad \sigma_0^{2} = \lambda E[U_i^{2}], \]
where $\lambda$ is the intensity of a Poisson point process $N(t)$ and $U_i$ is the size of the $i$-th claim. The $\{U_i\}$ are independent and identically distributed and are also independent of $N(t)$. The insurance premium is paid continuously at the rate $(1 + \kappa)a$, where $\kappa > 0$ is the relative safety loading of the insurer. Therefore, without reinsurance, the surplus process is given by
\[ dR(t) = (1 + \kappa)a\,dt - dC(t) = \kappa a\,dt + \sigma_0\,dW_0(t). \]
Denote $W(t) = (W_0(t), W_1(t), W_2(t), \ldots, W_d(t))^{T}$, and let $\mathbb{F} := \{\mathcal{F}_t \mid t \in [0, T]\}$ be a $P$-completion of the right-continuous filtration $\mathbb{G}$ generated by $W(\cdot)$.

The insurer can invest its wealth in risky assets or put it in a risk-free bank account so as to manage its capital market risk. The insurer can also purchase a proportional reinsurance product or acquire new business so as to manage its insurance risk. Let $X(t)$ denote the wealth process of the insurer. Let $u_i(t)$ be the amount of money invested in the $i$-th risky asset at time $t$, and $X(t) - \sum_{i=1}^{n} u_i(t)$ be the amount of money put in the bank account. We allow $u_i(t)$ to be greater than $X(t)$ or to be negative. If $u_i(t) > X(t)$, the insurer borrows money from the bank. If $u_i(t) < 0$, the insurer short sells the $i$-th risky asset. Let $u_0(t)$ be the retention level of reinsurance at time $t$. We allow $u_0$ to be greater than 1. As in [3], $u_0(t) > 1$ means that the insurer acquires new business. Let $\kappa_r$ be the relative safety loading of the reinsurer.
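Usually $\kappa_r \geq \kappa$; the case $\kappa_r = \kappa$ is called cheap reinsurance.

If the insurer purchases $100(1 - u_0(t))\%$ proportional reinsurance, it pays premium at the rate $(1 + \kappa_r)(1 - u_0(t))a$ to the reinsurer, and the reinsurer covers $100(1 - u_0(t))\%$ of the insurer's claims. Thus, the surplus process of the insurer after purchasing proportional reinsurance is given by
\[ dR(t) = a\left( \kappa - \kappa_r + \kappa_r u_0(t) \right)dt + u_0(t)\sigma_0\,dW_0(t), \]
with initial value $R(0) = x_0$. The insurer's wealth process $X(t)$ is given by
\[ dX(t) = \left[ r(t)X(t) + u(t)^{T}B(t) + a(\kappa - \kappa_r) \right]dt + u(t)^{T}\sigma(t)\,dW(t), \]
where
\[ u(t) = (u_0(t), u_1(t), \ldots, u_n(t))^{T}, \quad B(t) = \begin{pmatrix} a\kappa_r \\ b_S(t) - r(t)\mathbf{1} \end{pmatrix}, \quad \sigma(t) = \begin{pmatrix} \sigma_0 & 0 \\ 0 & \sigma_S(t) \end{pmatrix}. \]
By Assumption 2.1, $r(\cdot)$, $B(\cdot)$ and $\sigma(\cdot)$ are all bounded functions and satisfy
\[ \sigma(t)\sigma(t)^{T} \geq \delta I, \quad \forall t \in [0, T], \]
for some constant $\delta > 0$ and the identity matrix $I \in \mathbb{R}^{(n+1) \times (n+1)}$.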
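The bookkeeping behind the reinsured surplus can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the standard diffusion approximation of a compound Poisson claim process (drift $\lambda E[U_i]$, variance rate $\lambda E[U_i^2]$), and the function names and numerical values are hypothetical, not taken from the paper.

```python
import math


def diffusion_parameters(lam, mean_claim, second_moment_claim):
    """Drift a and volatility sigma_0 of the diffusion-approximated claims.

    Assumes the standard approximation a = lam * E[U], sigma_0^2 = lam * E[U^2].
    """
    a = lam * mean_claim
    sigma0 = math.sqrt(lam * second_moment_claim)
    return a, sigma0


def reinsured_surplus_coefficients(a, sigma0, kappa, kappa_r, u0):
    """Drift and volatility of the surplus when the insurer retains the share u0."""
    premium_in = (1.0 + kappa) * a                    # premium received from policyholders
    premium_out = (1.0 + kappa_r) * (1.0 - u0) * a    # premium paid to the reinsurer
    expected_retained_claims = u0 * a                 # drift of the retained claims
    drift = premium_in - premium_out - expected_retained_claims
    volatility = u0 * sigma0
    return drift, volatility


# Hypothetical numbers: exponential claims with mean 3, so E[U^2] = 2 * 3**2 = 18.
a, sigma0 = diffusion_parameters(lam=2.0, mean_claim=3.0, second_moment_claim=18.0)
drift_full, vol_full = reinsured_surplus_coefficients(a, sigma0, 0.1, 0.2, u0=1.0)
drift_half, vol_half = reinsured_surplus_coefficients(a, sigma0, 0.1, 0.2, u0=0.5)
```

With full retention ($u_0 = 1$) the drift reduces to $\kappa a$, the surplus drift without reinsurance; for general $u_0$ it equals $a(\kappa - \kappa_r + \kappa_r u_0)$, which is the constant term $a(\kappa - \kappa_r)$ of the wealth dynamics plus the reinsurance component $a\kappa_r u_0$.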

Monotone mean-variance objective function
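In this section, we give a concise introduction to the monotone mean-variance preference derived from [18]. The monotone mean-variance preference utility is defined by
\[ V_\theta(f) = \min_{Q \in \Delta^{2}(P)} \left\{ E_Q[f] + \frac{1}{2\theta} C(Q\|P) \right\}, \]
where $\Delta^{2}(P)$ is the set of probability measures that are absolutely continuous with respect to $P$ with square integrable density, and $C(Q\|P)$, defined by
\[ C(Q\|P) = E_P\left[ \left( \frac{dQ}{dP} \right)^{2} \right] - 1, \]
is the relative Gini concentration index (or $\chi^{2}$-distance), which enjoys properties similar to those of the relative entropy (see [15]). Let $G_\theta \subset L^{2}(P)$ be the domain of monotonicity of the classical mean-variance utility $U_\theta$, in other words, the subset of $L^{2}(P)$ where the Gateaux differential of $U_\theta$ is positive. By Lemma 2.1 of [18], $G_\theta$ is given by
\[ G_\theta = \left\{ f \in L^{2}(P) : f - E_P[f] \leq \frac{1}{\theta} \right\}, \]
which implies that $f \in G_\theta$ only if $f$ is almost surely bounded above under $P$ and the deviation $f - E_P[f]$ does not exceed $\frac{1}{\theta}$.

In this paper, the terminal wealth $X^{u}(T)$ of the insurer takes the place of the prospect $f$. Since $X^{u}(T)$ may be unbounded for some admissible strategies, the widely used classical mean-variance utility fails to be monotone. Specifically, there may exist two strategies $u$ and $v$ such that $X^{u}(T) \geq X^{v}(T)$ almost surely but $U_\theta(X^{u}(T)) < U_\theta(X^{v}(T))$. Therefore, the monotone mean-variance utility is a more rational objective function for the insurer. The aim of the insurer is to find the optimal strategy $u^{*}(\cdot)$ that maximizes $V_\theta(X^{u}(T))$.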
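For a prospect taking finitely many values, the minimization over $Q$ can be carried out directly, which makes the difference between $U_\theta$ and $V_\theta$ tangible. The following is a minimal sketch, not the paper's method: it assumes the definition $V_\theta(f) = \min_Q \{E_Q[f] + \frac{1}{2\theta}C(Q\|P)\}$ with $C(Q\|P) = E_P[(dQ/dP)^2] - 1$, and uses the first-order condition of this convex problem, under which the minimizing density has the form $\theta(\mu - f)^{+}$ with $\mu$ chosen so that the density has mean one. All function names are hypothetical.

```python
def mean_variance_utility(values, probs, theta):
    """Classical score U_theta(f) = E[f] - (theta / 2) * Var[f]."""
    mean = sum(p * v for p, v in zip(probs, values))
    var = sum(p * (v - mean) ** 2 for p, v in zip(probs, values))
    return mean - 0.5 * theta * var


def monotone_mean_variance_utility(values, probs, theta):
    """V_theta(f) for a finite prospect, via the truncated-linear density.

    The minimizing density is assumed to be theta * max(mu - f, 0), where mu
    solves E[max(mu - f, 0)] = 1 / theta; mu is located by bisection.
    """
    def shortfall(mu):
        return sum(p * max(mu - v, 0.0) for p, v in zip(probs, values))

    lo = min(values)                 # shortfall(lo) = 0 < 1 / theta
    hi = max(values) + 1.0 / theta   # shortfall(hi) >= 1 / theta
    for _ in range(200):             # fixed number of bisection steps
        mid = 0.5 * (lo + hi)
        if shortfall(mid) < 1.0 / theta:
            lo = mid
        else:
            hi = mid
    mu = 0.5 * (lo + hi)

    density = [theta * max(mu - v, 0.0) for v in values]
    expectation_under_q = sum(p * y * v for p, y, v in zip(probs, density, values))
    gini_index = sum(p * y * y for p, y in zip(probs, density)) - 1.0
    return expectation_under_q + gini_index / (2.0 * theta)


# A prospect g = 0 and a prospect f >= g that pays 10 with probability 1/2.
probs = [0.5, 0.5]
u_g = mean_variance_utility([0.0, 0.0], probs, theta=1.0)
u_f = mean_variance_utility([0.0, 10.0], probs, theta=1.0)
v_g = monotone_mean_variance_utility([0.0, 0.0], probs, theta=1.0)
v_f = monotone_mean_variance_utility([0.0, 10.0], probs, theta=1.0)
```

In this toy case $f \geq g$, yet the classical score ranks $f$ below $g$ ($-7.5$ versus $0$), while the monotone score ranks $f$ above $g$ ($0.5$ versus $0$). For a prospect inside $G_\theta$, such as one taking the values 0 and 1 with equal probability and $\theta = 1$, the two scores agree.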
Example 2.2. Consider a company whose risk aversion is $\theta = 2$. Let $\eta \sim N(10, 1)$ and $\varepsilon \sim U(0, 12)$ be two independent random variables, where $\eta$ is the profit from the operation of the company and $\varepsilon$ is an arbitrage opportunity in the financial market. Then the company's total wealth is $X^{u} = \eta + u\varepsilon$, where $u$ is the portfolio. For the cases $u = 0$ and $u = 1$, we have
\[ U_\theta(X^{0}) = E_P[X^{0}] - \tfrac{\theta}{2}\mathrm{Var}_P[X^{0}] = 10 - 1 = 9, \qquad U_\theta(X^{1}) = E_P[X^{1}] - \tfrac{\theta}{2}\mathrm{Var}_P[X^{1}] = 16 - 13 = 3. \]
Obviously, $X^{1}$ is always greater than $X^{0}$, but the company would choose the portfolio $u = 0$ because $U_\theta(X^{1})$ is less than $U_\theta(X^{0})$.

Let us consider a companion two-player zero-sum game as follows: Player one wants to maximize $J_\theta(u(\cdot), Q)$ with its strategy $u(\cdot)$ over $\mathcal{U}[0, T]$, and player two wants to maximize $-J_\theta(u(\cdot), Q)$ with its strategy $Q$ over $\Delta^{2}(P)$.
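In this case, We also set

Definition 2.5. If there exists an $\mathbb{F}$-adapted process $u^{*}(\cdot) \in \mathcal{U}[0, T]$ and a probability measure $Q^{*} \in \Delta^{2}(P)$ such that
\[ J_\theta(u(\cdot), Q^{*}) \leq J_\theta(u^{*}(\cdot), Q^{*}) \leq J_\theta(u^{*}(\cdot), Q), \quad \forall u(\cdot) \in \mathcal{U}[0, T], \; Q \in \Delta^{2}(P), \]
then we call the pair $(u^{*}(\cdot), Q^{*})$ a Nash equilibrium (non-cooperative equilibrium) for Problem (G).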
Lemma 2.1 shows that the optimal strategy lies in the Nash equilibrium of the companion two-player zero-sum game. We only need to solve Problem (G) for the solutions of Problem (MMV θ ).
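The scores in Example 2.2 follow from the first two moments alone, so they can be verified without simulation. The snippet below is a small sketch of that check; it assumes the classical score $U_\theta(f) = E_P[f] - \frac{\theta}{2}\mathrm{Var}_P[f]$ and uses hypothetical helper names.

```python
def mean_variance_score(mean, variance, theta):
    """Classical score U_theta = mean - (theta / 2) * variance."""
    return mean - 0.5 * theta * variance


theta = 2.0
eta_mean, eta_var = 10.0, 1.0             # eta ~ N(10, 1)
eps_mean, eps_var = 6.0, 12.0 ** 2 / 12   # eps ~ U(0, 12): mean 6, variance 12


def score_of_portfolio(u):
    """Score of X^u = eta + u * eps, using independence of eta and eps."""
    mean = eta_mean + u * eps_mean
    variance = eta_var + u ** 2 * eps_var
    return mean_variance_score(mean, variance, theta)


score_without_arbitrage = score_of_portfolio(0.0)   # 10 - 1 = 9
score_with_arbitrage = score_of_portfolio(1.0)      # 16 - 13 = 3
```

Although $\varepsilon \geq 0$ means that $X^{1} \geq X^{0}$ on every outcome, the variance penalty on the added term outweighs its mean, which is exactly the failure of monotonicity the example is meant to illustrate.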

The structure of Y and q
In this section, we characterize $Q$ by a nonnegative square integrable martingale with mean one. Choosing $Q$ is then equivalent to choosing a control variable $q$. We first let for any bounded and $\mathcal{F}_t$-measurable function $X(t)$. Conversely, if $\{Y(t) : t \in [0, T]\}$ is a nonnegative $\mathbb{F}$-adapted square integrable martingale under $P$ with $E_P Y(t) = 1$, then, for any $t \in [0, T]$, the probability measure $Q_t$ defined via (3.2) belongs to $\Delta^{2}(P)$, and $\{Q_t, t \in [0, T]\}$ satisfies the following consistency condition:

Proof. Suppose that $Q \in \Delta^{2}(P)$. It is obvious that $Y(t)$ defined by (3.1) is a nonnegative martingale. We note that where the above inequality is due to Jensen's inequality. Thus $Y(t)$ is square integrable. By the definition of the Radon-Nikodym derivative, (3.2) is proved.

Conversely, let $Y(t)$ be a nonnegative square integrable martingale with unit expectation. It is proved in Claim 6.1 of [8] that $Q$ defined via (3.2) is an absolutely continuous probability measure. By the square integrability of $Y(t)$, $Q \in \Delta^{2}(P)$. The consistency condition follows from the martingale property of $Y(t)$.
Let Y 2 (P ) be the set of all F-adapted nonnegative continuous square integrable martingales under P with E P Y (t) = 1. Thus, Q ∈ ∆ 2 (P ) if and only if Y (t) ∈ Y 2 (P ). The monotone mean-variance objective (2.4) can be formulated as An equivalent problem is to maximize The player one wants to maximize J θ (u(·), Y (·)) with its strategy u(·) over U[0, T ] and the player two wants to maximize −J θ (u(·), Y (·)) with his strategy Y (·) over Y 2 (P ).
Proof. The proof is straightforward from Lemma 3.1.
Next, we write $Y(\cdot) \in \mathcal{Y}^{2}(P)$ in the form of a stochastic exponential (see [11,16] and [8]). By the Brownian martingale representation theorem, for any $Y(\cdot) \in \mathcal{Y}^{2}(P)$, there exists an $\mathbb{R}^{d+1}$-valued $\mathbb{F}$-adapted process $H(t)$ such that Because $Y$ may hit zero in finite time, we denote $\zeta = \lim_{n \to +\infty} \zeta_n$, where then $Y(\zeta) = 0$ on $\{\zeta \leq T\}$. Let $q(t)$ be an $\mathbb{R}^{d+1}$-valued process satisfying According to the definition, $\int_0^{T} q(t)^{T}q(t)\,dt = \infty$ may hold on a set of positive measure, so $q(\cdot)$ cannot be the integrand of a stochastic integral. But we can define a generalization of the stochastic integral as follows (for details, see Subsection 4.2.9 in [16]): By Lemma 6.2 in [16], $Y(\cdot)$ admits the representation By Lemma 6.3 in [16] and the proof of Lemma 6.2 of [16], $Y(\cdot)$ is the unique nonnegative continuous solution to the stochastic differential equation:

Conversely, for any $\mathbb{R}^{n+1}$-valued $\mathbb{F}$-adapted process $q(\cdot)$ satisfying $\int_0^{T} q(s)^{T}q(s)\,ds \leq n^{2}$, and $\tau = \lim_n \tau_n$. By Lemma 6.3 in [16], SDE (3.4) has a unique nonnegative continuous solution if and only if the following two conditions are satisfied If $Y(\cdot)$ is the nonnegative continuous solution to (3.4), then it is a nonnegative local martingale, and by Fatou's lemma, where $\sigma_n$ is a sequence of $\mathbb{F}$-stopping times reducing $Y(\cdot)$. Hence, $Y$ is a supermartingale but may not be a martingale. In this case, the measure $Q$ defined via (3.2) may not be a probability measure. To make $Q$ a probability measure, we have to restrict $q(\cdot)$ to the following smaller set $\mathcal{Q}[0, T]$, so that $Y(\cdot)$ is a martingale:

Proof. The proof is straightforward from the above arguments.
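The link between the control $q$ and the density process $Y$ is easiest to see when $q$ is a constant vector. The sketch below assumes that SDE (3.4) has the usual stochastic-exponential form $dY(t) = Y(t)q(t)^{T}dW(t)$ with $Y(0) = 1$ (an assumption, since the display is not reproduced here), so that $Y(T) = \exp(q^{T}W(T) - \frac{1}{2}|q|^{2}T)$. In that case $E_P[Y(T)] = 1$ and the relative Gini concentration index equals $E_P[Y(T)^{2}] - 1 = e^{|q|^{2}T} - 1$. The names, the seed and the parameter values are hypothetical.

```python
import math
import random


def sample_terminal_density(q, horizon, rng):
    """One draw of Y(T) = exp(q.W(T) - |q|^2 T / 2) for a constant vector q."""
    q_norm_sq = sum(component * component for component in q)
    # q.W(T) is a centered Gaussian with variance |q|^2 * T.
    gaussian = rng.gauss(0.0, 1.0)
    exponent = math.sqrt(q_norm_sq * horizon) * gaussian - 0.5 * q_norm_sq * horizon
    return math.exp(exponent)


rng = random.Random(12345)   # fixed seed so the run is reproducible
q = (0.3, 0.0)               # hypothetical constant control
horizon = 1.0
draws = [sample_terminal_density(q, horizon, rng) for _ in range(20000)]

mean_density = sum(draws) / len(draws)                      # should be close to 1
gini_monte_carlo = sum(y * y for y in draws) / len(draws) - 1.0
gini_closed_form = math.exp(sum(c * c for c in q) * horizon) - 1.0
```

A bounded constant $q$ therefore always yields a true martingale with a finite penalty term; the restriction to $\mathcal{Q}[0, T]$ matters for controls under which the stochastic exponential is only a local martingale.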

Stochastic differential game
In this section, we consider the following classical formulation of the stochastic differential game (Problem (P sxy )). This type of stochastic differential game has been studied in [26] and [20]. Through the dynamic programming principle, a nonlinear partial differential equation, the Hamilton-Jacobi-Bellman-Isaacs equation (see Theorem 2.5.2 in [26] and Theorem 3.2 in [20]), is used to solve this stochastic differential game.
We consider the following family of stochastic differential games with different values of initial times and states:

where $E^{P}_{s,x,y}[\cdot]$ represents $E_P[\,\cdot \mid X^{u}(s) = x, Y^{q}(s) = y]$. Player one wants to maximize $J^{u,q}(s, x, y)$ with its strategy $u(\cdot)$ over $\mathcal{U}[s, T]$ defined below, and player two wants to maximize $-J^{u,q}(s, x, y)$ with its strategy $q(\cdot, \cdot)$ over $\mathcal{Q}[s, T]$ defined below.
The state processes are given by with initial values $X(s) = x$, $Y(s) = y$. By Theorem 2.5 and Theorem 2.9 of Chapter 5 in [12], SDE (3.7) has a unique strong solution for any $u(\cdot) \in \mathcal{U}[s, T]$. If $(u(\cdot), q(\cdot))$ is a candidate Nash equilibrium of Problem (P sxy ), it is important to verify that the solution of SDE (3.8) is a square integrable martingale.
Proof. The proof is straightforward from corollary 3.4.
We only consider the case where $u(\cdot)$ and $q(\cdot)$ are both Markov feedback controls, that is, $u(t) = u(t, X(t), Y(t))$ and $q(t) = q(t, X(t), Y(t))$. The infinitesimal generator of (3.7) and (3.8) is given by The following verification theorem is an analogue of Theorem 2.5.2 in [26] or Theorem 3.2 of [20].
By Lemma 2.6, Theorem 3.8 (1)-(3) are equivalent to the following equalities which gives us the following compact form of HJBI equation:

Main results
This section consists of all the main results of this paper. In Subsection 4.1, the explicit optimal strategy and value function are given. In Subsection 4.2, the efficient frontier of monotone mean-variance problem is given. In Subsection 4.3, the monotone CAPM is presented in the absence of insurance.

Efficient frontier
Although the objective function used in this paper is not the classical bi-objective mean-variance preference, we can still obtain a set of means and variances of the terminal wealth process when letting the risk aversion θ vary from 0 to ∞. We name this set the efficient frontier. Denote by O := {u θ ∈ U[0, T ] : u θ is the optimal strategy for Problem (MMV θ ), θ ∈ [0, +∞)}.
Definition 4.6. The following set is called the efficient frontier of the optimal monotone mean-variance problem: $\{(\mathrm{Var}_P X^{u}(T), E_P X^{u}(T)) : u \in \mathcal{O}\}$.
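Let $u(\cdot) = 0$ and solve the SDE (3.7); then we find the riskless wealth of the insurer, which is given by
\[ X^{0}(t) = x_0 e^{\int_0^{t} r(v)\,dv} + a(\kappa - \kappa_r)\int_0^{t} e^{\int_s^{t} r(v)\,dv}\,ds. \]
We set $X^{0} = X^{0}(T)$.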
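With $u(\cdot) = 0$ the wealth dynamics lose their diffusion term and reduce to the linear ordinary differential equation $dX(t) = [r(t)X(t) + a(\kappa - \kappa_r)]dt$. The sketch below compares an Euler discretization of this equation with its closed-form solution, under the simplifying assumption (made here only for illustration) that the interest rate $r$ is a nonzero constant. The function names and numbers are hypothetical.

```python
import math


def riskless_wealth_closed_form(x0, r, a, kappa, kappa_r, horizon):
    """Solution of dX = [r X + a (kappa - kappa_r)] dt for a constant r != 0."""
    growth = math.exp(r * horizon)
    return x0 * growth + a * (kappa - kappa_r) * (growth - 1.0) / r


def riskless_wealth_euler(x0, r, a, kappa, kappa_r, horizon, steps):
    """Explicit Euler scheme for the same equation (strategy u = 0)."""
    dt = horizon / steps
    wealth = x0
    for _ in range(steps):
        wealth += (r * wealth + a * (kappa - kappa_r)) * dt
    return wealth


# Hypothetical parameters; kappa_r > kappa makes the constant drift term negative.
exact = riskless_wealth_closed_form(10.0, 0.05, 6.0, 0.1, 0.2, horizon=1.0)
approx = riskless_wealth_euler(10.0, 0.05, 6.0, 0.1, 0.2, horizon=1.0, steps=10000)
```

When $\kappa_r > \kappa$, ceding all insurance risk costs the insurer $a(\kappa_r - \kappa)$ per unit time, so the riskless wealth can fall below $x_0 e^{rT}$; under cheap reinsurance ($\kappa_r = \kappa$) it equals $x_0 e^{rT}$.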
Theorem 4.7 (Efficient frontier). If $E_P X(T) > X^{0}$, the efficient strategy $u$ is given by (4.2); if $E_P X(T) \leq X^{0}$, the efficient strategy is given by $u(t) = 0$, $\forall t \in [0, T]$. The efficient frontier of the monotone mean-variance problem is given by

Remark 4.8. The efficient frontier in Theorem 4.7 is actually the same as the efficient frontier of the classical mean-variance problem given by Theorem 8 in [4].
We first give some lemmas before proving this theorem.
Lemma 4.9. Define $Q^{*}$ by the arguments in Lemma 3.1 where $Y$ is replaced by $Y^{*}$. Let $u^{*}(\cdot)$ be the optimal strategy. Then, we have

Lemma 4.10. The variance of the wealth process under the optimal strategy $u^{*}(\cdot)$ is given by where $Q^{*}$ is defined in Lemma 4.9 and

Proof. By Theorem 4.4 and (4.17), we have
\[ \mathrm{Var}_P X^{*}(t) = \frac{1}{\theta^{2}} e^{2\int_t^{T} (\rho(s) - r(s))\,ds}\, \mathrm{Var}_P Y^{*}(t). \]
Hence Var P X * (t) = 1 which proves (4.18).
Proof of Theorem 4.7. If $E_P X(T) \leq X^{0}$, taking expectation of (3.7) and comparing it with (4.15), we have the equivalent inequality as follows We claim that $(u^{\theta}(\cdot), q^{\theta}(\cdot)) = (0, 0)$ is the Nash equilibrium of Problem (P 0x01 ) under constraint (4.20). Indeed, we have and thus where $X^{0}(T)$ is a deterministic function given by (4.15). For any admissible $q(\cdot) \in \mathcal{Q}[0, T]$, we have For any admissible $u(\cdot) \in \mathcal{U}[0, T]$ satisfying the constraint (4.20), we have Therefore, by Lemma 2.6, when $E_P X(T) \leq X^{0}$, the efficient strategy is given by $u^{\theta} = 0$ and the optimal probability measure is given by $Q^{e} \equiv P$. Since $X^{0}(T)$ is deterministic, $\mathrm{Var}_P X^{0}(T) = 0$ and $E_P X^{0}(T) = X^{0}(T) = X^{0}$. Thus, the efficient frontier is the point $(0, X^{0})$.

If $E_P X(T) > X^{0}$, then the efficient strategy $u^{\theta}$ is given by the optimal strategy $u^{*}$ of (4.2). In fact, $u^{*}$ satisfies $u^{*}(t) > 0$ for all $t \in [0, T]$ and

The insurance surplus process can be written as
\[ dR(t) = \kappa a\,dt + \sigma_0\,dW_0(t) = \kappa a\,dt + \sigma_0\left( d\widetilde{W}_0(t) - \frac{a\kappa_r}{\sigma_0}\,dt \right) = a(\kappa - \kappa_r)\,dt + \sigma_0\,d\widetilde{W}_0(t). \]
Thus, when $\kappa = \kappa_r$ it is a martingale under $Q^{*}$. The return process of the stocks can be written as
\[ dP(t) = r(t)\mathbf{1}\,dt + \sigma_S(t)\,d\widetilde{W}_S(t). \]
Noting that r(t) is the riskless interest rate, S i (t)/S 0 (t) is a martingale under Q * .
Remark 4.16. If a = σ 0 = 0 (no insurance in the model), Q * is an equivalent martingale measure.
Consider the following equality It coincides with the first-order conditions (B.16) in [18]. Using the term from [18], Y * (t) is called the equilibrium pricing kernel.
The following proposition is an analogue of proposition 5.1 in [18].
Proposition 4.17. If $u^{\theta}$ is the optimal strategy for Problem (MMV θ ), then $\frac{\theta}{\zeta}u^{\theta}$ is the optimal strategy for Problem (MMV ζ ); in other words, $u^{\zeta} = \frac{\theta}{\zeta}u^{\theta}$.
To distinguish the investment and reinsurance components of the optimal strategy (4.2), we denote by which implies that $u^{m}_{S}$ is a market portfolio in the financial market, that is, the portfolio holder does not invest any of its wealth in the risk-free asset. In what follows, we assume that only the financial market is considered ($a = \sigma_0 = 0$). We denote by $X^{m}$ the wealth process of the market portfolio holder, that is,

Proof. Note that