INVERSE DATA ENVELOPMENT ANALYSIS WITH STOCHASTIC DATA

Inverse Data Envelopment Analysis (InvDEA) is a significant topic in the DEA area. Uncertain data, which arise in many real-life applications, can degrade the efficiency results. The current work addresses InvDEA in the presence of stochastic data. The input/output-estimation problem is investigated when some or all of the outputs/inputs of a unit increase while its efficiency score is maintained. A novel optimality concept for multiple-objective programming problems, stochastic (weak) Pareto optimality in the level of significance α ∈ [0, 1], is introduced to derive necessary and sufficient conditions for input/output estimation. Furthermore, the performance of the developed theory is verified in a banking sector application.

Mathematics Subject Classification. 90C05, 90C29, 90C39, 90C90, 90B50, 47N10.

Received March 10, 2021. Accepted August 27, 2021.


Introduction
Data envelopment analysis (DEA) is a nonparametric efficiency measurement method that has been employed to determine the efficiency scores of a class of homogeneous decision-making units (DMUs) with several inputs and outputs [7,13]. DEA was first proposed by Charnes et al. [4], building on the work of Farrell [16]. It has been applied extensively in many areas, including water, energy, power, banking, agriculture, mining, finance, transportation, communication, sport, and education [33]. Although DEA is one of the most powerful tools for efficiency evaluation, some researchers have debated the following three features of the technique [3,34,35]: (i) DEA is a nonparametric technique, so the usual conclusions drawn from a parametric function are not available. In other words, it is almost impossible to formulate a parametric production-related function, such as a cost or profit function, or to evaluate marginal products, marginal costs, and partial elasticities. (ii) Conventional DEA is based on linear programming (LP). Important assumptions of these models are the explicit definition of the constraints and the positivity of the Lagrangian coefficients. Regression analysis, in contrast, is a linear model based on a defined probability space, with estimated parameters and without any further constraints on the multipliers. (iii) DEA leaves no room for random noise. In other words, any observed deviation from the frontier is called inefficiency in traditional DEA. The third feature is the main motivation for our study.
Inverse DEA (InvDEA) is a significant problem in the DEA area; its idea was initially introduced by Zhang and Cui [47]. In their work, the input increments of a DMU are determined for given output increments under constraints that keep the CCR efficiency fixed. In InvDEA, an efficiency level is chosen as a strategic target, and the primary purpose is to estimate the input and output levels needed to attain it. In contrast, in conventional DEA models, the DMU's efficiency score is determined under given input and output levels. According to Ghobadi and Jahangiri [22], multiple-objective programming (MOP) tools have been utilized to estimate inputs (outputs) when outputs (inputs) grow and the efficiency score is maintained [23,45]. Previous studies have examined different theoretical and practical perspectives of InvDEA, such as merging units [17,46], restructuring units [1], and preserving efficiency [42].
Data uncertainty is a significant problem in applying traditional DEA. Conventional DEA models assess the DMUs with crisp data, but the evaluation of DMUs under imprecise or vague data by an analytical DEA structure called uncertain DEA (UDEA) has recently been considered. The data reliability assumed in conventional DEA models cannot be attained in some practical situations. Indeed, a variety of real-life applications encounter uncertain data, which can affect the efficiency results. Different strategies have been presented in the literature to deal with such inaccurate and ambiguous data. According to Peykani et al. [38], the uncertainty-theory-based approaches in UDEA can be categorized into the following five classes: (i) the fuzzy DEA (FDEA) approach; (ii) the bootstrap DEA approach; (iii) the imprecise DEA approach; (iv) the robust DEA (RDEA) approach; and (v) the stochastic DEA (SDEA) approach. FDEA is employed when the input/output data are vague, that is, when they contain fuzzy numbers or linguistic variables [37]. The FDEA literature is divided into six categories: the tolerance method, the fuzzy ranking method, the α-level based method, the possibility method, the fuzzy arithmetic method, and the fuzzy random/type-2 fuzzy set method [12]. The bootstrap DEA strategy is utilized in cases where the number of DMUs is not adequate and the estimated efficient frontier differs from the actual one [41]. The imprecise DEA approach is utilized when the inputs/outputs include inaccurate data in the form of interval data, bounded ratios, and weak and strong ordinal data [48]. As a novel strategy in the UDEA literature, RDEA is based on the robust optimization (RO) method, a common approach for dealing with discrete and continuous optimization problems. This strategy does not require substantial historical data or a probability distribution function, and it ensures the feasibility of the solution for all admissible values of the uncertain variables in the considered convex uncertainty set [38].
SDEA is the last UDEA method utilized for the performance evaluation of DMUs under uncertain data. In various realistic situations, managers deal with units whose data are inaccurate, and analysts may treat such data as stochastic parameters. Working with stochastic variables, which allows for unpredicted events, can reveal additional features of the data. The primary benefit of working with stochastic data in DEA is the ability to predict upcoming efficiencies. Different theoretical and practical perspectives of SDEA models have been investigated by various researchers, such as Huang and Li [25], Jahanshahloo et al. [26], Li [31] and Sengupta [39]. It should be noted that most of the constructed models are based on nonlinear programming. In a different line of work, Behzadi and Mirbolouki [2] considered a symmetric error structure for inputs and outputs and presented an envelopment-form stochastic input-oriented DEA model under the constant returns to scale (CRS) assumption on the production technology. The main advantage of this model is its linearity. It can calculate the relative efficiency of units at a specific level of significance and identify efficient DMUs.
A conventional InvDEA model estimates input and output levels that are deterministic, so it does not take the random errors of data in the production process into account. Due to the inherent complexity and competition of the real world, InvDEA problems often involve random or ordinal variables. However, previous InvDEA studies have been undertaken in a deterministic environment and cannot handle such situations. Therefore, it is necessary to incorporate stochastic data into InvDEA.
Stochastic InvDEA estimates inputs and outputs with respect to production frontiers that incorporate both inefficiency and stochastic error, which moves the frontiers closer to the bulk of the producing units. The estimated inputs and outputs of the DMUs are thereby improved compared with those of the deterministic InvDEA models. The main contribution of this paper is to provide a theoretical and practical framework for InvDEA with random data.
This study presents a theoretical and practical framework of InvDEA under random data and focuses on resource allocation management, investment analysis problems, and the implications of new performance strategies. In other words, InvDEA models are proposed to manage resource allocation and investment analysis issues. The proposed models estimate the input and output changes that keep the efficiency scores of the DMU under evaluation and of the other DMUs unchanged. InvDEA models are constructed here for the first time to deal with stochastic data. Managers can employ these models to distinguish efficient DMUs from inefficient ones and to design useful approaches to improve the efficiency score. These approaches can be considered as new approaches in resource allocation management and investment analysis. It is worth noting that these approaches do not change the efficiency scores. Moreover, a novel optimality principle is presented for stochastic MOP (SMOP) problems, called stochastic (weak) Pareto optimality in the level of significance α ∈ [0, 1]. Solutions that are optimal in this sense can be utilized to solve the InvDEA problems considered. A banking sector application is provided to demonstrate the credibility and capability of the presented approach in a practical situation. The main contributions of the current work are summarized as follows:
- For the first time, an InvDEA model is constructed under random data.
- For the first time, the inverse input-oriented SDEA model is constructed and employed for resource allocation problems.
- Similarly, an inverse output-oriented SDEA model is constructed to deal with investment analysis problems.
- A novel optimality principle is presented for SMOP, named stochastic (weak) Pareto optimality in the level of significance α ∈ [0, 1].
- Stochastic (weak) Pareto optimality in the level of significance α ∈ [0, 1] is utilized in InvDEA.
The article's remainder is organized as follows: Section 2 introduces the literature review on SDEA. Section 3 presents some principles of SDEA. The fundamental results of this work are stated in Section 4. The mentioned section offers new inverse models. A banking sector example is provided in Section 5 to demonstrate the performance of the developed theory. Finally, a short conclusion and future work aspects are presented in Section 6.

Literature review on SDEA
In the conventional DEA models, deterministic data are considered. However, in practical applications, we may face some stochastic data, including costs, prices, and production, depending on the financial crisis, social and political features, inflation, and natural components. In other words, due to the intrinsic uncertainties in many practical applications, stochastic estimations are inevitable in the efficiency evaluation procedure. Therefore, the expansion of DEA models can be considered a necessity. SDEA contains efficiency analysis strategies based on production economic principles like convexity or ray unboundedness of the included stochastic inputs and outputs. Moreover, they employ statistical principles or distributional considerations to estimate stochastic inefficiency rather than a deterministic or a stochastic production possibility set (PPS).
According to Olesen and Petersen [36], SDEA approaches can be classified into the following two categories: (i) the semi-parametric stochastic frontier analysis (SFDEA) approach; and (ii) the chance-constrained DEA (CCDEA) approach. Stochastic frontier analysis is a regression analysis-based method in which econometric models such as Cobb-Douglas production functions are utilized to predict the production frontier [30]. The CCDEA approach is usually utilized for stochastic inputs/outputs when the probability distribution function of the uncertain parameters is given [5]. To deal with stochastic data in DEA, Thore [43] developed DEA-based chance-constrained programming. Cooper et al. [6] explicitly modeled a stochastic PPS to propose an SDEA method. Cooper et al. [5] generalized DEA models to inputs and outputs described by normal random variables and introduced the concept of a stochastic efficient DMU. Li [31] and Huang and Li [25] introduced the efficiency dominance of a DMU using probabilistic comparisons of its inputs and outputs with those of other DMUs, evaluated by solving a chance-constrained programming problem. Subsequently, DEA models with stochastic data have been constructed by various researchers, such as Khodabakhshi [29], Behzadi et al. [3], Lotfi et al. [34] and Hosseinzadeh Lotfi et al. [24].
Behzadi and Mirbolouki [2] introduced the symmetric error structure and proposed a linear form of the stochastic CCR model. In the current work, the inverse version of the SDEA model of Behzadi and Mirbolouki [2] is considered in order to derive necessary and sufficient conditions for input and output estimation while preserving the efficiency score.
CCDEA and semi-parametric SFDEA seem to be competing methods within the SDEA field. Proponents of SFDEA may claim that this method is superior to CCDEA because of its consistency with essential principles of production theory and its strong statistical basis, including a basic data production procedure. In contrast, proponents of CCDEA may point to the advantage that DMU-specific distributions of noise and inefficiency can be easily incorporated and that multiple inputs/outputs can be handled explicitly. Moreover, a recent study by Simar [40] provides a data production procedure for a multivariate, cross-sectional, nonparametric stochastic frontier model employed in the CCDEA context. According to Olesen and Petersen [36], although the stochastic PPS estimators in the context of SFDEA and CCDEA have different features in the modeling of noise and stochastic inefficiency, all of them consider one output and multiple inputs.

Stochastic DEA
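Consider a set of $n$ DMUs, $\{DMU_j : j \in J = \{1, 2, \ldots, n\}\}$, such that $DMU_j$ produces the output level $\tilde{Y}_j = (\tilde{y}_{1j}, \tilde{y}_{2j}, \ldots, \tilde{y}_{sj})^t$ by consuming the input level $\tilde{X}_j = (\tilde{x}_{1j}, \tilde{x}_{2j}, \ldots, \tilde{x}_{mj})^t$. Let the input and output levels of $DMU_j$, $j \in J$, be normal-stochastic numbers as follows:
$$\tilde{x}_{ij} \sim N(x_{ij}, \sigma_{ij}^2), \quad i \in I, \qquad \tilde{y}_{rj} \sim N(y_{rj}, \psi_{rj}^2), \quad r \in O,$$
where $x_{ij}$ and $\sigma_{ij}^2$, for all $i \in I$ and $j \in J$, are the mean and variance of the inputs, respectively, and $y_{rj}$ and $\psi_{rj}^2$, for all $r \in O$ and $j \in J$, are the mean and variance of the outputs, respectively. It is worth noting that the mean and variance of the data are rarely known, so these parameters have to be estimated. A suitable estimator should be unbiased and efficient. According to the stochastic DEA literature, if $\{\hat{x}_{ij}^t, \hat{y}_{rj}^t \mid i \in I, r \in O, t = 1, 2, \ldots, p\}$ is a random sample of size $p$ for $DMU_j$, $j \in J$, then
$$\bar{x}_{ij} = \frac{1}{p} \sum_{t=1}^{p} \hat{x}_{ij}^t, \qquad \bar{\sigma}_{ij}^2 = \frac{1}{p-1} \sum_{t=1}^{p} \left(\hat{x}_{ij}^t - \bar{x}_{ij}\right)^2$$
are unbiased estimators for $x_{ij}$ and $\sigma_{ij}^2$, and
$$\bar{y}_{rj} = \frac{1}{p} \sum_{t=1}^{p} \hat{y}_{rj}^t, \qquad \bar{\psi}_{rj}^2 = \frac{1}{p-1} \sum_{t=1}^{p} \left(\hat{y}_{rj}^t - \bar{y}_{rj}\right)^2$$
are unbiased estimators for $y_{rj}$ and $\psi_{rj}^2$. Clearly, if the sample size is large enough, then $\bar{x}_{ij}$ ($\bar{y}_{rj}$) and $\bar{\sigma}_{ij}^2$ ($\bar{\psi}_{rj}^2$) are the best estimators for the mean and variance of the inputs (outputs).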
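The estimation step described above can be illustrated with a minimal sketch. It assumes that the $p$ observations of one input (or output) of one DMU are available as a plain list; the function name `estimate_mean_variance` and the sample values are hypothetical and are not taken from the paper or its dataset.

```python
from statistics import mean, variance


def estimate_mean_variance(sample):
    """Return the sample mean and the unbiased sample variance (divisor p - 1)."""
    if len(sample) < 2:
        raise ValueError("at least two observations are needed for the variance")
    # statistics.variance uses the p - 1 divisor, i.e. the unbiased estimator.
    return mean(sample), variance(sample)


# Hypothetical observations of one input of one DMU over p = 5 periods.
observations = [10.0, 12.0, 11.0, 13.0, 9.0]
x_bar, sigma2_bar = estimate_mean_variance(observations)
```

The two returned values play the role of the estimated mean and variance that parameterise the normal distribution of the corresponding input or output.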
The input-oriented SDEA problem (3.1) of [8] is considered to estimate the relative efficiency, in the level of significance α, of the unit under assessment $DMU_o$, $o \in J$. In this model, $(\mu, \theta) \in \mathbb{R}^n \times \mathbb{R}$ is the vector of variables, $\alpha \in [0, 1]$ is a predefined error level, and $P$ denotes probability. Moreover, $\delta_1, \delta_2, \delta_3 \in \{0, 1\}$ are parameters whose values determine the conditions imposed on the production technology. According to the DEA literature [8], if $\Phi$ is the cumulative distribution function of the standard normal distribution, then the stochastic model (3.1) can be simplified to the non-linear problem (3.2). If, in addition, all the input and output levels of the units are uncorrelated, then all covariance terms in (3.2) vanish. The output-oriented version of model (3.1) in the level of significance α is model (3.3) [8], in which $(\mu, \phi) \in \mathbb{R}^n \times \mathbb{R}$ is the vector of variables. Again taking $\Phi$ as the cumulative distribution function of the standard normal distribution, the stochastic model (3.3) can be simplified to model (3.4).
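The passage from a chance constraint to a deterministic one rests on a standard property of the normal distribution: if $Z$ is normal with mean $m$ and standard deviation $s > 0$, then $P\{Z \le 0\} \ge 1 - \alpha$ holds exactly when $m - \Phi^{-1}(\alpha)\, s \le 0$. The sketch below applies this to a single input constraint of the form $P\{\sum_j \mu_j \tilde{x}_j \le \theta \tilde{x}_o\} \ge 1 - \alpha$ under the assumption of independent inputs. It is an illustration of the technique only, not a reproduction of model (3.2); the function name and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def input_chance_constraint_holds(mu, theta, o, means, variances, alpha):
    """Check P{sum_j mu_j * x_j <= theta * x_o} >= 1 - alpha for one input.

    Assumes independent normal inputs x_j ~ N(means[j], variances[j])
    and 0 < alpha < 1 (inv_cdf is undefined at 0 and 1).
    """
    # Coefficient of each x_j in Z = sum_j mu_j * x_j - theta * x_o.
    coeffs = [m - theta if j == o else m for j, m in enumerate(mu)]
    z_mean = sum(c * xm for c, xm in zip(coeffs, means))
    z_sd = sqrt(sum(c * c * v for c, v in zip(coeffs, variances)))
    if z_sd == 0.0:
        # Degenerate case: Z is a constant.
        return z_mean <= 0.0
    # Deterministic equivalent: z_mean - Phi^{-1}(alpha) * z_sd <= 0.
    return z_mean - NormalDist().inv_cdf(alpha) * z_sd <= 0.0


# Hypothetical data: two DMUs, unit 0 is under assessment.
means = [10.0, 12.0]
variances = [1.0, 4.0]
ok = input_chance_constraint_holds([0.0, 0.5], 1.0, 0, means, variances, 0.05)
```

Because $\Phi^{-1}(\alpha) < 0$ for $\alpha < 0.5$, the deterministic constraint is tighter than the constraint on the means alone; the extra margin grows with the standard deviation of the combination.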
If $\phi_o(\alpha) = 1$ (resp. $\theta_o(\alpha) = 1$), then $DMU_o$ is called stochastic output-oriented (resp. input-oriented) weakly efficient in the level of significance α. Also, Cooper et al. [8] established the following theorem. (i) These models are feasible for all levels of significance α.
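Remark 3.2. Assume that the input and output levels of the DMUs have a symmetric error structure. More precisely, the following structure is considered for the inputs and outputs of $DMU_j$, $j \in J$:
$$\tilde{x}_{ij} = x_{ij} + a_{ij}\tilde{\varepsilon}_{ij}, \quad \forall i \in I, \qquad \tilde{y}_{rj} = y_{rj} + b_{rj}\tilde{\xi}_{rj}, \quad \forall r \in O,$$
where $a_{ij} \in \mathbb{R}_{\ge 0}$, $b_{rj} \in \mathbb{R}_{\ge 0}$, $\tilde{\varepsilon}_{ij} \sim N(0, \bar{\sigma}^2)$, and $\tilde{\xi}_{rj} \sim N(0, \bar{\sigma}^2)$. Here, $\tilde{\varepsilon}_{ij}$ and $\tilde{\xi}_{rj}$ are the errors of the input and output levels relative to their mean values, respectively. If the input and output vectors of the units are uncorrelated and $\tilde{\varepsilon}_{ij} = \tilde{\varepsilon}_i$ and $\tilde{\xi}_{rj} = \tilde{\xi}_r$ for all $i \in I$ and $r \in O$, then models (3.2) and (3.4) can be transformed into the linear programming (LP) models (3.6) and (3.7), respectively [2]. In models (3.6) and (3.7), $\phi$, $\theta$, $\mu_j$, $p_i^+$, $p_i^-$, $q_r^+$, and $q_r^-$ are variables for all indices.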

Stochastic InvDEA
The DMU's efficiency score is usually determined under given input and output levels in conventional DEA models. In contrast, in InvDEA, an efficiency target is chosen as a strategic purpose, and the primary purpose is to find the necessary input and output vectors to attain this desired efficiency target. Initially, the following significant problem has been studied in the framework of InvDEA [23,45]: Question. If the efficiency index of DMU o remains unchanged with respect to other DMUs, but the input/output levels increase, to what extent should the output/input levels of DMU o increase?
This question was solved using MOP problems by Wei et al. [45] and Hadi-Vencheh et al. [23]. Different theoretical and practical perspectives of InvDEA have been studied in the DEA literature, such as setting revenue targets [10], investment analysis [44], and resource allocation [14], as well as InvDEA under inter-temporal dependence of data [19,20,28] and fuzzy data [18].
Nevertheless, the constructed models in the mentioned works could not solve the issue of input-estimation and output-estimation in the presence of stochastic data. Therefore, Sections 4.1 and 4.2 attempt to estimate the input/output levels in the presence of stochastic data.

Estimation of input levels
This sub-section extends the following question (proposed by [23]) in a stochastic framework.
The goal of Question 1 is to find the input levels $\tilde{\alpha}_o = (\tilde{\alpha}_{1o}, \tilde{\alpha}_{2o}, \ldots, \tilde{\alpha}_{mo})$ on the condition that the efficiency score of $DMU_o$ in the level of significance α is still $\theta_o(\alpha)$. More precisely, assume that $DMU_{n+1}$ denotes the entity generated after increasing the input and output levels of $DMU_o$. Model (4.1) estimates the efficiency index of $DMU_{n+1}$ in the level of significance α. If the optimal values of problems (3.1) and (4.1) are the same in the level of significance α, the efficiency index is said to remain unchanged, i.e., $\theta_{n+1}(\alpha) = \theta_o(\alpha)$.
The SMOP problem (4.2) is proposed in the level of significance α to answer the above question, that is, to find the input levels.
In this model, $(\mu, \tilde{\alpha}_o)$ is the vector of variables, and $\theta_o(\alpha)$ is a constant that represents the optimal value of the variable $\theta$ in problem (3.1). Additionally, $\tilde{\beta}_o$ is known and fixed. A novel optimality concept is defined for the SMOP (4.2) based on the special structure of this problem.
where $\varepsilon$ is a non-Archimedean infinitesimal.  2) provided that one of the following conditions holds: (ii) There exists at least one $l \in I$ such that $P\{\tilde{\alpha}^*_{lo} - \tilde{x}_{lo} \ge \varepsilon\} \ge 1 - \alpha$, where $\varepsilon$ is a non-Archimedean infinitesimal.
We close this sub-section by discussing when to start using InvDEA. Suppose that the output levels of $DMU_o$ are increased from $\tilde{Y}_o$ to $\tilde{\beta}_o$. A statistical assumption on this change is considered in the level of significance α. From elementary statistics, $100(1 - \alpha)\%$ confidence intervals for the mean inputs and the mean outputs are
$$\bar{x}_{ij} \pm t_{(\alpha/2)} \frac{\bar{\sigma}_{ij}}{\sqrt{p}}, \qquad \bar{\beta}_{rj} \pm t_{(\alpha/2)} \frac{\bar{\psi}_{rj}}{\sqrt{p}},$$
where $\bar{x}_{ij}$ and $\bar{\sigma}_{ij}^2$ are the mean and variance of the inputs, respectively, and $\bar{\beta}_{rj}$ and $\bar{\psi}_{rj}^2$ are the mean and variance of the outputs, respectively. Moreover, $t_{(\alpha/2)}$ is the critical value of the Student $t$ distribution with $p - 1$ degrees of freedom. The lower and upper bounds of the inputs and outputs are the endpoints of these intervals. According to the pessimistic and optimistic viewpoints [9,15,21], two models are proposed to estimate the lower and upper bounds, $\theta^l_{n+1}(\alpha)$ and $\theta^u_{n+1}(\alpha)$, of the efficiency of $DMU_{n+1}$ with $100(1 - \alpha)\%$ confidence. It is obvious that $\theta_{n+1}(\alpha) \in [\theta^l_{n+1}(\alpha), \theta^u_{n+1}(\alpha)]$. With regard to model (4.1) (setting $\tilde{\alpha}_o = \tilde{X}_o$), if the optimal value of this model does not belong to $[\theta^l_{n+1}(\alpha), \theta^u_{n+1}(\alpha)]$, then the method proposed in this sub-section can be applied to estimate the input levels such that the efficiency score of $DMU_o$ remains unchanged in the level of significance α, that is, equal to $\theta_o(\alpha)$.
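The confidence interval for a mean used in this discussion can be sketched as follows. The sketch assumes the usual two-sided Student $t$ interval, mean plus or minus $t_{(\alpha/2)}$ times the sample standard deviation divided by $\sqrt{p}$, and takes the critical value as an argument (for example from a $t$-table) so that only the standard library is needed. The function name and the sample are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mean_confidence_interval(sample, t_crit):
    """Two-sided confidence interval for the mean of a sample of size p.

    t_crit is the Student-t critical value t_(alpha/2) with p - 1 degrees
    of freedom, supplied by the caller.
    """
    p = len(sample)
    x_bar = mean(sample)
    # stdev uses the p - 1 divisor, matching the unbiased variance estimator.
    half_width = t_crit * stdev(sample) / sqrt(p)
    return x_bar - half_width, x_bar + half_width


# Hypothetical sample of size p = 5; 2.776 is the tabulated two-sided 95%
# critical value of the t distribution with 4 degrees of freedom.
lower, upper = mean_confidence_interval([10.0, 12.0, 11.0, 13.0, 9.0], 2.776)
```

The two endpoints would then serve as the lower and upper bounds of the corresponding input or output in the pessimistic and optimistic efficiency models.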
Remark 4.5. According to Remark 3.2, if all input and output levels of the units have a symmetric error structure, then the same method for converting problem (3.2) to the problem (3.6) can be employed to convert the stochastic models (4.1) and (4.2) to the LP and MOLP problems, respectively.

Estimation of output levels
This sub-section is devoted to extending the following question (provided by [45]) in a stochastic framework. Suppose $DMU_{n+1}$ denotes the entity obtained after increasing the input and output levels of $DMU_o$. Model (4.11) estimates the efficiency score of $DMU_{n+1}$ in the level of significance α. If the optimal values of models (3.3) and (4.11) are equal, the efficiency index in the level of significance α is said to remain unchanged, namely $\phi_{n+1}(\alpha) = \phi_o(\alpha)$. To answer Question 2, namely to find the output levels, the following SMOP model is suggested:
$$\max\ (\tilde{\beta}_{1o}, \tilde{\beta}_{2o}, \ldots, \tilde{\beta}_{so}) \qquad (4.12)$$
where $(\mu, \tilde{\beta}_o)$ is the vector of variables in the above problem, and $\phi_o(\alpha)$ is a constant that represents the optimal value of the variable $\phi$ in problem (3.3). Moreover, $\tilde{\alpha}_o$ is known and fixed. Theorem 4.6 shows how the SMOP model (4.12) can be applied to answer Question 2 in the stochastic framework.
Proof. If $\Pi$ is not an SWP solution in the level of significance α to problem (4.12), then there is another feasible solution to model (4.12), $\Delta = (\mu, \tilde{\beta}_o = (\tilde{\beta}_{1o}, \tilde{\beta}_{2o}, \ldots, \tilde{\beta}_{so}))$, in which, where $\varepsilon$ is a non-Archimedean infinitesimal. Feasibility of $\Delta$ for SMOP (4.12) implies Therefore, there is a positive scalar $k_o > 1$ where
As in the discussion in the final part of Section 4.1, assume that the input levels of $DMU_o$ are increased from $\tilde{X}_o$ to $\tilde{\alpha}_o$. A statistical assumption on this change is considered in the level of significance α. From elementary statistics, $100(1 - \alpha)\%$ confidence intervals for the mean inputs and the mean outputs are
$$\bar{\alpha}_{ij} \pm t_{(\alpha/2)} \frac{\bar{\sigma}_{ij}}{\sqrt{p}}, \qquad \bar{y}_{rj} \pm t_{(\alpha/2)} \frac{\bar{\psi}_{rj}}{\sqrt{p}},$$
where $\bar{\alpha}_{ij}$ and $\bar{\sigma}_{ij}^2$ are the mean and variance of the inputs, respectively, and $\bar{y}_{rj}$ and $\bar{\psi}_{rj}^2$ are the mean and variance of the outputs, respectively. Moreover, $t_{(\alpha/2)}$ is the critical value of the Student $t$ distribution with $p - 1$ degrees of freedom. The lower and upper bounds of the inputs and outputs are the endpoints of these intervals. According to the pessimistic and optimistic viewpoints [9,15,21], two models are proposed to estimate the lower and upper bounds, $\phi^l_{n+1}(\alpha)$ and $\phi^u_{n+1}(\alpha)$, of the efficiency of $DMU_{n+1}$ with $100(1 - \alpha)\%$ confidence. It is obvious that $\phi_{n+1}(\alpha) \in [\phi^l_{n+1}(\alpha), \phi^u_{n+1}(\alpha)]$. Considering model (4.11) (setting $\tilde{\beta}_o = \tilde{Y}_o$), if the optimal value of this model does not belong to $[\phi^l_{n+1}(\alpha), \phi^u_{n+1}(\alpha)]$, then the approach presented in this sub-section can be used to find the output levels while preserving the efficiency index of $DMU_o$ in the level of significance α, that is, $\phi_o(\alpha)$.
Remark 4.8. According to Remark 3.2, if all input and output levels of the units have a symmetric error structure, then the same method for converting problem (3.4) to the problem (3.7) can be employed to convert the stochastic models (4.11) and (4.12) to the LP and MOLP problems, respectively.

An application
The current section provides an application of the presented method in the banking sector. A dataset containing 20 branches of an Iranian commercial bank is considered to verify the realization of the research goals. The dataset containing three inputs and five outputs is reproduced from Behzadi and Mirbolouki [2] and presented in Appendix C (see Tabs. C.1 and C.2). Nevertheless, the proposed method can be utilized for other fields.
There are two main approaches for selecting input and output factors in the literature: the production approach and the intermediation approach. In this study, the intermediation approach is employed. The personal rate ($\tilde{x}_1$), payable benefits ($\tilde{x}_2$), and delayed requisitions ($\tilde{x}_3$) are considered as inputs, while the facilities ($\tilde{y}_1$), amount of deposits ($\tilde{y}_2$), received benefits ($\tilde{y}_3$), received commission ($\tilde{y}_4$), and other resources of deposits ($\tilde{y}_5$) are considered as outputs. The weighted combination of personal qualifications, quantity, education, and others in each bank branch is considered as the personal rate. The benefits payable on all deposits to customers in each branch are considered as the payable benefits. Delays in repaying loans and other facilities in each branch are considered as the delayed requisitions. The sum of business and single loans in each branch is considered as the facilities. The value of the various deposits, including current accounts and short- and long-duration accounts, in each branch is considered as the amount of deposits. The benefits received from the total loans and facilities are considered as the received benefits. The sum of the commissions received on all banking actions, such as guaranty issuance, money transfer, and others, in each branch is considered as the received commission.
According to Table 1, for instance, DMU02 is efficient in the level of significance α = 0.01 ($\theta_{02}(0.01) = 1$). An InvDEA case is described as follows. Suppose that DMU02 sets targets for some criteria that should be realized in the future. Namely, DMU02 should address this problem: among all branches, by how much should its input levels (personal rate, payable benefits, and delayed requisitions) increase if its efficiency index in the level of significance α = 0.01 is kept unchanged while its output levels (facilities, amount of deposits, received benefits, received commission, and other resources of deposits) increase from $\tilde{Y}_{02} = (\tilde{y}_{1,02}, \tilde{y}_{2,02}, \tilde{y}_{3,02}, \tilde{y}_{4,02}, \tilde{y}_{5,02})$ to $\tilde{\beta}_{02} = (\tilde{\beta}_{1,02}, \tilde{\beta}_{2,02}, \tilde{\beta}_{3,02}, \tilde{\beta}_{4,02}, \tilde{\beta}_{5,02})$, as shown in Table 2? More precisely, the percentages of the expected increases in the facilities ($\tilde{y}_{1,02}$), amount of deposits ($\tilde{y}_{2,02}$), received benefits ($\tilde{y}_{3,02}$), received commission ($\tilde{y}_{4,02}$), and other resources of deposits ($\tilde{y}_{5,02}$) are given in Table 2. Considering SMOP (4.2) in the level of significance α = 0.01 corresponding to DMU02 and employing the weight-sum approach [11], two SP solutions (two scenarios for increasing the input levels while preserving the efficiency index) are obtained to estimate the input vector (personal rate, payable benefits, and delayed requisitions), as presented in Table 3. More precisely, Table 3 gives the percentage of the required increase in each input level under each of the two scenarios for preserving the efficiency index in the level of significance α = 0.01.
Table 3 offers the decision-maker two patterns for expanding DMU02, so the decision-maker can carry out the necessary operations by selecting an appropriate one. Note that, based on Theorem 4.3, the efficiency score for both patterns is equal to one in the level of significance α = 0.01. Table 3 shows that if the output levels of branch DMU02 increase from $\tilde{Y}_{02}$ to $\tilde{\beta}_{02}$, then the following operations should be performed to retain the efficiency score of this branch:
(i) If the first scenario is selected, then the mean and variance of the personal rate input level (input 1) should increase by 5.76% and 18.87%, respectively. In contrast, if the second scenario is selected, the mean and variance of this input should increase by 10.67% and 37.74%, respectively.
(ii) If the second scenario is selected, then the mean and variance of the payable benefits input level (input 2) should increase by 10.49% and 1.66%, respectively, while if the first scenario is selected, the mean of the payable benefits level does not increase and only the variance of this input should increase, by 2.07%.
(iii) If the first scenario is selected, then the mean and variance of the delayed requisitions input level (input 3) should increase by 0.11% and 50.00%, respectively, while if the second scenario is selected, the mean and variance of this input should increase by 0.06% and 100.00%, respectively.
As another case, a second InvDEA problem is described as follows. Suppose that DMU04 sets goals for some criteria that should be attained in the future. Namely, DMU04 should study this problem: among all branches, by how much should its input levels (personal rate, payable benefits, and delayed requisitions) increase if its efficiency index in the level of significance α = 0.01 remains unchanged ($\theta_{04}(0.01) = 0.33122$) while its output levels increase from $\tilde{Y}_{04}$ to $\tilde{\beta}_{04}$, as presented in Table 4? More precisely, the percentages of the intended increases in the facilities ($\tilde{y}_{1,04}$), amount of deposits ($\tilde{y}_{2,04}$), received benefits ($\tilde{y}_{3,04}$), received commission ($\tilde{y}_{4,04}$), and other resources of deposits ($\tilde{y}_{5,04}$) are given in Table 4. Considering SMOP (4.2) in the level of significance α = 0.01 corresponding to DMU04 and adopting the weight-sum approach [11], two SP solutions are produced to estimate the input levels, as presented in Table 5. More precisely, Table 5 gives the percentage of the required increase in each input level under each of the two scenarios for preserving the efficiency score in the level of significance α = 0.01. Table 5 offers the decision-maker two patterns for expanding DMU04, so the decision-maker can carry out the required operations by selecting an appropriate one. Based on Theorem 4.3, the efficiency score for both patterns is equal to 0.33122 in the level of significance α = 0.01.
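The weight-sum approach used above to obtain the two scenarios replaces the vector of objectives by a single positively weighted sum; different weight vectors can then yield different Pareto solutions. The toy sketch below illustrates only this idea on a small finite set of candidates. It does not solve SMOP (4.2), and it assumes that the objectives (the input increases) are to be minimised. The candidate vectors, the weights, and the function names are all hypothetical and unrelated to the bank data.

```python
def weighted_sum_choice(candidates, weights):
    """Return the candidate minimising the weighted sum of its objectives."""
    return min(candidates, key=lambda c: sum(w * v for w, v in zip(weights, c)))


def dominates(a, b):
    """True if a is no worse than b in every objective and better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


# Hypothetical feasible input-increase vectors (three inputs, minimised).
candidates = [(5.0, 0.0, 1.0), (10.0, 10.0, 0.5), (6.0, 1.0, 1.5), (4.0, 3.0, 2.0)]

# Two weight vectors give two different scenarios.
scenario_1 = weighted_sum_choice(candidates, (1.0, 1.0, 1.0))
scenario_2 = weighted_sum_choice(candidates, (0.1, 0.1, 10.0))
```

With strictly positive weights, the selected candidate cannot be dominated by any other candidate, which is why each weight vector produces a Pareto solution of the toy problem.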
The input (output)-estimation problem under a maintained efficiency score, when some or all of the outputs (inputs) of a unit increase, was first solved by Hadi-Vencheh et al. [23] and Wei et al. [45]. Afterward, different theoretical and practical perspectives of InvDEA were examined in studies such as Lin [32], Dong Joon [10], Hadi-Vencheh et al. [23], Ghobadi [20], Emrouznejad et al. [14], Jahanshahloo et al. [27], Wegener and Amin [44], Jahanshahloo et al. [28], Gattoufi et al. [17], Ghobadi [21] and Amin et al. [1]. Nevertheless, the models constructed in these works cannot solve the input-estimation and output-estimation problems in the presence of stochastic data. Thus, there is no available approach with which our results can be compared.

Conclusions
In the current work, InvDEA problems are generalized to input/output estimation in the presence of stochastic data. MOP problems are used to derive necessary and sufficient conditions for input/output estimation. To this end, a novel optimality concept for MOP problems, stochastic (weak) Pareto optimality in the level of significance α ∈ [0, 1], is introduced. The presented InvDEA strategy is illustrated by investigating its performance in attaining the desired efficiency in a banking sector application.
The obtained results give novel relationships between DEA and MOP problems. Besides, they can be utilized in real applications, such as resource allocation, sensitivity analysis, preserving (or improving) efficiency values, determining revenue targets, merging banks, and firm restructuring. Identifying the type of distribution of the efficiency random variable and obtaining similar models in dynamic and network DEA frameworks can be considered as directions for future research.

Appendix C.
See Tables C.1 and C.2.