The protection and recovery strategy development of dynamic resilience analysis and cost consideration in the infrastructure network

Modern life is becoming ever more convenient, largely thanks to the smooth operation of infrastructure networks. However, if these networks are disrupted, their operation will be delayed or even shut down, often causing huge losses in livelihood, economy, and society. Therefore, evaluating the resilience of the network system and providing protection and recovery strategies against disruptive attacks are important issues. This study considers a situation with protection, attack, and recovery strategies, proposes the time-related Binary-Addition Tree-based Resilience Assessment to consider more decision variables and parameters, and further includes costs in the formulation of the protection and recovery strategies. Moreover, a new performance measure, oriented to the degree of network reliability recovery, is developed to quantify the resilience of the network system.


List of symbols
n : Number of nodes in a network
m : Number of arcs in a network
V : Node set of a network; V = {1, 2, . . . , n}
E : Arc set of a network; E = {a 1 , a 2 , . . . , a m }
e i, j : Directed arc from node i to node j
a k : kth directed arc in E
D : Success probability of each arc in a binary-state network
G(V, E, D) : A graph with the source node 1, the sink node n, V, E, and D. For example, Fig. 2
N r : Total number of recovered arcs
T r : Number of recovered arcs per unit time
P(a k ) : Success probability of a k for the original network and the full recovery strategy
P'(a k ) : Success probability of a k for the partial recovery strategy
Cr(a k ) : Full recovery cost of a k
Cr'(a k ) : Partial recovery cost of a k
Cpv(a k ) : Protect variable cost of a k
Cpf(N p ) : Protect fixed cost of N p
Crf(T r ) : Recovery daily fixed cost of T r
sumCp : Total protection costs
sumCr : Total full recovery costs
sumCr' : Total partial recovery costs
sumCost : sumCost = sumCp + sumCr
sumCost' : sumCost' = sumCp + sumCr'
ϕ D : Dynamic (time-related) network resilience for strategy evaluation
ϕ S : Static (time-independent) network resilience for arc importance calculation
R : Reliability of the network
R 0 : Reliability of the original network
R d : Reliability of the disrupted network
R d (t) : R d at recovery time point t with full recovery cost
R d '(t) : R d at recovery time point t with partial recovery cost
R r : Reliability of the recovered network
I(a k ) : Importance weight of a k
X : State vector whose kth element is the state of a k for k = 1, 2, . . . , m; i.e. X = {X(a 1 ), X(a 2 ), . . . , X(a m )}
Therefore, when building a resilient network system, the related costs should be considered together with the network resilience performance. In this study, costs are divided into fixed costs and variable costs, whose definitions are provided in Table 1. Three studies (Jia & Zhang, 2020; Sharma et al., 2020; Zhu et al., 2021) specifically mentioned N c . Sharma et al. (2020) further subdivided the labor cost of N c and discussed it in more detail. González et al. (2016) and Smith et al. (2020) considered the cost of flow and the cost of recovery space. In this study, the costs related to N c and recovery space are classified under the fixed cost of the recovery strategy. Moreover, to develop a more comprehensive protection and recovery strategy, it is necessary to discuss the given parameters and decision variables in more detail.
According to Hosseini et al. (2016a, 2019), resilience quantitative assessment approaches comprise stochastic and deterministic approaches. Among stochastic approaches, the most common include probabilistic models such as Bayesian networks (Hosseini & Barker, 2016; Hosseini et al., 2016b), the probabilistic solution discovery algorithm-based model (Zhu et al., 2021), and the Copeland score (Xu et al., 2020), and structural-based models such as simulation (Pant et al., 2014; Hosseini & Barker, 2016; Hosseini et al., 2016b; Monroe et al., 2018; Rocco et al., 2018; Xu et al., 2020), optimization (González et al., 2016; Gomez & Baker, 2019; Jia & Zhang, 2020; Sharma et al., 2020; Smith et al., 2020; Zhu et al., 2021), and fuzzy logic models. Among deterministic approaches, there are the transport diversity (Hsieh & Feng, 2020), the Binary-Addition Tree (BAT)-based model (Su & Yeh, 2020), and the Binary-Addition Tree-based Resilience Assessment (BATRA)-based model proposed by this paper.
Table 1 also shows that most existing studies use simulation or optimization to evaluate the resilience of the network system, whereas this research applies a deterministic approach. A simulation approach yields an approximate rather than an exact solution, and its results are not intuitive enough for decision makers. An optimization model usually needs to fix multiple parameters to obtain the best recovery strategy for a specific protection strategy and attack strategy (Gomez & Baker, 2019). In practice, however, we cannot predict what kind of attack strategy decision makers will face. Therefore, we believe that the design and planning stage of the network system must consider all reasonable protection strategies, the most likely attack strategies, and the most effective recovery strategies. This research also focuses on more decision variables in the protection, attack, and recovery strategies, and presents the whole process of analysing these variables clearly and completely for decision makers' reference. Therefore, this study adopts a deterministic approach to evaluate the resilience and cost of the network system.
In order to achieve a comprehensive set of protection and recovery strategies to resist external attacks on network systems and make it easier for decision makers to understand the strategy development process, we propose a time-related BATRA based on BATRA (Su & Yeh, 2020). To build a resilient network, the proposed time-related BATRA sequentially generates combinations of relevant decision variables and state vectors for protection, attack, and recovery strategies.
Unlike BATRA, which unconditionally enumerates all strategies, the proposed time-related BATRA provides more available decision variables (i.e. importance weight of arcs, number of arcs to be protected, number of arcs to be attacked, number of arcs to be recovered, etc.). Then, according to these decision variables, the state vector values of the corresponding strategies are listed. In addition, the proposed time-related BATRA considers a more realistic successive recovery (i.e. the number of arcs recovered per unit time) rather than a one-time recovery, which is why it is called time-related.
In the time-related BATRA, we propose a new resilience performance measure based on the resilience evaluation of Zhu et al. (2021), which considers both protection and recovery strategies. Like theirs, our metric reflects changes in network reliability during the recovery process and evaluates the network resilience. The difference is that our metric is based on the degree of network reliability recovered at each time point, rather than the degree of network reliability still lost at each time point. We found that the metric based on the reliability still lost (Zhu et al., 2021) can be used correctly only under the assumption that the network is recovered to its original state. Because we want to give decision makers the choice between full recovery and partial recovery, we need to drop this assumption. Therefore, we adopt our newly proposed resilience metric.
Producing a series of protection, attack, and recovery strategies inevitably requires a large number of calculations, whether the network is complex or simple. The foundation of all complex networks is the binary-state network. Calculating the reliability of a binary-state network is an NP-hard problem (Colbourn, 1987; Aven, 1988; Levitin, 2005; Hao et al., 2020; Yeh, 2020a), and so is the extended network resilience problem. To cope with this computational complexity, both the proposed time-related BATRA and the aforementioned BATRA are developed based on BAT, which performs very well on binary-state networks. In addition, BAT is simple, easy to understand, and easy to customize, which makes it easy for decision makers to grasp and accept (Yeh, 2020a).
The main contributions of this research can be summarized into three aspects. First, we consider more decision variables and parameters in the protection, attack, and recovery strategies, and further include the costs in the formulation of the protection and recovery strategies. Second, we develop a new performance measure oriented to the degree of network reliability recovery to quantify the resilience of the network system. Third, we propose the time-related BATRA to determine decision variables, obtain protection, attack, and recovery strategies, and calculate the final network resilience and related cost.
The remainder of this article is arranged as follows: Section 2 describes the problem and formulates its performance index and decision variables. Section 3 presents the proposed method with an illustrated example. In Section 4, a case study is analysed through the proposed method. Finally, Section 5 provides the conclusion.

Problem Description
This section is divided into four parts to complete the description of the research problem, namely problem background, proposed resilience metric, decision variables and related constraints, and assumptions.

Problem background
This research aims to build a resilient network system to resist external attacks, so that the network system can operate normally and meet demands. The resilient network system is modeled as a two-terminal binary-state network G(V, E, D) with the source node 1 and the sink node n, where V = {1, 2, . . . , n} represents the set of nodes, E = {a 1 , a 2 , . . . , a m } represents the set of arcs, and D represents the success probability of each arc in a binary-state network. The arcs, indexed by k = 1, 2, . . . , m = |E|, are used to represent the network components, with their states represented by state vector X = {X(a 1 ), X(a 2 ), . . . , X(a m )}, where X(a k ) = 1 if arc k is functioning and X(a k ) = 0 otherwise. The arcs contained in E are assumed to be imperfect and may be broken, but the nodes contained in V are assumed to be perfect and will not be broken.
The sequential decision-making process between our side and the attacker consists of our protection strategy, the attacker's attack strategy, and our recovery strategy. Both sides share two pieces of information: protected arcs will not be destroyed by an attack, and the network structure, i.e. the number of arcs and the number of nodes. However, the attacker knows neither the probability distribution D of the arcs in the network nor our protection strategy. In the protection strategy, the total number of arcs to be protected, N p , needs to be determined. Then, the order of the arcs to be protected is determined according to N p and the importance weight of arc k, I(a k ). The state vector used to represent the arcs to be protected in the lth protection strategy is Xp l = {Xp l (a 1 ), Xp l (a 2 ), . . . , Xp l (a m )}, where Xp l (a k ) = 1 if arc k is functioning and Xp l (a k ) = 0 otherwise. The corresponding protection costs are also calculated.
In the attack strategy, the total number of arcs to be attacked, N a , needs to be determined. The state vector used to represent the arcs to be attacked in the qth attack strategy is Xa q = {Xa q (a 1 ), Xa q (a 2 ), . . . , Xa q (a m )}, where Xa q (a k ) = 1 if arc k is functioning and Xa q (a k ) = 0 otherwise. Then, according to Xp l and Xa q , the failed status v of the network after the attack can be obtained.
In the recovery strategy, the total number of arcs to be recovered, N r , and the number of recovered arcs per unit time, T r , need to be determined. According to N r , T r , and I(a k ), we can determine the order of the arcs to be recovered. The state vector used to represent the arcs to be recovered in the wth recovery strategy for the network failed status v at time point t is Xr vw (t) = {Xr vw (t)(a 1 ), Xr vw (t)(a 2 ), . . . , Xr vw (t)(a m )}, where Xr vw (t)(a k ) = 1 if arc k is functioning and Xr vw (t)(a k ) = 0 otherwise. Notice that only attacked arcs will be recovered, and the network can recover perfectly to its original status or imperfectly to an available status, with corresponding costs. During the recovery process, the network reliability at the different recovery stages can be obtained. Based on the changes in the recovery degree of the network reliability, the corresponding network resilience performance and the total costs can be obtained.
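The rule that a protected arc survives an attack can be illustrated with a small sketch. The function name `post_attack_state` and the set-based encoding of the protection and attack strategies are assumptions made for illustration; the paper itself encodes the strategies as the state vectors Xp l and Xa q .

```python
def post_attack_state(m, protected, attacked):
    """Return a 0/1 list over arcs 1..m: 1 = arc still works after the attack.

    Hypothetical encoding: `protected` and `attacked` are sets of arc indices.
    An arc fails only if it is attacked and not protected.
    """
    return [0 if (k in attacked and k not in protected) else 1
            for k in range(1, m + 1)]


# Example with five arcs: arcs 1 and 2 are protected, arcs 2, 4, 5 are attacked.
state = post_attack_state(5, protected={1, 2}, attacked={2, 4, 5})
failed_arcs = [k for k, x in enumerate(state, start=1) if x == 0]
print(state)        # arc 2 survives because it is protected
print(failed_arcs)  # only these arcs are candidates for recovery
```

Only the arcs listed in `failed_arcs` are eligible for recovery, in line with the statement that only attacked arcs will be recovered.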

Proposed resilience metric
For a resilient network system, the network can be divided into three states along the time horizon, as shown in Fig. 1 (Pant et al., 2014; Hosseini et al., 2016a; Zhu et al., 2021): the original state before disruption (t 0 ≤ t < t d ); the recovery state after disruption (t d ≤ t < t r ); and the recovery completion state (t r ≤ t), where t is the current time point, t 0 is the time point of the original state with reliability R 0 , t d is the time point right after disruption with reliability R d , and t r is the time point right after recovery with reliability R r . R 0 is the network reliability when the network is operating normally. When a disruptive event occurs at t d , the network reliability is reduced to R d . The network remains in a damaged or recovery state until the recovery process is completed. Therefore, the network reliability during the recovery process is represented by R d : if the full recovery strategy is used, the network reliability at time t is represented by R d (t); if the partial recovery strategy is used, it is represented by R d '(t). In Fig. 1, the recovery process has a stepped shape because the recovery strategy recovers a certain number of arcs, T r , per unit time. Moreover, in this article, the network may be completely recovered to its original state with full recovery costs, or partially recovered to an acceptable state with partial recovery costs; that is, R r is not necessarily equal to R 0 .
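The stepped recovery can be pictured as splitting the arcs to be recovered into batches of T r arcs, one batch per unit time. The sketch below is only illustrative: the function name `recovery_batches`, the importance values, and the choice to recover the most important arcs first are assumptions, since the text states only that the order depends on N r , T r , and I(a k ).

```python
import math


def recovery_batches(failed_arcs, importance, n_r, t_r):
    """Split the n_r arcs chosen for recovery into per-unit-time batches of size t_r.

    Assumption: arcs with a larger importance weight are recovered first.
    """
    ordered = sorted(failed_arcs, key=lambda k: importance[k], reverse=True)[:n_r]
    return [ordered[i:i + t_r] for i in range(0, len(ordered), t_r)]


# Hypothetical importance weights for three failed arcs.
importance = {3: 0.2, 4: 0.7, 5: 0.9}
batches = recovery_batches([3, 4, 5], importance, n_r=3, t_r=2)
print(batches)       # one inner list per unit of recovery time
print(len(batches))  # number of steps in the stepped recovery curve
```

The number of steps equals the ceiling of N r / T r , which is why a larger T r shortens the recovery stage in Fig. 1.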
Regarding the calculation of network resilience, Su and Yeh (2020) integrated the previous literature and proposed a resilience performance indicator that simultaneously considers the loss and recovery of reliability, given as equation (1), and applied it in BATRA to analyse network resilience. However, equation (1) in Su and Yeh (2020) only considers one-time recovery; that is, it does not consider T r . Since equation (1) does not depend on time, we call it the static resilience performance indicator in this article. In our proposed time-related BATRA, equation (1) is used only to calculate the importance weights of arcs, I, before the protection, attack, and recovery strategies are generated. Equation (1) represents the ratio of the degree of recovery to the degree of loss, and its effectiveness has been demonstrated (Pant et al., 2014; Hosseini & Barker, 2016; Hosseini et al., 2016a, b; Su & Yeh, 2020; Zhu et al., 2021).
Zhu et al. (2021) proposed equation (2) to calculate the change in the degree of network reliability loss during the recovery process, shown as the non-gray area between t d and t r in Fig. 1. Since equation (2) is based on the degree of reliability loss, the smaller the value, the higher the network resilience. However, we found that equation (2) can be used correctly only if the network is assumed to be recovered to its original state with reliability R 0 . In practice, recovery may be limited by cost, and the network may not be perfectly recovered to its original state. We want to give decision makers the option of partially recovering the network; thus, we propose equation (3).
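For reference, the static indicator can be written out from its verbal description as the ratio of the degree of recovery to the degree of loss. The form below is a reconstruction from that description and the List of symbols, not a copy of the typeset equation (1), and it assumes R 0 > R d :

```latex
\phi_S = \frac{R_r - R_d}{R_0 - R_d}
```

Under this reading, the indicator equals 1 when the network is recovered to its original reliability (R r = R 0 ) and 0 when no reliability is recovered (R r = R d ).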
To calculate the network resilience, we are concerned with the change in the degree of recovery of the network reliability during the recovery process, shown as the gray area between t d and t r in Fig. 1 and as the content of the square brackets in equation (3). Equation (3) is also multiplied by T r to show the effect of different T r on resilience performance. Since equation (3) is based on the degree of reliability recovery, the larger the value, the higher the network resilience. Also, there is no need to assume that the network must be recovered to its original state. In equation (3), a network resilience of around 100 means that the network is recovered to an operational state. Equation (3) is the dynamic resilience performance indicator we propose; it is called dynamic because it depends on time. In our proposed time-related BATRA, equation (3) is used to calculate the final network resilience after the protection, attack, and recovery strategies are generated.

Decision variables and related constraints
In this section, we summarize and explain the main decision variables and related constraints in the decision-making process of this research. The real-valued decision variables are the importance weight of the kth arc, I(a k ), the total number of arcs protected, N p , the total number of arcs attacked, N a , the total number of arcs recovered, N r , and the number of arcs recovered per unit time, T r . The vector-formed decision variables are the state vector for the protection strategy l, Xp l , the state vector for the attack strategy q, Xa q , and the state vector for the recovery strategy w for the network failed status v at recovery time point t, Xr vw (t). The vector-formed decision variables can be defined once the real-valued decision variables are determined. Constraints related to the real-valued decision variables are listed and explained as equations (4)-(10) below.
I(a k ) is calculated first among all decision variables, and its value is between 0 and 1, as shown in equation (4). A smart protector will choose a number of protected arcs, N p , such that the least sum of protection costs is less than the sum of recovery variable costs and N p is at least n c , as shown in equations (5) and (6). Here, n c is the minimum number of arcs that must be attacked to disconnect the network; it is found beforehand and can be regarded as a known parameter.
In equation (5), the summation over "a" from 1 to N p means the sum of the N p smallest values of Cpv(a). If the sum of recovery variable costs (right-hand side of equation 5) exceeds the least sum of protection costs (left-hand side of equation 5), the protection strategy selects at most N p arcs rather than all the arcs of the network. In equation (6), the upper bound of N p is the number of arcs, |E|, and the lower bound of N p is n c . A smart attacker will attack between n c and 2 * n c arcs to disconnect the network more effectively, as shown in equation (7). Both the protector and the attacker know that a protected arc will not be broken when attacked. The lower bound n c is the minimum number of attacked arcs needed to disconnect the network, and the upper bound 2 * n c reflects the attacker's belief that the protector will protect n c arcs; to increase the chance of disconnecting the network, the attacker may attack up to twice n c arcs. N p , N r , and T r constrain one another, as shown in equations (8)-(10). After we obtain N p and N a , we can obtain the corresponding N f . The upper and lower bounds of N f depend on N p , as in equation (8). According to N f , the upper and lower limits of N r are determined as in equation (9). Then, according to N r , the upper and lower limits of T r are determined as in equation (10).
The use of equations (4)-(10) is explained in detail in Sections 3.2 and 3.3.
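The bounds on N p and N a described for equations (5)-(7) can be sketched as follows. This follows the verbal description only; the function names, the example protect variable costs, and the cap of the attack size at the number of arcs are assumptions made for illustration. The recovery costs are the Cr(a) values of the five-arc example.

```python
def feasible_protect_counts(cpv, cr, n_c):
    """Values of N_p with n_c <= N_p <= |E| whose N_p cheapest protect
    variable costs sum to less than the total recovery variable cost."""
    recovery_total = sum(cr)
    cheapest_first = sorted(cpv)
    feasible, running = [], 0
    for n_p in range(1, len(cpv) + 1):
        running += cheapest_first[n_p - 1]   # least sum of N_p protection costs
        if n_p >= n_c and running < recovery_total:
            feasible.append(n_p)
    return feasible


def attack_counts(n_c, m):
    """Attack sizes from n_c to 2 * n_c; capping at m arcs is an added safeguard."""
    return list(range(n_c, min(2 * n_c, m) + 1))


cr = [30, 40, 30, 40, 50]    # full recovery costs Cr(a) of the five arcs
cpv = [35, 45, 35, 45, 55]   # hypothetical protect variable costs
print(feasible_protect_counts(cpv, cr, n_c=2))
print(attack_counts(n_c=2, m=5))
```

With these illustrative costs, protecting all five arcs is excluded because it would cost more than recovering all of them, which mirrors the argument that the protector selects at most N p arcs rather than the whole network.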

Assumptions
The following are the assumptions used in the research:
1) The arcs may be broken, but the nodes will not.
2) There are two states of each arc: working or failed.
3) There are no parallel arcs or loops in the network.
4) The success probabilities of the arcs are independent of each other.
5) The network can recover perfectly to its original status or recover imperfectly to its available status with corresponding costs.
6) The attacker can see the network structure, but does not know the protection strategy.
7) The main purpose of the attacker is to disconnect the network with the least attack on arcs.
8) Both the protector and the attacker know that the protected arc will not be broken when attacked.
9) Only the attacked arc will be recovered.
10) This paper only considers the cost of our side (protector and recoverer), not the cost of the attacker.
Relevant explanations about these assumptions are provided as follows: The failure state usually occurs in the arc component or the node component in the network system. Table 1 summarizes the location of the failure status of different studies. Also, this study assumes that the failure will occur on the arc as assumption 1.
As can be seen from Table 1, most of the studies (Pant et al., 2014; González et al., 2016; Hosseini & Barker, 2016; Hosseini et al., 2016b; Liu et al., 2016; Monroe et al., 2018; Rocco et al., 2018; Gomez & Baker, 2019; Hsieh & Feng, 2020; Jia & Zhang, 2020; Sharma et al., 2020; Smith et al., 2020; Su & Yeh, 2020; Zhu et al., 2021) discussed the failure state of network components as a binary state. If the component failure status is a Boolean value (i.e. two states, False and True), then once a component fails, its capacity, whether deterministic or stochastic, drops from the original capacity to a level at which the network system cannot work normally. Therefore, these studies do not specifically discuss or emphasize capacity. Accordingly, this study also assumes that each arc component has two states, working or failed, as assumption 2.
Moreover, among the studies in which the failure state of network components is binary, some explicitly define the component state as binary, whereas others do not state it. Therefore, this study explicitly defines the component state as a binary state. In addition, Xu et al. (2020) proposed a multistate component failure state and a multistate component capacity. Also, Zhu et al. (2021) is the only known study that discusses component capacity and flow in depth, and it proposes a resilient recovery strategy that redistributes flow when overloading is allowed.
Assumptions 3 and 4 are the basic assumptions for the networks in this research. In practice, a failed component may not be perfectly recovered, so this research assumes that the network can be recovered either perfectly or imperfectly, as assumption 5. Assumptions 6-9 define the basic rules of network protection, attack, and recovery in this paper. Also, we focus on the cost of our side (protector and recoverer), as assumption 10.

Solution Algorithm
The time-related BATRA we propose is based on the BATRA proposed by Su and Yeh (2020). Before generating protection, attack, and recovery strategies, we use BATRA to calculate the importance weight of the arcs in the network. Then, we follow the logic of the binary-addition tree in BATRA and BAT to generate the strategies sequentially. Finally, we use the dynamic resilience performance indicator we propose to evaluate each strategy combination. To make the proposed time-related BATRA easier to understand, this section first provides the algorithm background related to BATRA and BAT, then introduces the proposed time-related BATRA, and finally illustrates it with a simple example.

Algorithm background
After BAT was formally proposed by Yeh (2020a), its easy customization and wide applicability have led to many studies and extended applications based on it. For example, there are BAT and maximum-state PageRank for predicting and modeling wildfire propagation areas (Yeh & Kuo, 2020), the bounded BAT for binary-state network reliability problems (Yeh, 2020b), an improved and enhanced BAT for the reliability of the acyclic multistate information network, the BAT-based algorithm for the one-batch preempt multistate multirework network (Hao et al., 2021), the all-pairs BAT for calculating homogeneity-arc binary-state undirected network reliability, the BATRA for the static network resilience issue (Su & Yeh, 2020), and the time-related BATRA for the dynamic network resilience issue proposed by this research.
BAT mainly contains four parts, namely the binary-addition tree, the Path-based Layered-Search Algorithm (PLSA), the reduction methods, and the probability calculations of the connected vectors (Yeh, 2020a). The binary-addition tree lists all possible state vectors by binary addition. Then, PLSA filters out the connected vectors from the state vectors found. The reduction methods are used to reduce unnecessary calculations for unconnected state vectors in the binary-addition tree or PLSA. Finally, the event probability of each connected state vector is calculated, and these probabilities are summed to obtain the network reliability.
Among these parts, the binary-addition tree is the core idea of BAT and the calculation logic we emulate. Because the binary-addition tree can generate state vectors with simple binary logic, it is extended to generate protection, attack, and recovery strategies. The other parts of BAT effectively assist us in calculating the network reliability. In addition, BAT has also been extended to multistate networks (Hao et al., 2021); in other words, a multistate BAT can generate multistate vectors. Such a multistate BAT is also used in this study to generate decision variables in the recovery strategy. Moreover, these recent studies, including ours, adopt BAT because it can directly and efficiently obtain the network reliability (Yeh, 2020a), which is an NP-hard problem (Colbourn, 1987; Aven, 1988; Levitin, 2005; Hao et al., 2020, 2021; Su & Yeh, 2020; Yeh, 2020a, b, 2021).
The time-related BATRA we propose is based on BATRA, and BATRA is developed based on BAT (Su & Yeh, 2020). First, BATRA inherits the calculation methods of BAT to obtain the reliability of the original network. Next, BATRA uses the binary-addition tree idea in BAT to enumerate all the state vectors in which the network may be disrupted and all the corresponding recovery state vectors, and then calculates the network reliability of each corresponding state. Finally, it calculates the corresponding recovery cost, component importance, and static network resilience. It is worth noting that BATRA lists all state vectors; that is, the constraints in Section 2.3 of this study are not considered by BATRA. Therefore, the state vectors considered in BATRA include many extreme network disruption states and ineffective recovery strategies. Improving BATRA in this respect is the focus of this study.
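The idea of listing state vectors "in a binary adding way" can be sketched as a counter that repeatedly adds 1 to a binary vector and propagates the carry. This is a minimal illustration of the idea, not the published BAT pseudocode; the function name `binary_addition_vectors` is hypothetical, and the reduction methods and PLSA are omitted.

```python
def binary_addition_vectors(m):
    """Yield every binary state vector of length m by repeatedly adding 1."""
    x = [0] * m
    yield tuple(x)
    while True:
        i = m - 1
        # Propagate the carry: trailing 1s become 0s.
        while i >= 0 and x[i] == 1:
            x[i] = 0
            i -= 1
        if i < 0:          # overflow: every vector has been listed
            return
        x[i] = 1           # the first 0 from the right becomes 1
        yield tuple(x)


vectors = list(binary_addition_vectors(3))
print(len(vectors))   # 2**3 state vectors
print(vectors[:3])    # the enumeration starts from the all-zero vector
```

Only one vector is kept in memory at a time, which matches the description of the binary-addition tree as simple binary logic that is easy to customize.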

Proposed time-related BATRA
The procedure of the proposed time-related BATRA is presented as the following Algorithm 1.
Algorithm 1: Procedure of the time-related BATRA
Input: G(V, E, D), arc protect and recovery costs.
Output: Network protect, attack, and recovery strategies with dynamic network resilience and costs.
STEP A. Apply BATRA algorithm to find the importance weight of each arc.
STEP A1. Obtain original network reliability R 0 .
STEP A2. Obtain vulnerability R d .
STEP A3. Obtain recoverability R r .
STEP A4. Calculate importance weight of each arc by equation (1).
STEP B. Decide total number of protected arcs and the protected arcs with protect costs.
STEP C. Decide total number of attacked arcs and the attacked arcs.
STEP D. Decide total number of recovered arcs and the number of arcs recovered per unit time with recovery costs. Then obtain the corresponding network resilience by equation (3).
In the beginning, it is necessary to provide algorithm information about the network, including the network structure G(V, E, D), the cost of protecting the arcs, and the cost of recovering the arcs. The algorithm is expected to produce results for decision makers containing protect, attack, and recovery strategies with dynamic network resilience and costs. The steps of the algorithm are described as follows.
STEP A is based on BATRA and calculates the importance weight of the arcs and the reliability of the original network. STEP A consists of STEP A1 to STEP A4. STEP A1 is based on BAT: it reduces the number of impossible solutions and connectivity verifications, implements the binary-addition tree to list state vectors, applies PLSA to verify the connectivity of the state vectors found, and obtains the original network reliability. STEP A2 to STEP A4 follow BATRA to calculate the importance weight of each arc by the static network resilience performance indicator in equation (1), where R d is the vulnerability of the disrupted network and R r is the recoverability of the recovered network. In this study, R 0 , R d , and R r are calculated in the same way as in BATRA (Su & Yeh, 2020).
STEP B to STEP D are developed based on the computation results obtained from STEP A and the logic of the binary-addition tree. STEP B decides the total number of protected arcs based on equation (5), and then lists the protected arcs with protect costs by using the importance weights of arcs obtained from STEP A. STEP C decides the total number of attacked arcs based on equation (7), and then lists the attacked arcs by using the binary-addition tree. STEP D decides the total number of recovered arcs based on equation (9), determines the number of arcs recovered per unit time based on equation (10), and lists the recovered arcs with recovery costs by using the binary-addition tree. Then, it obtains the corresponding dynamic network resilience by using equation (3). As a result, the time-related BATRA provides a decision-making process that is easy for decision makers to understand and implement. Also, based on the final output, i.e. the dynamic network resilience and the total cost required, decision makers can choose a strategy that meets their needs.

Illustrated example
The proposed time-related BATRA algorithm is illustrated and detailed through the example network shown in Fig. 2. Figure 2 is a binary-state network with four nodes and nine arcs (Yeh, 2020a). Arc success probabilities, arc protect costs, and arc recovery costs are given for Fig. 2. The arc success probabilities are provided in Table 2. The arc protect cost consists of a variable cost for each arc and a fixed cost that depends on N p ; both are given in Table 3. The arc recovery cost consists of a variable cost for each arc, given in Table 2, and a daily fixed cost that depends on T r , given in Table 3. The fixed costs increase as the number of protected arcs or the number of arcs recovered per day increases. Notice that the tables related to Fig. 2 list only five arcs, since in STEP A1 the original nine arcs are reduced to five arcs.

Table 2: Arc success probabilities and recovery costs for Fig. 2.
a              P(a)   Cr(a)   P'(a)   Cr'(a)
a 1 = e 1,2    0.8    30      0.6     20
a 2 = e 1,3    0.9    40      0.7     30
a 3 = e 2,3    0.7    30      0.5     20
a 4 = e 2,4    0.8    40      0.6     20
a 5 = e 3,4    0.9    50      0.7     40

The procedure of the solution is as follows.
Solution:
STEP A. Apply the BATRA algorithm to find the importance weight of each arc.

STEP A1. Obtain original network reliability R 0 .
At the very beginning, BATRA, which is based on BAT, performs the following in sequence: reducing the number of impossible solutions (i.e. Fig. 2 is reduced to Fig. 3), listing all 2^5 = 32 possible state vectors, reducing the number of connectivity verifications, verifying the connectivity of the state vectors, and computing the reliability of the 15 connected vectors found and hence of the original network. We obtain R 0 = 0.94168.
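The enumeration in STEP A1 can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the study: it assumes the reduced network of Fig. 3 consists of the five directed arcs and success probabilities listed in Table 2, with node 1 as the source and node 4 as the sink, and it omits the steps that reduce the number of connectivity verifications. The function names are chosen here for illustration only.

```python
from math import prod

# Reduced network (assumed from Table 2): directed arcs (tail, head) and P(a).
ARCS = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
PROB = [0.8, 0.9, 0.7, 0.8, 0.9]
SOURCE, SINK = 1, 4


def bat_vectors(m):
    """Yield all 2**m binary state vectors by repeated binary addition."""
    x = [0] * m
    yield tuple(x)
    while True:
        i = 0
        # Carry: leading ones are reset to zero until the first zero is found.
        while i < m and x[i] == 1:
            x[i] = 0
            i += 1
        if i == m:  # overflow: every vector has been generated
            return
        x[i] = 1
        yield tuple(x)


def is_connected(x):
    """Check whether SINK is reachable from SOURCE through working arcs."""
    reached, stack = {SOURCE}, [SOURCE]
    while stack:
        node = stack.pop()
        for (tail, head), state in zip(ARCS, x):
            if state == 1 and tail == node and head not in reached:
                reached.add(head)
                stack.append(head)
    return SINK in reached


def reliability(prob):
    """Return the number of connected vectors and the sum of their probabilities."""
    connected = [x for x in bat_vectors(len(ARCS)) if is_connected(x)]
    total = sum(
        prod(p if s == 1 else 1.0 - p for p, s in zip(prob, x))
        for x in connected
    )
    return len(connected), total


n_connected, r0 = reliability(PROB)
print(n_connected, round(r0, 5))  # prints: 15 0.94168
```

Under these assumptions the sketch reproduces the 15 connected vectors and R 0 = 0.94168 reported for the example.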

STEP A2. Obtain vulnerability R d .
Based on the binary-addition tree, five scenarios can be found, each with a different single failed arc. Then, by reusing the 15 connected vectors found in STEP A1, the network vulnerability R d can be obtained for these five scenarios. Note that only one failed arc in the network is considered here.
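As a rough illustration of STEP A2, the sketch below recomputes the network reliability for each single-arc failure while reusing the connected vectors found once. It assumes that a failed arc can be modelled by setting its success probability to zero, and it uses `itertools.product` in place of the binary-addition tree for brevity; both choices are assumptions of this sketch, not details confirmed by the text.

```python
from itertools import product
from math import prod

# Reduced network (assumed from Table 2): directed arcs (tail, head) and P(a).
ARCS = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
PROB = [0.8, 0.9, 0.7, 0.8, 0.9]


def connected(x, source=1, sink=4):
    """Check whether the sink is reachable from the source through working arcs."""
    reached, stack = {source}, [source]
    while stack:
        node = stack.pop()
        for (tail, head), state in zip(ARCS, x):
            if state and tail == node and head not in reached:
                reached.add(head)
                stack.append(head)
    return sink in reached


# The connected vectors are found once (STEP A1) and reused for every scenario.
CONNECTED = [x for x in product((0, 1), repeat=len(ARCS)) if connected(x)]


def reliability(prob):
    """Sum the probabilities of the connected state vectors."""
    return sum(
        prod(p if s else 1.0 - p for p, s in zip(prob, x)) for x in CONNECTED
    )


def single_failure_reliabilities(prob):
    """Return R d for each scenario in which exactly one arc has failed."""
    result = {}
    for k in range(len(prob)):
        degraded = list(prob)
        degraded[k] = 0.0  # assumption: a failed arc never works
        result[f"a{k + 1}"] = reliability(degraded)
    return result


r_d = single_failure_reliabilities(PROB)
for arc, value in r_d.items():
    print(arc, round(value, 4))
```

Because the list of connected vectors does not depend on the arc probabilities, each scenario costs only one pass over those vectors.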

STEP A3. Obtain recoverability R r .
Based on the binary-addition tree, different recovery strategies can be generated for each single-failed-arc scenario. That is, for each scenario, there are two related recovery strategies: "not to recover" and "to recover." Note that recovering the arc incurs a cost. Therefore, when cost is considered, leaving the arc unrecovered while the network continues to operate is also a type of recovery strategy. By reusing the 15 connected vectors found in STEP A1 and choosing the strategy with the largest ratio of recovery reliability to cost, the network recoverability R r can be obtained for these five scenarios.

STEP A4. Calculate importance weight of each arc using equation (1).
After obtaining R 0, R d, and R r, the importance weight of each arc can be calculated using equation (1). The detailed numbers used to find the importance weight of each arc in STEP A2 to STEP A4 are summarized in Table 4. The column "I" in Table 4 is the result of equation (1); it gives the importance weight of each arc and is used in the following steps.

STEP B. Decide total number of protected arcs and the protected arcs with protect costs.
Based on the binary-addition tree table from STEP A1, we can decide the total number of protected arcs and the protected arcs with their protection costs. In STEP A1, we can also obtain n c, the least number of arcs whose failure disconnects the network. Therefore, the protector (our side) will protect at least n c arcs. However, since protecting arcs incurs costs, a smart protector will choose a number of protected arcs, N p, for which the least sum of protection costs is less than the sum of the recovery variable costs of the five arcs, which is the constraint in equation (5).
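The selection rule for N p described above can be sketched as follows. The exact form of equations (5) and (6) is not reproduced here, so the function only encodes the verbal rule: N p must be at least n c, and the cheapest way of protecting N p arcs must cost less than the total recovery variable cost. The numbers are those quoted in the worked example for Fig. 3 (a total recovery variable cost of 190 and minimum protection costs of 55, 105, and 220 for two, three, and four arcs); the function name is hypothetical.

```python
def feasible_protection_counts(n_c, min_protect_cost, total_recovery_cost):
    """Return every N p >= n c whose cheapest protection cost is below
    the total recovery variable cost of all arcs."""
    return [
        n_p
        for n_p, cost in sorted(min_protect_cost.items())
        if n_p >= n_c and cost < total_recovery_cost
    ]


# Minimum protection cost (variable plus fixed) for each candidate N p,
# as quoted in the worked example.
min_cost = {2: 55, 3: 105, 4: 220}

feasible = feasible_protection_counts(
    n_c=2, min_protect_cost=min_cost, total_recovery_cost=190
)
print(feasible)  # prints: [2, 3]
```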
For example, n c = 2 in Fig. 3, so by equation (6) at least two arcs must be protected. The sum of the full recovery variable costs of the five arcs in Table 2 is 190. According to Table 3, the sum of the variable and fixed protection costs is at least 55 for two arcs, at least 105 for three arcs, and at least 220 for four arcs. Since the minimum cost of protecting four arcs, 220, is greater than the maximum recovery variable cost, 190, the protector will not protect four or more arcs, but only two or three arcs. Therefore, we first list the possible numbers of protected arcs. Then, according to the number of protected arcs and the importance weights of the arcs, the combination of protected arcs is listed and the corresponding protection cost is calculated. For example, in order of importance weight from high to low, the three arcs with the highest importance weight are a 2, a 5, and a 1. When protecting two arcs, a smart protector will give priority to protecting a 2 and a 5. When protecting three arcs, a smart protector will give priority to protecting a 2, a 5, and a 1, i.e. the state vector (1, 1, 0, 0, 1) with a total protection cost of 110. Therefore, there are two protection strategies with corresponding protection costs, as shown in Table 5. In Table 5, N p is the number of protected arcs, Xp l is the state vector for protection strategy l, and sumCp is the total protection cost.

STEP C. Decide total number of attacked arcs and the attacked arcs.
Based on the binary-addition tree table from STEP A1, we can decide the total number of attacked arcs and the attacked arcs. In STEP A1, we can also obtain n c. We assume that the attacker (our enemy) is smart and knows the network structure and the number of arcs in the network, but knows neither the success probabilities of the arcs nor the protection strategy of the protector. In other words, the attacker does not know the probability of each arc and therefore cannot know which arc is the most important. However, the attacker can calculate n c independently to determine the number of arcs to attack. The main purpose of the attacker is to disconnect the network by attacking as few arcs as possible. We believe that the assumption of a smart attacker is the more reasonable one.
Both the protector and the attacker know that a protected arc will not be broken when attacked. Therefore, a smart attacker will attack only n c to 2 * n c arcs, which is the constraint in equation (7), where n c is the minimum number of arcs whose failure disconnects the network and 2 * n c reflects the attacker's belief that the protector will protect n c arcs. To increase the chance of disconnecting the network, the attacker may attack more than n c arcs. Therefore, for each protection strategy found in the previous step, the corresponding attack strategies of all possible combinations are generated according to the number of arcs attacked.
Then, after the entries 0 and 1 are converted to 0 and the entry −1 is converted to 1, the resulting state vector corresponds to a failed-arc scenario of the kind computed in STEP A. In other words, the previous calculation results can be reused to save the time needed to calculate the reliability of the network after the recovery strategy is implemented. For example, under the two protection strategies in the previous step and the different attack strategies, the transformed state vector for network failed status v, Xd v ', is shown in Table 7. In Table 7, N f is the number of failed arcs and is subject to the constraint in equation (8).
For example, when two arcs are protected, the number of arcs that may fail under an attack strategy is between 0 and 3, as shown in the column "For Xp 1 with all Xa l" in Table 7. When three arcs are protected, the number of arcs that may fail under an attack strategy is between 0 and 2, as shown in the column "For Xp 2 with all Xa l" in Table 7.
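The attack enumeration and the 0/1/−1 conversion in STEP C can be illustrated with the short sketch below. It assumes that the value being converted is the entrywise difference between the protection vector and the attack vector, so that −1 marks an arc that is attacked but not protected; this reading is an assumption of the sketch. `itertools.combinations` stands in for the binary-addition tree, and the two protection vectors are the ones described in STEP B.

```python
from itertools import combinations


def attack_vectors(m, n_c):
    """Yield all attack vectors that hit between n c and 2 * n c of the m arcs."""
    for size in range(n_c, min(2 * n_c, m) + 1):
        for hit in combinations(range(m), size):
            yield tuple(1 if k in hit else 0 for k in range(m))


def failed_vector(xp, xa):
    """Mark the arcs that fail: attacked (xa = 1) and not protected (xp = 0).

    Assumption: the difference xp - xa is -1 exactly in that case, while
    the values 0 and 1 both map to 0 (arc still working).
    """
    return tuple(1 if p - a == -1 else 0 for p, a in zip(xp, xa))


XP1 = (0, 1, 0, 0, 1)  # protect a2 and a5
XP2 = (1, 1, 0, 0, 1)  # protect a1, a2, and a5

ranges = {}
for name, xp in (("Xp1", XP1), ("Xp2", XP2)):
    counts = {sum(failed_vector(xp, xa)) for xa in attack_vectors(5, n_c=2)}
    ranges[name] = (min(counts), max(counts))
    print(name, min(counts), max(counts))
# prints:
# Xp1 0 3
# Xp2 0 2
```

The resulting ranges of N f agree with the ranges stated for the two protection strategies.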
STEP D. Decide total number of recovered arcs, and the number of arcs recovered per unit time with recovery costs, and then obtain the corresponding network resilience using equation (3).
In this step, we need to decide the total number of recovered arcs and the number of arcs recovered per unit time, together with the recovery costs. Then, we can obtain the corresponding network resilience using equation (3). From the protection and attack strategies in the previous steps, we can obtain the arcs that need to be recovered and then formulate the recovery strategy. The recovery strategy includes the total number of arcs recovered and the number of arcs recovered per unit time, both of which incur recovery costs. The costs related to recovering the arcs are shown in Tables 2 and 3.
In the previous step, the number of possible failed arcs was obtained for each protection strategy. The multistate binary-addition tree can then be used to list the total number of recovered arcs, N r, and the number of arcs recovered per unit time, T r, and thus to generate the different recovery strategies, where N r = 1, 2, ..., N f and T r = 1, 2, ..., N r are the constraints in equations (9) and (10).
After generating a recovery strategy using the multistate binary-addition tree, we can intelligently recover the arcs according to the importance weights of the arcs obtained by STEP A, and calculate the dynamic network resilience according to the proposed formula in equation (3).
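To make the recovery-strategy enumeration concrete, the sketch below lists every (N r, T r) pair allowed by the constraints in equations (9) and (10) and builds the corresponding recovery schedule in order of importance. The function names are hypothetical, the failed-arc set is assumed to be the N f = 3 scenario in which a 1, a 3, and a 4 fail, and the computation of the dynamic resilience by equation (3) is deliberately left out because that formula is not reproduced here.

```python
def recovery_strategies(n_f):
    """Return all (N r, T r) pairs with 1 <= N r <= N f and 1 <= T r <= N r."""
    return [(n_r, t_r) for n_r in range(1, n_f + 1) for t_r in range(1, n_r + 1)]


def recovery_schedule(failed, order, n_r, t_r):
    """Recover the n_r most important failed arcs, t_r arcs per unit time.

    Returns a list whose element t holds the arcs recovered in time step t + 1.
    """
    ranked = [arc for arc in order if arc in failed][:n_r]
    return [ranked[i:i + t_r] for i in range(0, len(ranked), t_r)]


ORDER = ["a2", "a5", "a1", "a4", "a3"]  # importance order from STEP A
FAILED = {"a1", "a3", "a4"}             # assumed N f = 3 scenario

strategies = recovery_strategies(len(FAILED))
for n_r, t_r in strategies:
    print(n_r, t_r, recovery_schedule(FAILED, ORDER, n_r, t_r))
```

For N f = 3 this gives six (N r, T r) pairs; for instance, N r = 3 with T r = 2 recovers a 1 and a 4 in the first time unit and a 3 in the second.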
For example, following STEP C, Tables 8 and 9 list different recovery strategies together with the changes in network reliability during the recovery process, R d (t), the total recovery costs, sumCr, and the dynamic network resilience, ϕ D. According to the importance weights of the arcs calculated in STEP A, the order of arc recovery is a 2, a 5, a 1, a 4, and a 3. Table 8 lists the full recovery strategies, and Table 9 lists the partial recovery strategies. In Tables 8 and 9, since the total number of failed arcs N f is 3, the maximum value of N r and T r is 3. Tables 8 and 9 show that when there are fewer recovery stages and more recovered arcs, the dynamic network resilience is greater, but the recovery cost is also greater.
At this step, we can obtain different protection strategies, different attack strategies, and dynamic network resilience under different recovery strategies, as well as the cost of our protection and recovery of the network. The decision maker can get the result of the combination of the importance weight of the arcs and all the strategies, and make relevant decisions. For example, if only based on the importance of arcs, decision makers can quickly determine the priority of protecting and recovering arcs. Based on the results of all the strategy combinations, we can look at all strategies and get a strategy combination (N p , N r , and T r ) with high dynamic network resilience and low cost.
For example, if the dynamic network resilience is required to be as large as possible, the top 10 full recovery and partial recovery strategies can be obtained as shown in Tables 10 and 11, respectively. Table 10 shows that only the 4th, 7th, 8th, and 10th full recovery strategies cost 150 or less. Table 11 shows that only the 1st, 5th, and 10th partial recovery strategies cost more than 150. Note that ϕ D = 200 means that the network does not fail when it is attacked under the protection strategy, so there is no need to recover the network.
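The kind of selection described above, ranking strategies by dynamic network resilience and checking them against a cost threshold, can be sketched as follows. The record type, the function name, and especially the numerical values are placeholders introduced purely for illustration; they are not taken from Tables 10 or 11.

```python
from typing import NamedTuple


class Strategy(NamedTuple):
    n_p: int           # number of protected arcs
    n_r: int           # total number of recovered arcs
    t_r: int           # arcs recovered per unit time
    resilience: float  # dynamic network resilience
    sum_cost: float    # protection cost plus recovery cost


def top_strategies(strategies, budget, k=10):
    """Return the k affordable strategies with the highest resilience.

    Ties in resilience are broken in favour of the lower total cost.
    """
    affordable = [s for s in strategies if s.sum_cost <= budget]
    return sorted(affordable, key=lambda s: (-s.resilience, s.sum_cost))[:k]


# Placeholder records for illustration only; NOT values from the paper.
candidates = [
    Strategy(2, 3, 3, 180.0, 175.0),
    Strategy(2, 2, 2, 150.0, 140.0),
    Strategy(3, 2, 1, 120.0, 150.0),
    Strategy(3, 1, 1, 90.0, 130.0),
]

best = top_strategies(candidates, budget=150, k=2)
for s in best:
    print(s)
```

Swapping the sort key or the filter gives the other decision modes, such as minimizing cost subject to a minimum resilience.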

Case Study
This study uses the proposed time-related BATRA to analyse a case: the wildfire wireless detection sensor network, which is an important infrastructure network. The example network in Fig. 2, with only four nodes and five arcs after reduction, is also called the bridge problem and can be regarded as the basic problem of any network system. In contrast, the case study in Fig. 4 is a wildfire wireless detection sensor network with 11 arcs and 8 nodes. Figure 4 is a simplified drawing based on the corresponding positions of the wireless sensors or related equipment in the forest; the distances between the sensors or devices are not presented. Wildfire wireless detection sensor networks are usually bidirectional. However, only the transmission path of the message when a fire occurs at source node 1 is considered here, so only that transmission path is shown in Fig. 4. After a fire breaks out, the relevant data (e.g. temperature, humidity, grace, etc.) collected near node 1 pass through nodes 2, 3, ..., 7 and finally reach sink node 8. Node 8 can be seen as the equipment with which forest managers observe wildfires. Arc success probability is provided in Table 16. The arc protection cost consists of a variable cost for each arc and a fixed cost that depends on N p, both given in Table 17. The arc recovery cost consists of a variable cost for each arc, given in Table 16, and a daily fixed cost that depends on T r, given in Table 17. The unit of the protection and recovery costs is 10^3 US dollars.
The proposed time-related BATRA is applied to analyse the dynamic network resilience. The model was run on a personal computer with an Intel Core-i7 CPU @ 2.20 GHz and 8 GB of RAM. The code was implemented in RStudio Version 1.2.1335. The following is the analysis procedure.
Input: G (V, E, D), arc protect and recovery costs in Fig. 4 and Tables 16 and 17.
Output: Network protect, attack, and recovery strategies with dynamic network resilience and costs.
STEP A. Apply the BATRA algorithm to find the importance weight of each arc.
STEP A1. Obtain original network reliability R 0.
In Fig. 4, the binary-addition tree finds 2^11 = 2048 different possible state vectors. After PLSA, 639 connected state vectors are verified. Then, the original network reliability R 0 = 0.99188 is obtained from the reliability calculation over the connected state vectors. We can also obtain n p = 3 and n c = 2 in Fig. 4.

STEP A2. Obtain vulnerability R d.
There are 11 scenarios with different failed arcs in Fig. 4. By using the connected state vectors derived from STEP A1, R d for the different failed arcs can be obtained, as summarized in Table 18.

STEP A3. Obtain recoverability R r.
By considering the full or partial recovery variable cost and the largest ratio of recovery reliability to cost, we can obtain R r as shown in Table 18.

STEP A4. Calculate importance weight of each arc using equation (1).
After obtaining R 0, R d, and R r, the importance weight of each arc, I, can be calculated using equation (1), as shown in Table 18. From Table 18, the order of importance of the arcs from high to low is a 4, a 8, a 9, a 10, a 11, a 5, a 1, a 2, a 6, a 7, and a 3.

STEP B. Decide total number of protected arcs and the protected arcs with protect costs.
After obtaining I, we can decide the total number of protected arcs and the protected arcs with their protection costs using equations (5) and (6). Table 19 shows all the Xp i obtained in Fig. 4. Table 19 lists the five possible values of N p, from 2 to 6, their corresponding Xp i, and the related sumCp. For example, in order of importance weight from high to low, if N p = 2, the two arcs with the highest importance weight are a 4 and a 8. If N p = 3, the three arcs with the highest importance weight are a 4, a 8, and a 9, and so on.

STEP C. Decide total number of attacked arcs and the attacked arcs.
STEP D. Decide total number of recovered arcs, and the number of arcs recovered per unit time with recovery costs, and then obtain the corresponding network resilience using equation (3).
According to N f obtained in the previous step, we can find N r and T r using equations (9) and (10), and their combinations by using the multistate binary-addition tree. Then, we can intelligently recover the arcs according to the importance weights obtained in STEP A, and calculate the dynamic network resilience according to the proposed formula in equation (3). In total, the combinations of N p, N a, N r, and T r yield 9405 different recovery strategies, each with its dynamic network resilience, ϕ D, and total recovery cost, sumCr. The total CPU running time is 493.44 seconds.
For example, Tables 21 and 22 show the recovery strategies for Xd 66 ' under Xp 1 to Xp 5, where Xd 66 ' = (0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0) represents failed a 5, a 6, and a 7, and Xp 1 to Xp 5 are shown in Table 19. Table 21 lists the full recovery strategies, and Table 22 lists the partial recovery strategies. Note that to save table space, we abbreviate (0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0) as 00001110000 in Tables 21 and 22. According to the importance weights of the arcs calculated in STEP A, the order of arc recovery for Xd 66 ' is a 5, a 6, and a 7, which can be seen from the changes of Xr vw (t). Each Xr vw (t) has its corresponding R d (t). In Tables 21 and 22, since the total number of failed arcs N f is 3, the maximum value of N r and T r is 3. Tables 21 and 22 show that when both T r and N r are greater, the dynamic network resilience is greater, but the recovery cost is also greater.
At this step, we can obtain different protection strategies, attack strategies, and dynamic network resilience under different recovery strategies, as well as the cost of our protection and recovery of the network. The decision maker can get the result of the combination of the importance weight of the arcs and all the strategies, and make relevant decisions. For example, if only based on the importance of arcs, decision makers can quickly determine the priority of protecting and recovering arcs.
If the dynamic network resilience is required to be as large as possible and the cost as small as possible, the top 10 full recovery and partial recovery strategies can be obtained as shown in Tables 23 and 24, respectively. Tables 23 and 24 show that ϕ D attains its largest value, 400.103 for full recovery or 353.7295 for partial recovery, when both N r and T r take their maximum value of 4. The maximum value of N r and T r is 4 because, in this example, the maximum value of N f is 4. When N r and T r are both 4, the total cost still varies with Xd v ' or N p, because it is affected by the cost of the arcs that need to be recovered or protected. The tables also show that N a takes the values 2, 3, or 4, because different combinations of N a and N p may still yield the same Xd v '. However, observing only the top 10 strategies in this example cannot support more than the above discussion. Therefore, we next present the results of all 9405 strategy combinations via figures and discuss them further.
Figures 5 and 6 provide scatter diagrams of dynamic network resilience ϕ D and sumCost with T r , N p , sumCp, N r , sumCr, and N f in full and partial recovery strategies, respectively. It can be seen from Fig. 5a that when the resilience is above 100, the resilience is obviously affected by T r . For example, when the resilience is around 100, T r is all 1. When the resilience is around 200, T r is all 2. When the resilience is around 300, T r is all 3. When the resilience is around 400, T r is all 4. However, when the resilience is below 100, the effect of T r on the resilience is not significant.
Regardless of the value of resilience, the effect of T r on sumCost is not significant.
From Fig. 5b and c, we can see that sumCost is clearly affected by N p or sumCp. For example, when sumCost is between about 20 and 40, N p is 2 and sumCp is 13.8. When sumCost is between about 25 and 45, N p is 3 and sumCp is 19.8. When sumCost is between about 35 and 55, N p is 4 and sumCp is 29.3. When sumCost is between about 45 and 65, N p is 5 and sumCp is 39.8. When sumCost is between about 55 and 80, N p is 6 and sumCp is 49.8. N p and sumCp have a high degree of positive correlation, because once N p is determined, sumCp is determined accordingly. Regardless of the value of sumCost, the effect of N p or sumCp on resilience is not significant.
From Fig. 5d, e, and f, we can see that resilience is also affected by N f, N r, or sumCr, but their effect is less pronounced than that of T r. When N r is 4, the resilience is significantly higher than when N r is less than 4; that is, recovering four arcs significantly improves resilience. Also, in this example, the protection cost sumCp affects sumCost to a greater extent than the recovery cost sumCr. Therefore, for this example, recovering more arcs increases the resilience without significantly increasing the total cost.
N f , N r , and sumCr have a high degree of positive correlation, because N f determines the maximum value of N r , and when N r is determined, sumCr will also be determined. Regardless of the value of sumCost, the effect of N f , N r , or sumCr on resilience is not significant. In addition, when the resilience is less than 100, there are many points where N f is 4. The reason is that at these points, N f is large but N r is not large enough. Therefore, the network is not effectively recovered to an operational state, resulting in low resilience.
The relationships between the variables or parameters shown in Fig. 6 are consistent with the previous discussion of Fig. 5. Moreover, the top 10 solutions found in Tables 23 and 24 all appear in Figs 5 and 6, with a maximum resilience of 400.103 and 353.7295, respectively.

Conclusion
In this article, we proposed a new performance measure, oriented to the degree of network reliability recovery, to quantify the resilience of the network system, and a dynamic resilience model with more decision variables, parameters, and cost formulations for critical infrastructure network systems against disruptive events. The proposed dynamic resilience performance measure does not need to assume that the system is recovered to its original state, and is therefore more in line with practical applications. In addition, the proposed time-related BATRA was applied to overcome the complexity of the problem model, obtain protection, attack, and recovery strategies, and calculate the final dynamic network resilience and related cost. An example was provided to illustrate the analysis process of the proposed time-related BATRA. This article also provided a case study of the wildfire wireless sensor network. In the illustrated example and the case study, the proposed time-related BATRA not only assists decision makers in the allocation of protection and recovery resources, but also provides the best solutions under different objectives: maximizing resilience, minimizing cost, maximizing resilience while minimizing cost, or maximizing resilience or minimizing cost under given cost or resilience constraints.
In addition, in the case study, we discussed the impact of the decision variables on the development of protection and recovery strategies and on network resilience and costs. Regardless of the protection or recovery strategy, the importance weights of the arcs determine the priority with which the arcs are protected or recovered. Network resilience is significantly affected by N f, N r, and T r. If N f, N r, and T r are all large, the network usually has high resilience, because the damaged arcs can be recovered effectively and quickly. If N f is large but N r and T r are not large enough, the network usually has low resilience, because the damaged arcs are not recovered effectively and quickly. The total cost consists of protection and recovery costs. If the decision maker wants the cost to be as small as possible and the protection cost accounts for a greater proportion than the recovery cost, then the decision maker should focus on the recovery aspect. Conversely, if the recovery cost accounts for a greater proportion than the protection cost, then the decision maker should focus on the protection aspect. The results of this study can help decision makers formulate protection and recovery strategies to ensure that the infrastructure operates to meet the needs of serving society.
For future research, predictive maintenance for infrastructure can be applied. The computational efficiency for more complex networks can also be further improved. Unlike the exact solution obtained by the method in this study, methods for complex networks tend to seek a good solution close to the exact solution. Because of the huge amount of calculation in complex networks, more advanced algorithms are usually used, such as optimized heuristic algorithms, hidden Markov models, or a more complex and robust percolation theory. In addition, there are many more variables that could be considered in the resilience model. Future research can incorporate further variables or explore the relationships between factors to make the resilience model more complete. Another important issue is multi-objective planning. Such a resilient infrastructure model may need to consider more goals in the future, such as static network resilience, dynamic network resilience, protection cost, recovery cost, and total cost. How to trade off between multiple goals and choose the most suitable solution for decision makers are important issues.