Abstract

Given the increase in cybercrime, cybersecurity analysts (i.e. defenders) are in high demand. Defenders must monitor an organization’s network to evaluate threats and potential breaches into the network. Adversary simulation is commonly used to test defenders’ performance against known threats to organizations. However, it is unclear how effective this training process is in preparing defenders for this highly demanding job. In this paper, we demonstrate how to use adversarial algorithms to investigate defenders’ learning using interactive cyber-defense games. We created an Interactive Defense Game (IDG) that represents a cyber-defense scenario, which requires monitoring of incoming network alerts and allows a defender to analyze, remove, and restore services based on the events observed in a network. The participants in our study faced one of two types of simulated adversaries: a Beeline adversary, a fast, targeted, and informed attacker; and a Meander adversary, a slow attacker that wanders the network until it finds the right target to exploit. Our results suggest that although human defenders initially had more difficulty stopping the Beeline adversary, they learned to stop it by taking advantage of its attack strategy. Participants who played against the Beeline adversary learned to anticipate the adversary’s actions and took more proactive actions, while decreasing their reactive actions. These findings have implications for understanding how to help cybersecurity analysts speed up their training.

Introduction

Rapidly evolving attack capabilities, which deploy increasingly sophisticated cyberattacks of unprecedented speed and scale, require well-trained cybersecurity experts (i.e. defenders, analysts) [1, 2]. Cyber analysts are responsible for protecting an organization’s computer network and digital assets. The job of these defenders consists of a wide variety of network-dependent tasks, including the examination of large numbers of alerts to identify intrusion activities and determine whether a network is under attack, the detection of flaws in the organization’s security, the development of appropriate protections, and, of course, the mitigation of threats. These activities often involve making time-sensitive decisions that may disrupt the organization’s work in order to protect its information.

Typically, cyber wargaming and adversary simulation are used to evaluate defense algorithms and strategies and to train defenders against new threats [3, 4]. Wargaming exercises mimic a potential threat to an organization by using threat intelligence to define what actions and behaviors an adversary may use. Wargaming emulators build scenarios that capture certain tactics, techniques, and procedures, to help test the efficacy of defenses and identify vulnerabilities in the network [5]. Human defenders are also recruited to interact with these adversary-simulation scenarios so that they can learn from the interaction [6, 7].

Despite a growing interest in cyber-defense behaviors in recent years [8–12], our understanding of the cognitive demands faced by cyber analysts is still limited [13]. Many factors in adversarial behavior may influence defense strategies. For example, the aggressor’s personality traits are known to influence their cyberattack behaviors [14, 15]: Machiavellianism was found to be a predictor of stealthy attacks, while narcissism and psychopathy were associated with shorter and more aggressive attacks (i.e. “brute force”).

Human-in-the-loop cyber defense laboratory research is required to study both defensive and offensive cyber operations and to develop training protocols tailored to different types of attack strategies [16]. However, conducting meaningful laboratory research with simulated adversaries to study defender behavior is challenging. Participants with the skills and knowledge required to test highly technical tasks and sophisticated adversaries are hard to find and are often too busy to provide their time to test simulated adversaries [9, 17]. The design of simulated adversaries of high fidelity in terms of techniques also requires extensive threat intelligence collected through long-term tracking and clustering of intrusion activities [18]. Given the continuous evolution of network environments and adversaries, it is also unrealistic to derive a future-proof defense strategy at the granularity of current techniques.

To help mitigate this challenge, researchers have been using simulation tools and simplified games [19]. In the context of cybersecurity, these simplified testbeds are used to study the offensive and defensive sides of cyber deception [11, 20], to understand how the general public classifies phishing emails [15, 21], to investigate how the cybersecurity knowledge of the attacker affects the identification of attacks [22], and to study the behavior of the attacker under different levels of uncertainty about the attacker’s strategy [23]. In this work, we adopt the Intrusion kill chain model [24] to simplify sophisticated cyberattacks into three tactical phases: Establish initial foothold, Propagate through network, and Act on objectives [25]. Correspondingly, countermeasures such as Monitor, Analyze, Remove, and Restore are adopted to disrupt each phase of the attack lifecycle. By pairing defenders with various adversarial strategies constructed from the above tactics, we can learn about the behaviors of human defenders and the processes by which they address different types of attackers and adapt to dynamic network environments.

However, there is a lack of research on the impact of different adversarial strategies on defense behaviors and the development of defense strategies. Most adversarial cybersecurity games rely on game-theoretic approaches to determine the best defense strategies. These methods often consider only a particular adversary and assume that opponents act “rationally” (i.e. exhibit optimizing behavior). They assume that information is available to adversaries rather than uncertain, and they provide individuals with an exact payoff matrix [26, 27]. This misrepresents the reality of the highly dynamic cyber environment, where analysts must work with incomplete and flawed information. While game-theoretic approaches can be useful in determining optimal defense strategies against known attacks, they provide an unrealistic representation of the attacker’s intentions [28–30], and the resulting strategies might ultimately perform poorly in dynamic cyber-defense environments against unfamiliar adversaries [30–32].

Goals and research method

In this research, we address the question of how human defenders behave against different attack strategies and how this affects the emergence of defense strategies. We defined two adversarial strategies in a particular but generic network setting: one adversarial strategy (Meander) is stealthy, and the other (Beeline) is direct and speedy, reflecting two contrasting attack strategies.

In a recent experiment, ref. [33] proposed a cognitive model, based on Instance-Based Learning (IBL) theory [34], that acted as a defender. This model was paired with both the Beeline and Meander adversarial strategies to provide predictions of the potential performance of human defenders. The simulation experiment captured the differences between the attack strategies and their effect on defenders’ outcomes: the Beeline strategy resulted in worse model performance than the Meander strategy, but the IBL defender was able to learn over the course of repeated episodes of the defense task. While this is an interesting prediction, human data were not available to validate these observations.

We designed an Interactive Defense Game (IDG) in a cybersecurity scenario and conducted a laboratory study to test human defense behavior against the two adversarial strategies. Similarly to ref. [33], we expect participants who face a Beeline strategy to have more difficulty defending their network against intrusions than participants who face the Meander strategy; and we also expect that humans will learn over the course of repetitions of the defense task.

Interactive defense game

The IDG is a web-based interactive cyber-defense game developed to study how human defenders make decisions in a cybersecurity situation. The IDG does not require any installation and can be played remotely using a web browser (Demo of the game: http://janus.hss.cmu.edu:8084/). It provides human participants with a graphical interface to observe network events and analyze the information about a computer network, similar to the way Intrusion Detection Systems (IDS) present network events to human defenders. IDS are common tools to monitor the activities on a network and to help detect possible intrusions or attacks [13].

The task of a cyber defender in the IDG

In the IDG, participants play the role of cybersecurity analysts hired by a fictitious manufacturing company to protect their computer network from external malicious activity. The network we use is a simplified version of common corporate network topologies. It is composed of hosts, staff computers, and servers grouped in subnets. Attackers are trying to gain access to the Operational Server (Op_Server0) to steal information and disrupt production. The easiest way for them to do so is to enter the network through one of the staff computers on the first subnet and progressively make their way up to the critical Op_Server0 by gaining administrator access to every host on their way.

Each host on which an attacker gains administrator-level access costs the defender some points. The goal of the defender is to minimize the number of points lost.

To perform this task, the defenders use the IDG interface shown in Fig. 1. They must actively monitor the activity of the network to try to identify malicious activity and take actions to block the progression of the attacker. The hosts of the network are characterized by the subnet to which they belong, an IP address, and a host name. Additionally, the system provides the defenders with two dynamic pieces of information about each host: the Compromise level and the Activity. When targeting a host, the attacker will first try to gain user-level access to the machine, then attempt a privilege escalation to gain administrator-level access, and then progress to the next target in the network. The Compromise level indicates the infection status of the host. The second dynamic element provides information about the last Activity detected by the system, such as scans or exploitation attempts performed by the attacker on this host. However, not all of the attacker’s activities can be detected by the system; more advanced actions, e.g. privilege escalation attempts and their consequences, are not automatically detected. Thus, the defenders have to interpret the observable activity and compromise levels to anticipate future actions of the attackers.
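For concreteness, the per-host information described above can be summarized in a small data model. This is only an illustrative sketch: the class, field names, and example values (including the IP address) are our own and not the IDG’s actual implementation.

```python
from dataclasses import dataclass
from enum import Enum

class Compromise(Enum):
    """Infection status of a host, as displayed to the defender."""
    NONE = 0   # no known intrusion
    USER = 1   # attacker holds user-level access
    ADMIN = 2  # attacker escalated to administrator-level access

@dataclass
class HostView:
    subnet: int                 # subnet the host belongs to (1, 2, or 3)
    ip: str                     # IP address shown in the interface
    name: str                   # host name, e.g. "Op_Server0"
    compromise: Compromise = Compromise.NONE
    activity: str = "None"      # last detected activity, e.g. a scan or exploit

# Example: the critical operational server before any attack
# (the IP address here is purely illustrative).
op_server = HostView(subnet=3, ip="10.0.3.2", name="Op_Server0")
```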

Figure 1. Illustration of the IDG user interface.

Based on these observable elements, defenders can select among a set of actions represented as buttons at the bottom right of the screen: Monitor, Analyze, Remove, and Restore. Human defenders can select a host by clicking on its row in the table and then choose one of the four actions to perform on that particular host. Only the Monitor action does not require selecting a target; it applies to the whole network.

Then, after clicking on the “next” button, the selected action takes effect, and the defender can see the result (i.e. the number of points lost) of executing that action in the “last round” value. A new, updated view of the environment is then presented to the human defender, showing the new state (activity and compromise levels) of the network elements. The “last round” outcome provides immediate feedback on the effectiveness of the past action, and the “total loss” presents the human defender with a cumulative account of the loss during the game. Each game lasts a fixed number of steps, each step representing one action.
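The resulting step-by-step interaction can be outlined as follows. This is a hedged, pseudocode-style sketch: the `game` and `defender` objects and their methods are placeholders we introduce for illustration, not the game’s actual API.

```python
def play_episode(defender, game, steps=25):
    """One IDG episode: the defender takes one action per step."""
    total_loss = 0.0
    for _ in range(steps):
        state = game.observe()                   # activities + compromise levels
        action, target = defender.choose(state)  # target is None for Monitor
        game.apply(action, target)               # takes effect on "next"
        last_round = game.step_loss()            # "last round" feedback
        total_loss += last_round                 # running "total loss"
    return total_loss
```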

Defense scenario and attack strategies

Human defenders in the IDG are asked to defend a computer network against a red agent. The specific network we used in this scenario is illustrated in Fig. 2.

Figure 2. Topology of the network being defended in the IDG scenario. The red line represents the path any attacker needs to take to access the Operational Server.

The network is composed of seven hosts (four computer hosts and three servers) distributed across three subnets. Subnet 1 consists of user hosts that are not critical, subnet 2 consists of enterprise servers designed to support the user activities on subnet 1, and subnet 3 contains the critical operational server and an operational host.

Two types of attack strategy are implemented. They differ in the attacker’s assumed prior knowledge of the network and illustrate attack behaviors that may result from differences in the attacker’s personality traits [14, 15]. In the Beeline strategy, attackers route directly through subnet nodes to the Operational Server. The Meander strategy assumes no prior knowledge of the network on the attacker’s part: attackers following this strategy wander through the network, trying to gain privileged access to every host in a subnet before advancing further into the network. As a consequence, the Beeline strategy is a direct, rapid, and targeted strategy that can reach the Operational Server faster than an attacker following the Meander strategy.
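The contrast between the two target-selection policies can be sketched as below. This is our own simplification of the described behaviors (the host names follow the scenario’s naming, as seen later in Fig. 6), not the actual adversary code.

```python
import random

# Route taken by the informed attacker: straight to the server.
BEELINE_PATH = ["User1", "Enterprise1", "Enterprise2", "Op_Server0"]

def beeline_next(progress):
    """Informed attacker: advance along the known route to Op_Server0."""
    return BEELINE_PATH[min(progress, len(BEELINE_PATH) - 1)]

def meander_next(subnet_hosts, compromised):
    """Uninformed attacker: take over every host in the current subnet
    before advancing deeper into the network."""
    remaining = [h for h in subnet_hosts if h not in compromised]
    return random.choice(remaining) if remaining else "advance to next subnet"
```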

The outcome at each step is calculated as shown in Table 1. If the attacker gains administrator access to a user host, the defender loses 0.1 points per step; administrator access to a server costs the defender 1.0 point per step. The loss applies on each step as long as the attacker has not been removed from that host or server by the defender. Defenders also receive a negative reward (−1) each time they use the Restore action, because of the important consequences of this action for system availability. Finally, if the attacker successfully performs the Impact action on the Operational Server, the defender is penalized by −10 points. Because Beeline can reach the Operational Server earlier than Meander, it can repeatedly Impact the Operational Server for longer (unless stopped by the defender). As a consequence, and because of the weight given to the Impact action, Beeline is potentially more harmful than Meander. For the defender, this implies a higher theoretical maximum loss against Beeline (−160) than against Meander (−100). (These values are estimated using a completely passive defender, against whom attackers can proceed undisturbed; Beeline then reaches the Operational Server five steps earlier than Meander.)

Table 1. Cost table.

| Event or action | Subnet | Point cost |
|---|---|---|
| Attacker has administrator access on a Host | Subnet 1, 2, 3 | −0.1 |
| Attacker has administrator access on a Server | Subnet 1, 2, 3 | −1 |
| Attacker runs Impact attack on the Operational Server | Subnet 3 | −10 |
| Defender restores a Host or Server | - | −1 |

This table was provided to the participants during the instruction phase and was accessible at any time during the experiment through a “help” button.
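As an illustration of how the per-step losses in Table 1 combine, consider the following sketch; the dictionary-based host representation is our own simplification, not the IDG’s internal state format.

```python
def step_loss(hosts, op_server_impacted, restores_used):
    """Defender's point loss for one step, following Table 1.

    `hosts` is a list of dicts like {"is_server": bool, "admin_access": bool}
    (a simplified stand-in for the network state).
    """
    loss = 0.0
    for h in hosts:
        if h["admin_access"]:                       # attacker holds admin access
            loss -= 1.0 if h["is_server"] else 0.1  # server vs. user host
    if op_server_impacted:
        loss -= 10.0                                # Impact on the Operational Server
    loss -= 1.0 * restores_used                     # each Restore costs 1 point
    return loss

hosts = [{"is_server": False, "admin_access": True},   # one compromised user host
         {"is_server": True,  "admin_access": True}]   # one compromised server
print(round(step_loss(hosts, op_server_impacted=True, restores_used=1), 1))  # -12.1
```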

Methods

Experimental design

The goal of this experiment is to compare the behavior of human defenders faced with the two types of attack strategy discussed above: Beeline and Meander.

Given the characteristics of the Beeline strategy that can be faster and more damaging to defenders compared to the Meander strategy, we expected that defenders would initially perform worse against Beeline than against Meander. This hypothesis was preregistered with the Open Science Framework (https://osf.io/u3nfh).

Participants

Participants were recruited through Amazon Mechanical Turk to participate in a cybersecurity study. The study was advertised to last between 35 and 45 minutes; the actual time across participants was M = 47.02 ± 13.16 minutes. Participants received a base compensation of $4.50 and up to $5.60 in bonus payment (M = 3.96 ± 1.39) based on their final score. (As the score used in this experiment is negative (a loss), the bonus payment was calculated from the difference to the maximum possible loss, at $0.005 per point: bonus = (total loss + 1120) × 0.005.)
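For clarity, the bonus formula can be written out as a small computation; the function and constant names are ours, while the numbers come from the text above.

```python
MAX_TOTAL_LOSS = -1120  # worst possible cumulative loss over the whole study

def bonus_payment(total_loss):
    """Bonus in dollars: $0.005 per point above the maximum possible loss."""
    return (total_loss - MAX_TOTAL_LOSS) * 0.005

print(bonus_payment(-1120))        # 0.0 -> worst case, no bonus
print(round(bonus_payment(0), 2))  # 5.6 -> best case, the full $5.60 bonus
```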

A total of 120 participants (89 male, 30 female, 1 N/A) aged 21–65 years (M = 36.77 ± 11.00) completed the study. Twelve of the 120 participants (10%) had more than 5 years of experience in the network operation and security area and at least a Master’s degree in a related field. (In the follow-up survey, participants’ expertise was assessed through two Likert-scale questions concerning their highest degree in network operation and security and their years of experience in this area. A one-way ANOVA on these two groups (experts and novices) revealed no effect of experience on the total losses [F(1, 2.07) = 4.2693, P = .17, η2 = .71].)

Each participant was randomly assigned to face one of the two adversarial strategies.

Procedure

After giving their informed consent and completing a demographic questionnaire, participants received instructions for the task, followed by a short quiz to verify their basic understanding of the task instructions, including the network topology, the attacker’s goal, and the loss calculation process. Participants had to answer all the questions correctly before moving on to the next step of the experiment. Participants received feedback on the accuracy of their responses and were allowed to modify incorrect responses. There was no limit on the number of attempts participants had to answer the questions correctly; however, we recorded the score of their first attempt and the number of attempts they made.

Next, participants watched a video introduction to the IDG, explaining the interface, the game controls, and the dynamics of an episode.

Then, participants performed the task consisting of two phases: (1) a practice session and (2) a main task. The practice session consisted of two short episodes (i.e. games) of 10 steps each. The practice episodes were intended to familiarize participants with the interface and game controls. Each of the practice episodes was associated with one of the attacker strategies; however, since the two attack strategies do not differ significantly during the first 10 steps, the participants did not have enough information to discriminate between the two adversarial strategies during the practice session.

Following the practice session, the participants performed the main task consisting of seven episodes of 25 steps each. No time restrictions were imposed. The experimental conditions were kept constant throughout the episodes, which means that each participant played seven episodes against the same adversarial strategy. The initial state of the network was the same for all participants and for each of the episodes.

Subsequently, participants completed a postexperiment survey composed of two parts: (1) feedback on their performance and perceived strategy, and (2) their experience in computer science and cyber defense. Finally, the participants received their final score and were dismissed. The experimental instructions, quiz, and surveys, along with the data and analysis scripts, can be accessed at https://osf.io/u3nfh.

Outcome and process metrics

We measured the outcome of the defense performance in the IDG using three metrics:

  • Loss: total number of points lost by the defender during the scenario. For reference, the maximum loss per episode resulting from Beeline actions is −160, while it is −100 against Meander.

  • Disruptions: number of server disruptions that occur within each episode. One disruption represents a set of consecutive steps between a successful Impact attack on the Operational Server and the successful recovery by the defender.

  • Recovery time: the average number of steps per episode that the defender takes to remove the attacker from the Operational Server after it is disrupted. (A computational sketch of these three metrics follows this list.)
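A minimal sketch of how these three metrics can be computed from an episode log follows; the log format is our own assumption, not the IDG’s actual logging schema.

```python
def outcome_metrics(episode_log):
    """Compute Loss, Disruptions, and Recovery time for one episode.

    `episode_log` is a list of per-step records such as
    {"loss": -1.1, "op_server_disrupted": True} (our own format).
    """
    loss = sum(step["loss"] for step in episode_log)
    disruptions, recovery_runs = 0, []
    run = 0  # consecutive steps the Operational Server stays disrupted
    for step in episode_log:
        if step["op_server_disrupted"]:
            if run == 0:
                disruptions += 1        # a new disruption begins
            run += 1
        elif run > 0:
            recovery_runs.append(run)   # the server was recovered this step
            run = 0
    if run > 0:
        recovery_runs.append(run)       # episode ended while still disrupted
    recovery_time = sum(recovery_runs) / len(recovery_runs) if recovery_runs else 0.0
    return loss, disruptions, recovery_time
```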

In addition to the outcome metrics, we measured defense process behaviors, i.e. the defender’s decisions (which action is chosen at each step). The attacker’s actions were also logged at each step and were used to analyze the human defense behaviors and strategies:

  • Proportion of defense actions: number of times that each of the four defense actions (Analyze, Monitor, Remove, and Restore) is used by a participant within each episode, divided by the length of the episode (25 steps); see the sketch following this list.

  • Proportion of attacker’s targets: number of times each host or subnet is targeted by the attacker within each episode, divided by the length of the episode (25 steps). This is indicative of the attacker’s path through the network.

  • Proportion of defense strategy: the frequency with which each of the three coded defense strategies (Reactive, Proactive, and Passive) was used within each episode. Details of the calculation of these strategies are presented in the “Defense strategies” section below.
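As referenced in the list above, the proportion metrics reduce to simple counts normalized by the episode length. A sketch for the defense-action proportions (the attacker-target and strategy proportions are computed analogously):

```python
from collections import Counter

EPISODE_LENGTH = 25  # steps per episode in the main task

def action_proportions(actions):
    """Proportion of each defense action within one episode.

    `actions` is the ordered list of the 25 actions a participant took.
    """
    counts = Counter(actions)
    return {a: counts[a] / EPISODE_LENGTH
            for a in ("Analyze", "Monitor", "Remove", "Restore")}
```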

Results

Outcome metrics

Table 2 presents the average loss, the number of disruptions, and the recovery time of the participants who played against the Beeline attack strategy and those who faced the Meander attack strategy.

Table 2. Descriptive statistics (mean ± SD) regarding average loss, number of disruptions, recovery time, and success rate per episode.

| | Beeline | Meander |
|---|---|---|
| Loss | −56.12 ± 50.73 | −34.76 ± 30.40 |
| Disruptions | 0.94 ± 0.81 | 0.49 ± 0.52 |
| Recovery time (steps) | 2.75 ± 3.55 | 1.31 ± 1.69 |

For contextualization, the maximum loss per episode is −160 against Beeline and −100 against Meander.

These observations corroborate some expected differences between the two attack strategies in each of the three metrics for outcome performance. In general, the participants lost more points against the Beeline strategy than against the Meander strategy. The average number of disruptions to the operational server within one episode was larger when playing against the Beeline than when playing against the Meander strategy. It also took more steps within an episode to remove the attacker from the operational server when disrupted by the Beeline than the Meander attacker.

We analyzed the outcome metrics over episodes to determine whether the defenders improve with practice against each of the two adversaries. Figure 3 shows the average of each of the three outcome metrics per episode. Generally, we observe more stability over episodes in the participants’ outcomes against the Meander adversary than against the Beeline adversary. In other words, the initially poorer performance of participants against a Beeline adversary improves with more practice with this adversary, while the performance of participants against the Meander adversary does not improve much over episodes.

Figure 3. Outcome metrics over time with standard error of the mean. From left to right: loss, disruptions, and recovery time.

The participants’ losses are lower and relatively more stable against the Meander adversary; the losses are larger against the Beeline adversary, but they decrease with more practice against this adversary. In addition, the average number of server disruptions is initially higher for participants confronted with the Beeline adversary compared to those confronted with the Meander adversary. However, the number of disruptions decreases over episodes against the Beeline adversary. A similar pattern is observed in the average recovery time per episode, where the time is longer for participants playing against the Beeline adversary than against the Meander adversary but decreases with more episodes.

These observations were tested using mixed-effects analyses of variance (ANOVAs) that included the adversary as a between-subjects factor, the episode as a within-subjects factor, and their interaction. The results for each of the three outcome metrics are reported in Table 3.
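For readers who wish to reproduce this kind of analysis, a mixed ANOVA of this design can be run, for example, with the pingouin package in Python, assuming a long-format table with one row per participant × episode. This is an illustrative sketch (including the hypothetical file name), not our actual analysis script, which is available at the OSF link given above.

```python
import pandas as pd
import pingouin as pg

# Long-format data, one row per participant x episode (hypothetical file).
df = pd.read_csv("idg_outcomes.csv")  # columns: subject, adversary, episode, loss

# Between-subjects factor: adversary; within-subjects factor: episode.
aov = pg.mixed_anova(data=df, dv="loss", within="episode",
                     subject="subject", between="adversary",
                     correction=True)
print(aov)  # F-values, (corrected) degrees of freedom, P-values, effect sizes
```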

Table 3. Results of the mixed ANOVAs regarding the effect of adversary type and episodes on outcome metrics.

| Metric | NumDF | DenDF | F-value | P | Sig. | η2 |
|---|---|---|---|---|---|---|
| Loss: Adversary | 1.00 | 117.00 | 8.44 | .004 | ** | .06 |
| Loss: Episode | 4.45 | 520.94 | 5.99 | <.001 | *** | .01 |
| Loss: Adversary × Episode | 4.45 | 520.94 | 3.54 | .005 | ** | .01 |
| Disruptions: Adversary | 1.00 | 117.00 | 24.24 | <.001 | *** | .10 |
| Disruptions: Episode | 5.10 | 596.38 | 10.08 | <.001 | *** | .04 |
| Disruptions: Adversary × Episode | 5.10 | 596.38 | 4.34 | <.001 | *** | .02 |
| Recovery time: Adversary | 1.00 | 117.00 | 8.87 | .004 | ** | .06 |
| Recovery time: Episode | 4.78 | 559.48 | 2.09 | .068 | | .00 |
| Recovery time: Adversary × Episode | 4.78 | 559.48 | 1.62 | .157 | | .00 |

*P < .05, **P < .01, and ***P < .001.

Statistical results indicate that the loss, disruptions, and recovery time of the defenders differed significantly depending on whether they faced the Beeline or the Meander adversary. With the exception of recovery time, we also found significant effects of episode, and significant adversary-by-episode interactions, on Loss and Disruptions.

Post-hoc one-way ANOVAs for each of the metrics confirm what we observed in the figure: loss and disruptions improved over the course of episodes only when participants confronted the Beeline adversary, not when they were paired with the Meander adversary. Losses decreased over episodes against the Beeline adversary [F(4.29, 278.7) = 7.69, P < .001, η2 = .02] but not against the Meander adversary [F(4.12, 214.1) = 1.256, P = .29, η2 = .01]; and the number of disruptions decreased only against the Beeline adversary [F(4.93, 320.45) = 10.70, P < .001, η2 = .08], not against the Meander adversary [F(6, 312) = 1.95, P = .07, η2 = .02].

The analyses above demonstrate significant differences in defense outcomes when defenders confront the Beeline or the Meander adversary. The results suggest that Beeline is initially a significantly more damaging attack strategy than Meander, which makes sense given the definition of the strategy: the Beeline adversary advances directly through the subnets to the operational server. Importantly, however, participants were able to learn the behavior of the Beeline adversary and improve their defense, such that the loss and number of disruptions improved with more episodes of the task. Participants were more successful against the Meander strategy, but they were unable to significantly improve their performance with more episodes.

In what follows, we further analyze how participants behaved over the course of the episodes. We analyze the participants’ proportions of actions, the dynamics of defense actions over time, and their defense strategies. We also explore individual differences in these behaviors.

Process metrics

Defense actions

We analyzed the defense actions taken by the participants while executing the task. Table 4 presents the overall average proportion of use of each of the four defense actions—Analyze, Monitor, Remove and Restore—in each of the two adversary strategies.

Table 4. Descriptive statistics (mean ± SD) regarding the average proportion of command usage per attacker type.

| | Beeline | Meander |
|---|---|---|
| Analyze | .20 ± .14 | .19 ± .11 |
| Monitor | .36 ± .20 | .30 ± .19 |
| Remove | .32 ± .19 | .39 ± .22 |
| Restore | .19 ± .09 | .19 ± .09 |

In general, the Monitor and Remove actions seem to be more popular than the Analyze and Restore actions among defenders, regardless of the strategy. ANOVAs performed for each adversary group revealed significant differences in the proportion of use of these actions when facing Beeline [F(3, 264) = 17.91, P < .001, η2 = .17] and when facing Meander [F(3, 208) = 18.80, P < .001, η2 = .21]. Post-hoc comparisons using Tukey’s HSD corrections confirm that, regardless of the type of adversary, the proportions of use of Monitor and Analyze, Monitor and Restore, Remove and Analyze, and Remove and Restore were significantly different at P < .001.

Overall, participants in both conditions used the Monitor and Remove actions significantly more often than Analyze and Restore. (We noted a weak but significant positive correlation between the proportion of Analyze commands used and the cybersecurity background of participants (Spearman rank correlation: Rs = .23, P = .011); “expert” subjects seemed to focus heavily on the Analyze action. However, discussion of this result is beyond the scope of this paper.)

To observe the dynamics of the use of these defense actions over the course of episodes, we analyzed the proportions of actions on two levels: (1) across episodes, to observe potential learning and the progressive establishment of a defense strategy, and (2) within episodes, aggregating over all episodes and analyzing across the 25 steps of an episode.

Figure 4 shows the average proportion of actions over the course of the seven episodes. The defender’s behavior appears to be very similar in both adversary strategies across episodes. The main differences observed are that the actions Monitor and Remove are more common than the actions Analyze and Restore. In addition, the action Remove is more common when the defender confronts the Meander than when confronting the Beeline adversary.

Figure 4. Average proportion of defense action usage over episodes with standard error of the mean.

However, mixed-effects ANOVAs on the proportion of each action type revealed only a significant effect of episode on the proportion of the Analyze action [F(4.33, 506.54) = 8.318, P < .001, η2 = .02], for both the Beeline and the Meander adversaries. No effects of the type of adversary were found for any of the actions.

We also analyzed the proportion of actions performed at each step over all episodes. To highlight the differences between the two adversaries, we calculated the difference between the proportion of actions taken by participants facing the Meander opponent and the proportion of actions taken by participants facing the Beeline opponent. Figure 5 presents this difference.

Figure 5. Difference in average proportion of action usage between the Meander and Beeline conditions. A positive value indicates a higher proportion of the command in the Meander condition, and a negative one indicates a higher proportion in the Beeline condition.

We observe a larger number of Remove actions in the Meander condition than in the Beeline condition, and a larger number of Analyze actions in the Beeline condition than in the Meander condition, during the first 10 steps. The difference in the proportion of actions is relatively consistent and stable during these first 10 steps. After step 10, however, we observe substantial variability in this difference, with participants facing the Beeline adversary engaging in more Monitor actions than those facing the Meander adversary.

The proportion of actions against Beeline and Meander was tested for each type of action during steps 1–10 and then during steps 11–25. Table 5 indicates that the only significant differences are in the proportions of Monitor and Remove actions during steps 11–25: the proportion of Monitor actions was higher for participants who confronted the Beeline strategy than for those who confronted the Meander strategy, while the proportion of Remove actions was higher for participants who confronted the Meander strategy.

Table 5. Results of the ANOVAs regarding the effect of adversary type in groups of steps 1–10 and 11–25.

| Command | NumDF | DenDF | F-value | P | Sig. | η2 |
|---|---|---|---|---|---|---|
| Steps 1–10: Analyze | 1.00 | 686.40 | 3.53 | .06 | | .08 |
| Steps 1–10: Monitor | 1.00 | 670.47 | 0.08 | .784 | | .03 |
| Steps 1–10: Remove | 1.00 | 610.51 | 2.61 | .107 | | .07 |
| Steps 1–10: Restore | 1.00 | 685.28 | 0.27 | .601 | | .04 |
| Steps 11–25: Analyze | 1.00 | 1014.13 | 0.08 | .78 | | .03 |
| Steps 11–25: Monitor | 1.00 | 1016.06 | 38.80 | <.001 | *** | .23 |
| Steps 11–25: Remove | 1.00 | 992.60 | 24.47 | <.001 | *** | .20 |
| Steps 11–25: Restore | 1.00 | 1025.17 | 1.72 | .191 | | .05 |

***P < .001.

To explain these defense behaviors within episodes, we analyzed the types of targets that each of the adversarial strategies attacked in each of the steps aggregated across all episodes. Figure 6 represents the proportion of targets that each of the adversaries attacked on each step.

Figure 6. Evolution of the proportion of attacks by target across steps.

We observe that both adversaries start by attacking Subnet 1, then move to User 1, then to Enterprise 1, and then to Subnet 2. This similarity in adversarial actions holds during the first eight steps of the game. After these steps, Meander starts to target other hosts, such as “Defender,” while Beeline moves on to Enterprise 2 and then directly to the Operational Server. This illustrates the difference between the two attack strategies and explains why the human defenders’ actions diverge after step 10, differing in the Monitor and Remove actions during steps 11–25.

The analysis of defense actions provides evidence of an evolution in the dynamics of defenders’ decisions throughout the experiment. We propose that this behavior is the result of the participants’ learning to defend against their opponent, which explains the performance improvement observed in Fig. 3. There are at least two ways to evaluate whether participants improved their understanding of the opponent’s strategy and the optimality of their decisions.

Given the sequential nature of the game, the optimality of a decision at a specific step should be defined based on the effect that each particular action will have on future steps. Thus, it is possible, but not trivial, to calculate such an “optimal” decision at each step: for each action taken by each individual participant at step n, one would need to find the sequence of 25 − n remaining actions that would result in the lowest loss by the end of the episode. This is a computationally intensive optimization that we considered but decided not to pursue.
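A back-of-the-envelope count illustrates the computational burden. Assuming, as a simplification of the actual action space, that each step offers the Monitor action plus three targeted actions on each of the seven hosts:

```python
# Options per step: Monitor (network-wide) plus 3 targeted actions x 7 hosts.
OPTIONS_PER_STEP = 1 + 3 * 7  # = 22 (our simplifying assumption)

def sequences_remaining(n, episode_length=25):
    """Number of candidate action sequences from step n to the episode's end."""
    return OPTIONS_PER_STEP ** (episode_length - n)

print(f"{sequences_remaining(1):.2e}")  # ~1.65e+32 sequences at the first step
```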

Instead, we sought to characterize defense strategies and developed a set of defense heuristics that may inform human behavior.

Defense strategies

To capture the level of understanding of the opponent’s strategy and to identify defense actions that would be cognitively plausible, we developed a set of defense heuristics and classified the defense actions into three groups: Reactive, Proactive, and Passive strategies.

In the cyber literature, proactive and reactive strategies usually refer to the general approach institutions take to their cybersecurity, i.e. anticipating future threats versus patching security flaws that could expose them to known threats [35–38]. Here, as we focus on the operational level rather than the organizational one, we categorized each individual decision and action according to the following definitions:

  • The passive strategy represents defense actions that have no direct effect on the state of the network and that neither slow nor stop the progress of the adversary in the network.

  • The reactive strategy represents actions that result in an improved state of the network, such as the recovery of infected hosts. These are actions that the defender takes after hosts have already been attacked by the adversary and defense points have been lost.

  • The proactive strategy is characterized by preventive actions. These are actions that reflect an anticipation of the next adversarial move or a prediction of the intention of the adversary, in a way that the defender is able to block the progression of the attack.

Table 6 presents the set of high-level heuristics used to categorize defense actions into one of the three strategies. Using the defender action, the state of the network (e.g. is the defender targeting a host that is or has been attacked), and the effect of the defense action, we coded each of these heuristics. Using this coding scheme, 91% of all defender’s actions were categorized.

Table 6. Heuristics.

| Behavior | Strategy |
|---|---|
| Recovering a compromised host at the user or administrator level | Reactive |
| Recovering the Operational Server when it is impacted | Reactive |
| Blocking an initial Impact attempt | Proactive |
| Preventing a host from being compromised | Proactive |
| Repeating a successful action | Proactive |
| Monitoring or Analyzing | Passive |
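A sketch of how such a coding scheme can be applied to one logged action follows; the boolean flags summarizing the network state and the action’s effect are our own simplification of the logged data, not the actual coding script.

```python
def classify_action(action, effect):
    """Assign one logged defense action to a strategy, following Table 6.

    `effect` is a dict of boolean flags summarizing the logged network state
    and the observed result of the action (our own simplification).
    """
    if action in ("Monitor", "Analyze"):
        return "Passive"    # no direct effect on the network state
    if effect.get("recovered_compromised_host") or effect.get("recovered_op_server"):
        return "Reactive"   # repairing damage that has already been done
    if (effect.get("blocked_initial_impact")
            or effect.get("prevented_compromise")
            or effect.get("repeated_successful_action")):
        return "Proactive"  # anticipating the attacker's next move
    return "Uncategorized"  # ~9% of actions remained uncategorized in our data
```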

In particular, we characterized proactive actions as a way to determine whether the defenders were ahead of the attacker by choosing the action that would prevent the attacker from doing damage to the network in the future. A repetition of proactive actions reflects an advanced understanding of the opponent’s strategy, and explains the learning across episodes.

The overall proportions of reactive, proactive, and passive strategies coded from the defenders’ actions when confronted with the Beeline and Meander adversaries are presented in Table 7. The table indicates that passive strategies are more common than proactive strategies.

Table 7. Descriptive statistics (mean ± SD) regarding the average proportion of defense strategy per attacker type.

| | Beeline | Meander |
|---|---|---|
| Reactive | .27 ± .15 | .26 ± .16 |
| Proactive | .19 ± .19 | .15 ± .20 |
| Passive | .48 ± .22 | .45 ± .24 |

Figure 7 presents the proportion of these strategies per episode. This figure illustrates that passive strategies are most common, regardless of the type of adversary. The proportion of reactive strategies decreases over the course of episodes, while the proportion of proactive strategies increases. This pattern appears to be very similar for both adversaries, although the increase of proactive strategies appears to be faster against the Beeline adversary compared to the Meander adversary.

Figure 7. Average proportion of each strategy per episode.

The mixed-ANOVA results shown in Table 8 indicate a significant effect of episode on the proportion of each strategy for both types of adversaries. They also show a significant interaction between episode and type of adversary for the proportion of proactive strategy.

Table 8. Results of the mixed ANOVAs regarding the effect of adversary type and episodes on the proportion of defense strategies.

| Strategy | NumDF | DenDF | F-value | P | Sig. | η2 |
|---|---|---|---|---|---|---|
| Reactive: Adversary | 1 | 117.00 | 0.18 | .675 | | .00 |
| Reactive: Episode | 4.15 | 485.82 | 8.83 | <.001 | *** | .03 |
| Reactive: Adversary × Episode | 4.15 | 485.82 | 2.30 | .055 | | .01 |
| Proactive: Adversary | 1 | 117.00 | 1.09 | .299 | | .01 |
| Proactive: Episode | 3.03 | 354.99 | 9.23 | <.001 | *** | .02 |
| Proactive: Adversary × Episode | 3.03 | 354.99 | 2.70 | .045 | * | .01 |
| Passive: Adversary | 1 | 117.00 | 0.66 | .417 | | .00 |
| Passive: Episode | 3.73 | 436.85 | 3.51 | .009 | ** | .01 |
| Passive: Adversary × Episode | 3.73 | 436.85 | 1.11 | .352 | | .00 |

*P < .05, **P < .01, and ***P < .001.

Post-hoc one-way ANOVAs with Bonferroni-adjusted P-values (P.adj) show that the simple main effect of episode on the proportion of proactive strategy was significant against Beeline [F(2.46, 159.66) = 9.152, P.adj < .001, η2 = .04] but not against Meander [F(3.11, 161.83) = 2.930, P.adj = .068, η2 = .01].

Individual differences

Figure 8 represents the proportion of each strategy per episode for each individual participant separately. The panels are ordered by each participant’s overall loss: the top-left panel represents the participant with the maximum loss, and the bottom-right panel represents the participant with the minimum loss.

Figure 8. Proportion of each strategy per subject and episode. Subjects are ordered by loss, with the worst-performing subject (maximum loss) in the top-left corner. The loss value is displayed above each graph.

This figure immediately reveals the variability in individual behaviors and the connection between the strategy each participant used and their individual loss. Many unsuccessful defenders used passive strategies more often, while more successful defenders were more proactive.

Strategy and loss correlations

The association between strategy and total loss, across both adversaries, was also analyzed through correlations. The scatter plots in Fig. 9 represent the relationship between each individual defender’s total loss score and the proportion of each strategy.

Figure 9. Scatter plot of subjects’ total loss and proportion of each strategy.

Spearman’s correlation tests indicate a strong, significant positive correlation between a participant’s loss and the proportion of proactive strategy (Spearman rank correlation: Rs = 0.66, P < .001). That is, defenders with a higher proportion of proactive behaviors generally lost fewer points, i.e. protected the network better. Being proactive, such as performing a Remove action that prevents a host from being exploited, is an efficient way to prevent losses and to be more successful in protecting the network.

Similarly, Spearman’s correlation tests indicate a moderate, significant negative correlation between a defender’s loss and their proportion of passive strategy (Spearman rank correlation: Rs = −0.45, P < .001). Defenders with a larger number of passive actions were more likely to lose more points, since they were not taking any active defense action, i.e. they were not protecting the network.

Finally, the correlation between the defender’s loss and the proportion of reactive strategy was not significant.
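These associations are plain Spearman rank correlations over per-participant values; a minimal sketch with toy numbers (not the study data) is shown below.

```python
from scipy.stats import spearmanr

# Per-participant values (toy numbers for illustration, not the study data).
total_loss     = [-520, -310, -150, -90, -410, -60]
prop_proactive = [0.05, 0.12, 0.25, 0.30, 0.08, 0.35]

rs, p = spearmanr(total_loss, prop_proactive)
print(f"Rs = {rs:.2f}, P = {p:.3f}")  # the study reports Rs = 0.66, P < .001
```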

Discussion

We designed a simple cyber-defense game as a web-based application to study human defense decisions against simulated adversaries. In this experiment, we measured the impact of two different deterministic attack strategies on defenders’ behaviors. To do so, we analyzed their performance, their defense choices and behaviors, and their strategies.

As expected, defenders’ performance reflects the difference in “aggressiveness” of the attack strategies in terms of Loss, Recovery Time, and number of Disruptions. Indeed, as an attacker following the Beeline strategy was quicker to reach the Operational Server than one following the Meander strategy, it caused significantly greater Loss for the human defender, more Disruptions, and longer Recovery Times. However, we observed that, over the episodes and independently of the condition, participants managed to improve their performance and lower their Loss. Two possible explanations for the overall improvement can be investigated: (1) the number of Disruptions dropped as subjects learned to more efficiently prevent the attacker from reaching the Operational Server, and/or (2) the Recovery Time improved, i.e. subjects became faster at recovering the Operational Server from a disruption.

Results indicate a significant drop in the number of Disruptions recorded over time, while no improvement is noticeable in terms of Recovery Time. This can be interpreted as the defenders learning to block the progression of the attacker through the network more efficiently, before it reaches the Operational Server.

Overall, participants confronted with a Beeline attacker learned to develop an efficient Proactive defense strategy and improved their performance in terms of loss, number of disruptions, and recovery time. Our interpretation is that, even though both attack strategies are deterministic, Beeline is more direct and consistent, routing through fewer hosts than Meander. This makes the Beeline strategy easier for defenders to form a mental representation of, and its actions easier to predict with increasing defense experience. The predictability of the attack strategy had a significant influence on how humans learned an effective defense strategy.

Although participants who faced the Beeline adversary significantly improved their performance over time, they only reached a level of performance similar to that of participants who faced the Meander adversary. In some ways, the Beeline adversary leaves more room for improvement, which could also be a factor in the observed difference in learning pace. In past experiments with cognitive models on the same task [33], defense agents showed accentuated learning curves when confronted with a Beeline attacker but similar final performance after a large number of episodes. It would be interesting to see how humans are able to improve their strategies and how their performance evolves with more episodes. In future work, longer episodes (i.e. more than 25 steps) could also allow us to use pattern-identification methods and extended analyses of action sequences to refine the categorization of defense strategies and perhaps identify more complex heuristics.

In general, this study illustrates how the type of simulated adversary that human defenders face may influence the speed of learning and the development of an adequate defense strategy. A more aggressive but more predictable attacker was found to be easier for human defenders to learn and exploit than a stealthy and less predictable adversary.

Cyber analysts have to work in a highly dynamic environment, with flawed and noisy information. Adversarial cyber-defense games and simulation tools like the IDG can help simulate such decision-making situations and improve our understanding of the cognitive demands faced by human cyber defenders.

This experiment also aimed to provide human data to assess the accuracy of human-like IBL defense agents, as presented in refs. [33, 39, 40]. In this context, our work sheds light on the importance of providing less predictable attackers for the development and training of defenders. These results support the findings of recent modeling experiments showing that dynamic attack strategies are a weakness for cognitive models and AI defenses [33, 40].

Future work needs to examine the effect of such fully dynamic and adaptive attackers on the human development of defense strategies. We hypothesize that cognitive, dynamic, and adaptable attack agents that are able to learn will present a bigger challenge to defenders and will thus provide a better training opportunity.

This is also a necessary evolution toward more realistic scenarios in which expertise brings an advantage. Cybersecurity expertise would be necessary, in particular, in situations involving the complex environments and complex tools used in the workplace. In naturalistic settings, the diversification of attack strategies and their dynamic adaptation to the opponent’s actions are indeed more common, and are becoming a prominent topic with AI-led cyberattacks.

Because participants with the skills and knowledge required to test highly technical tasks and sophisticated adversaries are hard to find, and are often too busy to give their time to test emulated adversaries, extensive care was taken to design a relevant cyber task that could be performed by a general population.

Future work will aim to make the task design more representative of real-world environments, with increased scenario complexity (e.g. larger networks, simulated regular user activity), more diverse opponent strategies, and the introduction of teamwork. In particular, we will look into the development of a collaborative defense environment to further explore human–AI collaboration in cyber defense and address some of the challenges of the cyber battlefield of the future.

Acknowledgement

The authors thank the anonymous reviewers for their valuable suggestions. We thank Jeffrey Flagg, Dynamic Decision Making Laboratory, for research assistance in reviewing and running the study.

Funding

This research was supported by the Army Research Office and accomplished under Australia–US MURI Grant Number W911NF-20-S-000 and by the Army Research Laboratory under Cooperative Agreement Number W911NF-13-2-0045 (ARL Cyber Security CRA).

Author contributions

Baptiste Prebot (Conceptualization, Formal analysis, Methodology, Software, Writing – original draft, Writing – review & editing), Yinuo Du (Conceptualization, Software, Writing – original draft, Writing – review & editing), and Cleotilde Gonzalez (Conceptualization, Methodology, Writing – original draft, Writing – review & editing)

Conflict of interest statement. None declared.

References

1. Li Y, Liu Q. A comprehensive review study of cyber-attacks and cyber security; emerging trends and recent developments. Ener Rep. 2021;7:8176–86.
2. Thanh CT, Zelinka I. A survey on artificial intelligence in malware as next-generation threats. Mendel. 2019;25:27–34.
3. Colbert EJ, Kott A, Knachel LP. The game-theoretic model and experimental investigation of cyber wargaming. J Def Model Sim. 2020;17:21–38.
4. Ferguson-Walter K, Shade T, Rogers A et al. The Tularosa study: an experimental design and implementation to quantify the effectiveness of cyber deception. Technical report. Albuquerque, NM: Sandia National Lab. (SNL-NM), 2018.
5. Applebaum A, Miller D, Strom B et al. Intelligent, automated red team emulation. In: Proceedings of the 32nd Annual Conference on Computer Security Applications. New York, NY, USA: Association for Computing Machinery, 2016, 363–73.
6. Kavak H, Padilla JJ, Vernon-Bido D et al. Simulation for cybersecurity: state of the art and future directions. J Cybersecur. 2021;7:tyab005.
7. Varshney M, Pickett K, Bagrodia R. A live-virtual-constructive (LVC) framework for cyber operations test, evaluation and training. In: 2011-MILCOM 2011 Military Communications Conference. Baltimore, MD, USA: IEEE, 2011, 1387–92.
8. Gutzwiller RS, Hunt SM, Lange DS. A task analysis toward characterizing cyber-cognitive situation awareness (CCSA) in cyber defense analysts. In: 2016 IEEE International Multi-Disciplinary Conference on Cognitive Methods in Situation Awareness and Decision Support (CogSIMA). San Diego, CA, USA: IEEE, 2016, 14–20.
9. Veksler VD, Buchler N, LaFleur CG et al. Cognitive models in cybersecurity: learning from expert analysts and predicting attacker behavior. Front Psychol. 2020;11:1049.
10. Veksler VD, Buchler N, Hoffman BE et al. Simulations in cyber-security: a review of cognitive modeling of network attackers, defenders, and users. Front Psychol. 2018;9:691.
11. Cranford EA, Gonzalez C, Aggarwal P et al. Towards a cognitive theory of cyber deception. Cogn Sci. 2021;45:e13013.
12. Johnson CK, Gutzwiller RS, Gervais J et al. Decision-making biases and cyber attackers. In: 2021 36th IEEE/ACM International Conference on Automated Software Engineering Workshops (ASEW). Melbourne, Australia: IEEE, 2021, 140–4.
13. Gonzalez C, Ben-Asher N, Oltramari A et al. Cognition and technology. In: Cyber Defense and Situational Awareness. Cham: Springer, 2014, 93–117.
14. Jones DN, Padilla E, Curtis SR et al. Network discovery and scanning strategies and the Dark Triad. Comput Hum Behav. 2021;122:106799.
15. Curtis SR, Rajivan P, Jones DN et al. Phishing attempts among the dark triad: patterns of attack and vulnerability. Comput Hum Behav. 2018;87:174–82.
16. Gutzwiller RS, Fugate S, Sawyer BD et al. The human factors of cyber network defense. In: Proceedings of the Human Factors and Ergonomics Society Annual Meeting. Vol. 59. Los Angeles, CA: SAGE Publications, 2015, 322–6.
17. Buchler N, Rajivan P, Marusich LR et al. Sociometrics and observational assessment of teaming and leadership in a cyber security defense competition. Comput Secur. 2018;73:114–36.
18. Strom BE, Applebaum A, Miller DP et al. MITRE ATT&CK: design and philosophy. Technical report. The MITRE Corporation, 2018.
19. Gonzalez C, Vanyukov P, Martin MK. The use of microworlds to study dynamic decision making. Comput Hum Behav. 2005;21:273–86.
20. Aggarwal P, Gonzalez C, Dutt V. HackIt: a real-time simulation tool for studying real-world cyberattacks in the laboratory. In: Handbook of Computer Networks and Cyber Security. Cham: Springer, 2020, 949–59.
21. Singh K, Aggarwal P, Rajivan P et al. Training to detect phishing emails: effects of the frequency of experienced phishing emails. In: Proceedings of the Human Factors and Ergonomics Society Annual Meeting. Vol. 63. Los Angeles, CA: SAGE Publications, 2019, 453–7.
22. Ben-Asher N, Gonzalez C. Effects of cyber security knowledge on attack detection. Comput Hum Behav. 2015;48:51–61.
23. Moisan F, Gonzalez C. Security under uncertainty: adaptive attackers are more challenging to human defenders than random attackers. Front Psychol. 2017;8:982.
24. Hutchins EM, Cloppert MJ, Amin RM et al. Intelligence-driven computer network defense informed by analysis of adversary campaigns and intrusion kill chains. Vol. 1. Bethesda, MD, USA: Lockheed Martin Corporation, 2011, 80.
25. Zhang L, Thing VL. Three decades of deception techniques in active cyber defense-retrospect and outlook. Comput Secur. 2021;106:102288.
26. Tambe M. Security and Game Theory: Algorithms, Deployed Systems, Lessons Learned. Cambridge: Cambridge University Press, 2011.
27. Abbasi Y, Kar D, Sintov ND et al. Know your adversary: insights for a better adversarial behavioral model. In: Proceedings of the 38th Annual Conference of the Cognitive Science Society. Austin, TX: Cognitive Science Society, 2016.
28. Aggarwal P, Maqbool Z, Grover A et al. Cyber security: a game-theoretic analysis of defender and attacker strategies in defacing-website games. In: 2015 International Conference on Cyber Situational Awareness, Data Analytics and Assessment (CyberSA). London, UK: IEEE, 2015, 1–8.
29. Nochenson A, Heimann C. Simulation and game-theoretic analysis of an attacker–defender game. In: International Conference on Decision and Game Theory for Security. Berlin, Heidelberg: Springer, 2012, 138–51.
30. Do CT, Tran NH, Hong C et al. Game theory for cyber security and privacy. ACM Comput Surv (CSUR). 2017;50:1–37.
31. Attiah A, Chatterjee M, Zou CC. A game theoretic approach to model cyber attack and defense strategies. In: 2018 IEEE International Conference on Communications (ICC). Kansas City, MO, USA: IEEE, 2018, 1–7.
32. Wang Y, Wang Y, Liu J et al. A survey of game theoretic methods for cyber security. In: 2016 IEEE First International Conference on Data Science in Cyberspace (DSC). Changsha, China: IEEE, 2016, 631–6.
33. Du Y, Prébot B, Xi X et al. Towards autonomous cyber defense: predictions from a cognitive model. Proc Hum Factor Ergon Soc. 2022;66:1121–5.
34. Gonzalez C, Lerch FJ, Lebiere C. Instance-based learning in dynamic decision making. Cogn Sci. 2003;27:591–635.
35. Grisham J, Samtani S, Patton M et al. Identifying mobile malware and key threat actors in online hacker forums for proactive cyber threat intelligence. In: 2017 IEEE International Conference on Intelligence and Security Informatics (ISI). Beijing, China: IEEE, 2017, 13–8.
36. Bhuyan SS, Kabir UY, Escareno JM et al. Transforming healthcare cybersecurity from reactive to proactive: current status and future recommendations. J Med Syst. 2020;44:1–9.
37. Samtani S, Abate M, Benjamin V et al. Cybersecurity as an industry: a cyber threat intelligence perspective. In: Holt T, Bossler A (eds), The Palgrave Handbook of International Cybercrime and Cyberdeviance. Cham: Palgrave Macmillan, 2020, 135–54.
38. Zarreh A, Saygin C, Wan H et al. A game theory based cybersecurity assessment model for advanced manufacturing systems. Procedia Manuf. 2018;26:1255–64.
39. Prébot B, Du Y, Xi X et al. Cognitive models of dynamic decision in autonomous intelligent cyber defense. In: International Conference on Autonomous Intelligent Cyber-defense Agents, Bordeaux, France, 2022.
40. Du Y, Prébot B, Gonzalez C. A cyber-war between bots: human-like attackers are more challenging for defenders than deterministic attackers. In: Bui T (ed), Proceedings of the 56th Hawaii International Conference on System Sciences (HICSS 2023). Honolulu, HI, USA: HICSS Conference Office, 2023.