Extended phase space description of human-controlled systems dynamics

Humans are often incapable of precisely identifying and implementing the desired control strategy when controlling unstable dynamical systems. That is, the operator of a dynamical system treats the current control effort as acceptable even if it deviates slightly from the desired value, and starts correcting the actions only when the deviation has become evident. We argue that the standard Newtonian approach does not allow one to model such behavior. Instead, the physical phase space of a controlled system should be extended with an independent phase variable characterizing the operator's motivated actions. The proposed approach is illustrated via a simple non-Newtonian model capturing the operators' fuzzy perception of their own actions. The properties of the model are investigated analytically and numerically; the results confirm that the extended phase space may aid in capturing the intricate dynamical properties of human-controlled systems.


Introduction
The basic physical notions and mathematical formalism have been successfully employed in modeling social and psychological phenomena. The notions of Newtonian mechanics were used in social force models for traffic dynamics and crowd behavior [1,2]. The statistical physics framework, namely, the master equation approach, has been widely used in describing opinion formation and language evolution [3]. Nonetheless, despite the success gained in describing social phenomena in mathematical terms up to now, there is a strong demand for notions and models reflecting the unique properties of human beings [4]. Such features as feelings, emotions, intentions, and beliefs distinguish humans from the inanimate objects studied in physics. Development of specific notions and formalism capturing the peculiarities of human behavior at the level of individuals can enable us to model, simulate, and better understand the complex social phenomena met in everyday life.
One of the cornerstones of modern physics widely met in social psychology [5] is the notion of a fixed-point attractor, or equilibrium. For instance, a person achieving and maintaining a certain endstate or goal can be formally treated as a dynamical system drifting towards an equilibrium point in the corresponding phase space [6]. Likewise, one may consider an entity controlled directly by a human operator whose purpose is to maintain its stability. In this case the system dynamics as a whole can also be described by a fixed-point attractor. External or internal factors may cause the system to deviate from the equilibrium, but if the operator is capable of handling such perturbations, the system will eventually evolve to the desired state.  Recent advances in the field of human control give evidence to the fact that humans do not generally operate the systems under their control in a precise way. Maintaining the system exactly at the desired position requires the ability of the operator to keep perfect awareness and to react immediately even to the smallest deviations. Meanwhile, experimental studies have revealed that the considerable response latency and the effects of noise in the sensorimotor system prevent human operators from implementing continuous control strategies (see, e.g., [7] and references therein). Instead, discontinuous, or intermittent, control [8] is found to be efficient in the presence of time delays and random perturbations in human-controlled processes. The "drift and act" pattern of human control has been detected, e.g., in aircraft landing [9], stick balancing at the fingertip [10], and postural control during quiet standing [11].
In each of these processes the human operator prefers to ignore small deviations of the dynamical system from the desired state, starting the active control over the system only when the deviation becomes too large to ignore. The reasons for such behavior vary depending on the properties of the particular system. For instance, while controlling systems that are relatively sensitive to human response (e.g., balancing a stick), operators ignore small deviations in order not to destabilize the system by imprecise corrective actions [7]. In processes with relatively slow dynamics (e.g., car following) human operators tend to "satisfice" rather than to optimize [12]: an operator stays relaxed while the current situation is acceptable, taking control over the system only when she is uncomfortable with the deviation from the desired state.
PTEP 2014, 033J02 A. Zgonnikov and I. Lubashevsky
The aforesaid allows us to conclude that human operators, at least in some situations, do not distinguish between the optimal state of a controlled system and sub-optimal states in its vicinity. This property of human behavior is a manifestation of human fuzzy rationality [13]. In the general case, the operator's fuzzy rationality makes the standard equilibrium point formalism inapplicable for describing the dynamics of human-controlled systems. Instead, such systems exhibit some complex time-dependent patterns of behavior near the virtual equilibrium [14]. Up to now there have been a few attempts to develop a mathematical formalism capturing the effects of human fuzzy rationality. In particular, the model of fixed human reaction threshold is commonly used in applied studies [9,10,15,16], but still there is a lot of uncertainty about the intrinsic mechanisms causing the anomalous behavior of the systems under human control (see, e.g., [17]). The dynamical trap model [18-20], being a certain version of the fuzzy threshold concept, is another alternative to the standard fixed-point attractor in complex sociopsychological systems. Both models capture the fuzziness of the desired end-state by introducing a certain region around the virtual equilibrium where each state is treated as acceptable by the operator (Fig. 1, right frame). However, these models consider the operator behavior to be strictly optimal outside the region of acceptable states. It means that in driving the system towards this region the operator generally follows some predefined control law. Let us, for example, consider a physical system whose dynamics are specified by its coordinate x and velocity v = dx/dt. Then the actions of an operator can be represented as a certain function a = a(x, v) of the phase variables {x, v}, maybe with some time delay.
As a result, the behavior of such a system governed by the operator actions is completely described by a differential equation of the form

$$\ddot{x} = a(x, \dot{x}).$$

Still, one may claim that humans are often unable to implement the desired control strategy a(x, v) precisely. Rather, the operator is only able to detect if the current control effort is worth changing when the deviation from the desired strategy becomes large enough. In other words, at each instant there is a whole fuzzy set of control parameter values which are acceptable for the operator (Fig. 2). The common approach to modeling this effect is to introduce an additive noise term (see, e.g., [21]), which is justified in some situations. However, this approach does not reflect some characteristic features of human control (e.g., on-off intermittency [8]). Besides, a wide class of intricate phenomena observed in systems of interacting individuals still remains unexplained.
In the present paper we argue that the effect of human fuzzy rationality extends beyond the concept of desired end-state to the notion of action strategy. Appealing to the dynamical traps framework we try to capture the fluctuations of the operator actions in the vicinity of the virtual optimal control strategy. We extend the two-dimensional system phase space (x, v) with acceleration a as an independent phase variable that is perceived, evaluated, and indirectly controlled by the operator. At the cost of introducing the non-Newtonian variable a we gain the possibility of describing mathematically the fuzzy set of acceptable suboptimal action strategies which is further referred to as the action dynamical trap.

Model background
Hereafter we explain the idea of the basic dynamical trap model by employing a simple example of a noisy human-controlled dynamical system with no internal dynamics. The phase space of the system comprises the coordinate x and the velocity v. The goal of the operator is to maintain the system at the desired state, the origin, by implementing the optimal (in some sense) control strategy a_opt(x, v). However, if the system currently resides in some vicinity of the desired state, the operator prefers to halt active control over the system. The equations describing the system dynamics under operator control are written as follows:

$$\dot{x} = v, \qquad \dot{v} = \Omega(x, v)\, a_{\mathrm{opt}}(x, v) + \varepsilon f(t), \tag{1}$$

where f(t) is a random force with amplitude ε ≪ 1. The cofactor Ω(x, v) is some function such that Ω(x, v) ≈ 1 for all the values (x, v) that are far enough from the origin and Ω(x, v) ≪ 1 in a certain neighborhood Q_tr of the origin (Fig. 1, right frame).
In order to explain the meaning of the cofactor Ω(x, v) we consider the behavior of the operator approaching the desired state (x = 0, v = 0). When the current state is far from the origin, the operator perfectly follows the optimal action strategy a_opt(x, v). If the current position is recognized as "good enough," i.e., (x, v) ∈ Q_tr, the operator halts active control over the system. So, during a considerable period of time the system is affected only by random factors of a small amplitude; in other words, the system is "trapped" in a vicinity of the desired position. Therefore, Q_tr is called the area of dynamical trap. One may notice that in the case of linear feedback strategy […], where σ > 0 is a constant damping parameter, the given system under human control is analogous to the physical system of a damped harmonic oscillator. This allows us to call system (1) the oscillator with dynamical trap.
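The analogy with the damped harmonic oscillator can be made explicit under stated assumptions. Suppose the linear feedback has the form a_opt(x, v) = −ω²x − σv (the symbol ω for the restoring coefficient and this particular form are assumptions made here for illustration), the velocity obeys v̇ = Ω(x, v) a_opt(x, v) + ε f(t), the operator is always active (Ω ≡ 1), and noise is absent (ε = 0). Then:

```latex
\dot{x} = v, \qquad \dot{v} = -\omega^{2} x - \sigma v
\quad\Longrightarrow\quad
\ddot{x} + \sigma\,\dot{x} + \omega^{2} x = 0 .
```

Inside the trap region, where Ω ≪ 1, both the restoring and the damping terms are suppressed together, which is the feature that separates the oscillator with dynamical trap from its Newtonian counterpart.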
The oscillator with dynamical trap captures the basic behavior properties of the fuzzy rational operator, i.e., the operator who does not react to small deviations from the desired phase space position. When the system deviates significantly from the goal state, the operator decides to start controlling the system in order to return it to an acceptable state. This can be achieved by varying the control parameter, namely, the acceleration, in a way that is optimal in some sense.
Let us appeal to car following, which is a characteristic example of a dynamical system governed by an operator with fuzzy rationality [22]. Car drivers are unable to continuously keep perfect awareness of the surrounding situation, so they usually set the acceleration to some constant value based on the current circumstances. Once fixed, the value of acceleration is changed only when the driver realizes that the deviation from some "optimal" acceleration value has become too large to be ignored. In other words, considerable deviations of the current acceleration a from the optimal value a_opt cause the operator to start active control over the car motion. However, when the difference a − a_opt is rather small, there are no stimuli for the driver to respond to, i.e., to change the acceleration. Thus, one may imagine a certain region around the optimal strategy a_opt(x, v), wherein each strategy is regarded as acceptable (Fig. 2). Instead of precisely following the optimal strategy, the operator just keeps the actually implemented strategy inside this region, making some corrections only when the mismatch a − a_opt exceeds some fuzzy threshold. For this reason the region of acceptable strategies around a_opt will be called the action dynamical trap. The "thickness" of the action dynamical trap is determined by the capacity of the operator's perception and the levels of concentration and motivation to follow the optimal control strategy. The action dynamical trap model is proposed to capture the discussed effects of fuzzy rationality in choosing and implementing the action strategies in human-controlled dynamical processes.

Action dynamical trap model
We start by considering the original dynamical trap model described by equations (1). In order to elucidate the basic properties of the model, we exclude the random factors from the scope of the present paper, i.e., we consider f(t) ≡ 0.
The pivot point of the proposed approach is that we regard human actions as an independent component of the system rather than some predetermined function of its physical state. We extend the physical phase space {x, v} by introducing a new phase variable, in the given case the system acceleration a, i.e., we pass from {x, v} to {x, v, a}. This enables us to ascribe to the system an additional degree of freedom corresponding to the operator actions. Now, the model capturing the dynamical trap effect in controlling the deviation a − a_opt is written as

$$\dot{x} = v, \qquad \dot{v} = a, \qquad \tau\,\dot{a} = -\Omega_a(a - a_{\mathrm{opt}})\,(a - a_{\mathrm{opt}}), \tag{2}$$

where τ is the operator reaction time parameter, and the functions a_opt(x, v) and Ω_a(a − a_opt) are to be specified.
We define the operator control strategy as a linear feedback aimed at maintaining the system at the origin: […], where ω and σ are non-negative constant coefficients. However, as the operator is fuzzy rational, the optimal control strategy should incorporate the dynamical trap effect in correcting the velocity variations: […]. Thus, the control strategy a_opt is optimal from the standpoint of a fuzzy rational human operator.
Here the dynamical trap cofactor Ω_v(x, v) is claimed not to depend on x. This reflects the assumption that control over the system velocity v is of higher importance for the operator compared to control over the coordinate x. The desired effect can be mimicked by any function Ω_v(v) such that Ω_v ≪ 1 if v ≈ 0 and Ω_v ≈ 1 otherwise. Without loss of generality we use the ansatz […], where v_th > 0 is the threshold value of velocity and Δ_v ∈ [0, 1] is the dynamical trap intensity coefficient (Fig. 3). When Δ_v equals unity, there is no dynamical trap effect: the operator is strictly rational and reacts even to the tiniest deviations. The case Δ_v = 0 matches the situation when the operator ignores the small deviations but engages actively in control over the system when the deviation becomes large enough. One may notice that if we set Ω_a(a − a_opt(x, v)) ≡ 1, system (2) describes following the optimal action strategy a_opt precisely by the operator whose reaction time is τ. As the human operator is not […] where, in analogy to (3), Δ_a ∈ [0, 1] indicates the presence of the action dynamical trap and a_th is the threshold in perceiving acceleration deviations from the optimal value. In order to reduce the number of system parameters we change the time and spatial scales as follows: […]. It is easy to check that in these dimensionless units the parameters ω and a_th are both equal to unity. Thus, the above expressions for Ω_a and a_opt take the form […].
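The ingredients described above can be collected into a short numerical sketch. The functional forms used here are assumptions made for illustration and are not necessarily the authors' ansatz: the cofactor is taken as Ω(z) = (z² + Δ z_th²)/(z² + z_th²), which equals Δ at z = 0 and tends to 1 for |z| ≫ z_th, and the optimal strategy is taken in dimensionless form as a_opt = −Ω_v(v)(x + σv). The names `trap_cofactor`, `a_opt`, and `rhs` are hypothetical.

```python
def trap_cofactor(z, z_th, delta):
    """Assumed trap cofactor: equals delta at z = 0, tends to 1 for |z| >> z_th."""
    return (z * z + delta * z_th * z_th) / (z * z + z_th * z_th)


def a_opt(x, v, sigma, v_th, delta_v):
    """Assumed dimensionless optimal strategy including the velocity dynamical trap."""
    return -trap_cofactor(v, v_th, delta_v) * (x + sigma * v)


def rhs(state, sigma, tau, v_th, delta_v=0.0, delta_a=0.0, a_th=1.0):
    """Right-hand side of the extended system:
    x' = v, v' = a, tau * a' = -Omega_a(a - a_opt) * (a - a_opt)."""
    x, v, a = state
    mismatch = a - a_opt(x, v, sigma, v_th, delta_v)
    omega_a = trap_cofactor(mismatch, a_th, delta_a)
    return (v, a, -omega_a * mismatch / tau)
```

Setting `delta_v = delta_a = 1.0` switches both traps off, so that `rhs` reduces to plain relaxation of a toward the linear feedback with reaction time τ.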

Dynamics of an oscillator with action dynamical trap
System (2)-(4) possesses a single equilibrium point, located at the origin. Linear stability analysis reveals that this equilibrium is stable for all values of the system parameters σ, τ, and Δ_a such that τ < Δ_a σ. If the effect of the action dynamical trap is absent, Δ_a = 1, the system is stable for τ < σ, i.e., when the operator reaction time τ is relatively small and (or) the capability of suppressing the velocity deviations σ is relatively high. When the action dynamical trap effect comes into play, Δ_a ≪ 1, system (2)-(4) is stable only if τ ≪ σ. This may be interpreted as follows: the operator cannot precisely maintain the desired state of the system unless the operator's reaction is almost immediate (τ ≪ 1) or the velocity feedback gain σ is extremely large. Moreover, when Δ_a reaches zero, the system governed by Eqs. (2)-(4) becomes unstable at the origin regardless of the values of the other parameters. It is notable that the system stability does not depend on the parameters Δ_v and v_th quantifying the intensity of the velocity dynamical trap and the velocity perception threshold, respectively.
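A stability condition consistent with the limiting cases stated above can be derived under explicit assumptions. The sketch below assumes the dimensionless strategy a_opt = −Ω_v(v)(x + σv) with Ω_v(0) = Δ_v > 0 and Ω_a(0) = Δ_a > 0; these forms are assumptions chosen for illustration, not taken from the text. Linearizing ẋ = v, v̇ = a, τȧ = −Ω_a (a − a_opt) at the origin and applying the Routh-Hurwitz criterion for a cubic gives:

```latex
\begin{aligned}
&\tau\,\dot{a} = -\Delta_a\bigl(a + \Delta_v x + \Delta_v \sigma v\bigr)
\;\;\Longrightarrow\;\;
\tau\lambda^{3} + \Delta_a\lambda^{2} + \Delta_a\Delta_v\sigma\,\lambda + \Delta_a\Delta_v = 0, \\[4pt]
&\text{Routh-Hurwitz: } \;
\Delta_a \cdot \Delta_a\Delta_v\sigma > \tau \cdot \Delta_a\Delta_v
\;\;\Longleftrightarrow\;\;
\tau < \Delta_a\,\sigma .
\end{aligned}
```

Under this assumed form Δ_v cancels and v_th never enters the linearization, in line with the statement that stability is independent of both. If the trap cofactor multiplied only the velocity term, the same calculation would give τ < Δ_a Δ_v σ, which does depend on Δ_v. For Δ_a = 0 or Δ_v = 0 the linearization degenerates and this criterion no longer applies directly.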
In the present paper we focus on the operator affected by both velocity and acceleration dynamical traps: Δ_v = Δ_a = 0. We also comment briefly on the case when the operator is perfectly rational in controlling either velocity (Δ_v = 1) or acceleration (Δ_a = 1) deviations. The intermediate values of the parameters Δ_a and Δ_v far from the boundary values have in fact little physical meaning, corresponding to the hypothetical case when the operator stays focused on controlling the small deviations, but at the same time applies reduced effort in doing so. For this reason, although the results below hold for any Δ_a and Δ_v, we refrain from detailed analysis of the system dynamics in the case of 0 < Δ_a, Δ_v < 1.
We analyzed the behavior of system (2)-(4) numerically under the adopted assumptions for various values of the system parameters. The absolute and relative error tolerance parameters of the routine used for the numerical simulations were chosen in a way that varying them tenfold could not affect the results of the simulations. The initial conditions for simulations were formed by assigning small random values to the phase variables. We observed two major patterns of the system dynamics depending on the parameters σ, τ, and v_th. The system either performs periodic oscillations or becomes uncontrollable by the operator, with all phase variables exhibiting unbounded growth.
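A minimal sketch of such a numerical experiment is given below. It is not the authors' routine: it uses a fixed-step fourth-order Runge-Kutta scheme rather than an adaptive solver with error tolerances, a fixed small initial displacement rather than random initial values, and assumed functional forms (cofactor Ω(z) = (z² + Δ z_th²)/(z² + z_th²), strategy a_opt = −Ω_v(v)(x + σv), a_th = 1). The function names and the blow-up cap are hypothetical.

```python
def cofactor(z, z_th, delta):
    # Assumed trap cofactor: delta at z = 0, close to 1 for |z| >> z_th.
    return (z * z + delta * z_th * z_th) / (z * z + z_th * z_th)


def rhs(s, sigma, tau, v_th, delta_v, delta_a):
    # Extended phase space (x, v, a) in assumed dimensionless units (a_th = 1).
    x, v, a = s
    mismatch = a + cofactor(v, v_th, delta_v) * (x + sigma * v)  # a - a_opt
    return (v, a, -cofactor(mismatch, 1.0, delta_a) * mismatch / tau)


def rk4_step(s, dt, *p):
    # One classical fourth-order Runge-Kutta step.
    k1 = rhs(s, *p)
    k2 = rhs(tuple(si + 0.5 * dt * ki for si, ki in zip(s, k1)), *p)
    k3 = rhs(tuple(si + 0.5 * dt * ki for si, ki in zip(s, k2)), *p)
    k4 = rhs(tuple(si + dt * ki for si, ki in zip(s, k3)), *p)
    return tuple(si + dt * (a + 2 * b + 2 * c + d) / 6
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))


def simulate(sigma, tau, v_th, delta_v=0.0, delta_a=0.0,
             t_end=200.0, dt=0.01, cap=1e3, s0=(0.01, 0.0, 0.0)):
    """Return (label, v_max): label is 'unbounded' if any |variable| exceeds cap,
    otherwise 'bounded'; v_max is the largest |v| over the second half of the run."""
    s, v_max = s0, 0.0
    n = int(round(t_end / dt))
    for i in range(n):
        s = rk4_step(s, dt, sigma, tau, v_th, delta_v, delta_a)
        if max(abs(c) for c in s) > cap:
            return "unbounded", float("inf")
        if i >= n // 2:
            v_max = max(v_max, abs(s[1]))
    return "bounded", v_max


if __name__ == "__main__":
    # Parameter values quoted in the text for the limit-cycle example.
    print(simulate(sigma=1.0, tau=0.9, v_th=0.2))
```

Scanning `simulate` over a grid of (τ, σ) and recording `v_max` gives an amplitude map of the kind described in the text; whether the sketch reproduces the reported patterns depends on how close the assumed forms are to the actual model.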
Generally, the periodic behavior can be observed when the operator response latency τ is in some sense small and (or) the feedback gain σ is relatively large. The form of the resulting limit cycle is almost independent of the particular values of these parameters. Figure 4 represents the example of the limit cycle found for σ = 1, τ = 0.9, v_th = 0.2. The fragment of the acceleration time pattern a(t) corresponding to this phase portrait is depicted in the top frame of Fig. 5, as is the evolution of the optimal action strategy a_opt(t). As clearly seen, the implemented action strategy remains in the vicinity of the optimal one. When the difference between these two strategies becomes sufficiently small, the rate of change of the acceleration is also small. This reflects the fact that under this condition the operator hardly changes the control variable, a, for a certain period of time. However, when the deviation from the optimal action strategy becomes large, the operator behavior becomes active and the actual acceleration changes quickly. This is also reflected in the bottom frame of Fig. 5, where the time pattern of the dynamical trap cofactor Ω_a is represented. The values of Ω_a near unity correspond to the periods of active acceleration growth or decrease, while the stagnation of a is characterized by values of Ω_a close to zero. When a − a_opt(t) becomes large, Ω_a "switches on," and the operator starts to actively control the system. Occasionally only little effort is needed to adjust the current control strategy to the optimal one (see, for instance, t ≈ 156 in Fig. 5); however, sometimes the operator has to correct the actions substantially (e.g., t ∈ [160, 161]).
As τ grows and (or) σ decreases, the amplitude of the oscillations increases and eventually the periodic pattern evolves to uncontrolled motion. Figure 6 demonstrates the dependency of the system velocity amplitude on these two parameters (for some fixed value of v_th). One can note that for any fixed value of τ the oscillation magnitude monotonically decays with σ. Indeed, the better the operator can handle the velocity deviations, the closer the system is to the desired state. Similarly, the limit cycle shrinks as τ decreases: as the operator reaction becomes faster, it becomes easier to keep the system near the desired position. On the contrary, operators with large τ and small σ are not capable of controlling the system. Such operators destabilize the system by unintelligent actions, so the phase variables grow without bound.
As can be seen from Fig. 6, there is a boundary in the parameter space (τ, σ) at which the transition from the periodic behavior pattern to unbounded growth occurs. We found that this boundary depends essentially on the parameter v_th. Figure 7 indicates that larger values of v_th yield a larger area of the parameter space corresponding to the periodic motion around the origin. This may be explained in the following way. An operator characterized by the relatively large response delay τ may destabilize the system by the delayed and therefore improper actions when trying to act continuously, i.e., to compensate for the tiniest deviations (v_th close to zero). However, the operator neglecting small velocity deviations (v_th of order unity) can handle maintaining the bounded motion of the system, even at the cost of increased motion amplitude. This situation corresponds well to the experimental findings on human control of an inverted pendulum, where continuous control is less efficient in comparison to discontinuous, or intermittent, control [7,23]. In particular, in balancing an inverted pendulum human operators who ignore small deviations from the vertical position perform better than the operators trying to react to every detectable deviation [23].
Fig. 6. Amplitude v_max of the system velocity oscillations depending on the parameters τ and σ. The brightness of each point is associated with the numerically calculated motion amplitude of the system (2)-(4) for parameters v_th = 0.2, Δ_a = Δ_v = 0. The dark gray region corresponds to unbounded motion of the system.
Finally, we make some remarks on the cases of Δ_v = 1 and Δ_a = 1. When the velocity dynamical trap is absent (Δ_v = 1 or v_th = 0), the system properties described above remain essentially the same, with only minor changes in the form of the limit cycle presented in Fig. 4. The impact of the acceleration dynamical trap is much higher: its absence (Δ_a = 1) makes the operator able to precisely stabilize the system at the origin, instead of just maintaining it in some bounded area (although the operator reaction time τ should still be relatively small with respect to σ).
The following summarizes the discovered dynamical properties of the proposed model:
• the operator is basically unable to precisely stabilize the system under control;
• a skilled operator can maintain the oscillations of the system in the vicinity of the desired position;
• an unintelligent operator fails to control the system, destabilizing it by imprecise actions;
• increasing the range of acceptable states may help even the unintelligent operator to get control over the system.

Discussion
We have tackled the problem of the mathematical description of human-controlled systems. We have built on the concept of dynamical traps [18-20] that matches the modern paradigm of discontinuous human control [8] and appeals to the existence of a certain region of acceptable states near the desired phase space position. The present paper argues that the dynamical trap notion is more general, and extends it to the operators' perception of their own actions. A human operator controlling a dynamical system is usually not capable of selecting or calculating the optimal action strategy that allows it to reach and maintain the desired end-state or goal. However, during the control process the operator is able to realize that the currently implemented strategy deviates from the optimal one if this deviation becomes large enough. Once aware of the mismatch, the operator can adjust the actions until she feels that the current value of the control parameter is acceptable. In order to capture this feature of human cognition, we extend the phase space of the dynamical system under human control with the control parameter as an independent phase variable. This enables us to introduce a certain region alongside the optimal strategy in the space of all action strategies; each strategy within this region is treated as acceptable by the operator. The latter region is called the action dynamical trap.
We have studied an example that describes the behavior of a human operator trying to control a simple dynamical system. The results of the theoretical and numerical analysis of the developed model correspond well to the basic properties of human-controlled systems. In particular, we have elucidated the fact that it is mainly the operator reaction time and capability of suppressing the velocity deviations that determine the system behavior. The system can hardly be precisely stabilized even by operators with exceptional abilities. Generally, the system is boundedly stable, exhibiting periodic oscillations around the equilibrium, or may even be completely destabilized by the actions of the operator.
As the problem at hand is concerned with car following, we feel it necessary to note the following. First, the developed model appeals to the notion of an oscillator in an extended phase space including acceleration as an independent phase variable. Most of the car-following models, however, are of type (1) and consider driver behavior to be governed by two stimuli: the necessity of maintaining a safe distance and controlling the relative velocity [24]. They operate, in particular, with such notions as the optimal velocity for a given headway and velocity difference. The linearization of the corresponding governing equations gives rise to the harmonic oscillator model which can capture a number of fundamental properties of car dynamics [25-27]. Some of these models explicitly allow for the delay in human reaction and take the form of delay differential equations. For example, the optimal velocity model with delay [28] relates the current car acceleration a(t) with the headway distance h(t − T) and the car velocity v(t − T) taken at the time shifted to the past by the delay time T, or, equivalently, {h(t), v(t)} → a(t + T). In this case, expanding a(t + T) into the Taylor series with respect to T within the first-order accuracy, i.e., using the approximation

$$a(t + T) \approx a(t) + T\,\frac{da}{dt},$$

we can reduce the optimal velocity model with delay to a model operating with the car jerk da/dt as a certain function of the headway distance h, velocity v, and acceleration a. It should be noted that exactly this approach was used in establishing the relation between the first-order Newell's car-following model and the previous second-order model without delay [24]. Second, there are a number of psychological and action point models taking into account the bounded capacity of drivers in recognizing small variations in the car velocity and headway distance [29]. They appeal to the human perception thresholds determining the moments when drivers start to correct the motion of their cars.
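The reduction described above can be sketched as follows, writing the delayed model generically as a(t + T) = F(h(t), v(t)). The function F is a placeholder introduced here for illustration; its concrete form is not specified in the text.

```latex
a(t+T) = F\bigl(h(t), v(t)\bigr), \qquad
a(t+T) \approx a(t) + T\,\frac{da}{dt}
\quad\Longrightarrow\quad
T\,\frac{da}{dt} = F(h, v) - a .
```

In this form the acceleration relaxes toward F(h, v) on the time scale T, which has the same structure as the relaxation of a toward a_opt with reaction time τ when the action trap cofactor is identically one.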
This concept is directly related to the notion of dynamical traps discussed in the present paper.
The previous studies on the dynamical trap effect have reported on various complex cooperative phenomena in the ensembles of coupled oscillators. Still, the interaction of at least three coupled oscillators was required for the unstable dynamics to emerge without noise effects. The presented non-Newtonian model demonstrates that even the simple isolated oscillator may exhibit non-trivial behavior in the presence of the action dynamical trap.
The model still requires extensive further development. First of all, the stochastic nature of human control has to be taken into account. The conventional approach to this problem, as mentioned previously, is to include an additive or multiplicative Gaussian noise term in the feedback of the human operator. Such noise typically substitutes for all the unaccounted minor features of the modeled system, as well as external random events of small magnitude. However, we feel that this approach may be inappropriate when one wishes to capture the intrinsic stochasticity of human actions. In the present model the corrective actions of an operator are caused by the controlled variable (a − a_opt) exceeding a certain threshold, which is completely deterministic (despite being fuzzy). In other words, the operator reacts in exactly the same way to the same value of the system deviation over and over again. We believe that a probabilistic description of the reaction threshold should be developed in order to build a more comprehensive model of human control. Another issue is that the control effort generated when the threshold is crossed is again defined in a deterministic way, being a simple linear feedback ȧ ∝ −(a − a_opt). Experimental findings on the stick balancing task reveal that instead of a feedback control human subjects produce open-loop, ballistic-like corrective movements [30]. The amplitude of such movements also appears to be probabilistic, which should be given due consideration as well.
Second, we would like to remark that the independence of the two dynamical traps in perceiving the velocity and acceleration deviations is in fact only the zeroth-order approximation. Presumably, the operator internal apparatus recognizing the situations which require active behavior should be regarded as a single mechanism.
Although we mentioned some empirical facts supporting the presented concepts, the action dynamical trap model is still very abstract. It was designed not to mimic a particular real-world system, but rather to highlight that the extended phase space may enable one to capture intricate dynamical properties of human-controlled systems. The ideas of this proof-of-concept study ought to be developed further into more concrete models of specific human-controlled processes which should be in turn confronted with experimental data. We believe the phase space extension approach proposed here is rather general and may be employed in a wide class of models where human actions are of primary importance.