## Abstract

**Motivation:** We present a spatial-temporal (ST) human immunodeficiency virus (HIV) simulation model to investigate the spatial distribution of viral load and T-cells during HIV progression. The proposed model uses the Finite Element (FE) method to divide a considered infected region into interconnected subregions each containing viral population and T-cells. HIV T-cells and viral load are traced and counted within and between subregions to estimate their effect upon neighboring regions. The objective is to estimate overall ST changes of HIV progression and to study the ST therapeutic effect upon HIV dynamics in spatial and temporal domains. We introduce sub-regional (spatial) parameters of T-cells and viral load production and elimination to estimate the spatial propagation and interaction of HIV dynamics under the influence of a 3TC D4T Reverse Transcriptase Inhibitors (RTI) drug regimen.

**Results:** In terms of percentage change standard deviation, we show that the average rate per 10 weeks (throughout a 10-year clinical trial) of the ST CD4+ change is 5.35% 1.3 different than that of the CD4+ rate of change in laboratory datasets, and the average rate of change of the ST CD8+ is 4.98% 1.93 different than that of the CD8+ rate of change. The half-life of the ST CD4+ count is 1.68% 3.381 different than the actual half-life of the CD4+ count obtained from laboratory datasets. The distribution of the viral load and T-cells in a considered region tends to cluster during the HIV progression once a threshold of 86–89% viral accumulation is reached.

**Availability:** Executable code and data libraries are available by contacting the corresponding author.

**Implementation:** C++ and Java have been used to simulate the suggested model. A Pentium III or higher platform is recommended.

**Contact:**george.towfic@clarke.edu

**Supplementary information:** Supplementary data are available at *Bioinformatics* online.

## 1 INTRODUCTION

To estimate the effect of change of human immunodeficiency virus (HIV) dynamics (drug regimens, T-cell count and viral load) on HIV progression, many top–down mathematical models, based on the solution of a set of Ordinary Differential Equations (ODE), have been proposed (Kirschner, 1999; Muller *et al.*, 2001; Perelson, 2001; Stafford *et al.*, 1999). One drawback of this approach is that it is restricted by the ODE's ability to estimate temporal but not spatial changes of system variables.

Another possible approach to simulate HIV progression is the bottom-up approach based on Agent-Based (AB) models (Teweldemedhin and Marwala, 2004) and Cellular Automata (CA) models (Neumann, 1966). This approach can be used to construct and understand the components of a complex system by assigning different agents (prototypes), developed in different ways, to simulate a considered model (such as an HIV progression model). During the development process, agents cooperate to reward successful agents and impede or eliminate unsuccessful agents. Although this approach can be used to study HIV progression at the population level (Teweldemedhin and Marwala, 2004), it has not been implemented at the cellular level since it is very difficult to keep updating agents and their spatial connections when the system dynamics rapidly change during the acute phase of HIV progression (Callaghan, 2005). Incorporating a drug resistance factor into the progression process adds more randomness and hence significant complexity to the AB and CA models.

In this work, we extend our previous research (Draghici *et al.*, 2003) in which an Independent Component Analysis (ICA) algorithm had been used for two purposes: (1) to generate identical patterns of HIV dynamics and (2) to eliminate redundant noise in each pattern to produce reliable datasets from available laboratory data.

We use the refined CD4+ and CD8+ T-cell counts and viral load acquired from the ICA algorithm to develop a Finite Element (FE) model that traces the motion of T-cells and viral load in a considered lymph domain. The objective of this spatial-temporal (ST) tracking is to provide an understanding of the internal distribution of the T-cells and the HIV viral load during the HIV life cycle (an example of the distribution is depicted in Fig. 5). A distribution profile can be used in different applications that require HIV modeling. Examples of such applications include: (1) predicting the spatial virion rate of flow to be used in drug design (including estimated drug resistance and spatial therapeutic effect), (2) predicting viral load and T-cell interaction and (3) predicting overall HIV progression in terms of its local constituencies.

We compared the analytical results obtained from the model with laboratory datasets obtained from the University of Iowa, the University of Wisconsin and Stanford University (Rhee *et al.*, 2003). We show that the ST tracking of HIV progression provides useful information that can be used to understand and control the overall progression of major HIV dynamics including the progression of CD4+ and CD8+ T-cell counts and viral load, and therapeutic effect.

## 2 PATIENTS AND METHODS

We used 1500 patient records. From the three libraries, we selected datasets that yielded 93.7% similarity, in terms of number of treatment trials, type of regimens and CD4+ on-therapy baseline.

Patients using the 3TC D4T Reverse Transcriptase Inhibitors (RTI) regimen have been selected due to the availability of more trials for this regimen. Using the resulting refined datasets, a mathematical model was developed, estimating important parameters (such as base CD4+ and CD8+ T-cell counts and viral load, during different trials) determined by the three HIV clinics and available in the three considered datasets. The proposed mathematical model projected the change of the CD4+ and CD8+ T-cell counts and viral load in terms of their rate of production, elimination and therapy duration. Validation and accuracy of the model was assessed by comparing the analytical results of the model with that of the patients’ records. Figure 1 illustrates the main steps of our simulation. We divide a proposed region of volume 1000 ml^{3} into 1000 FE subregions and track the HIV dynamics in each subregion. The temporal change of each subregion is calculated during the treatment duration, using the Boltzmann equation (R D Kingdon and White, 1992), by determining the change of the HIV dynamics due to the flow of T-cells and viral load from one neighboring subregion to another.

**Fig. 1.**

**Fig. 1.**

## 3 SYSTEM EQUATIONS

We consider the T-lymphocytes and viral load as a viscous, incompressible fluid occupying, at time *t*, a bounded rectangular region ℜ^{2} with boundary Γ. In constructing the governing differential equations, we consider the Eulerian conservative form of the continuity equation to preserve mesh size and dimensions (Douglas *et al.*, 2000). The rate of change of viral load and T-cells (calculated from Equation 1) will be used in the FE model (Equations 7–9) to calculate the distribution densities of the T-cells and viral load.

*u.n*= 0 and ∇ν.

*n*= 0. In Equation 1,

*u*and ν are functions of the spatial Cartesian coordinates

*x*and

*y*and the time coordinate

*t*. Figure 2 displays the actual and calculated CD4+ and viral load values in 10-week time periods throughout 10 years of HIV progression. In this model,

*u*represents the viral load and ν represents the CD4+ count as depicted in Figure 2. The vector

*n*is normal to the gradient operator. The gradient ∇ is defined as:

**Fig. 2.**

**Fig. 2.**

The following formulas and data values are derived from the laboratory datasets by calculating the average change of viral load, CD4+ and CD8+ T-cell counts over time and are consistent with the mathematical model and data values suggested by Nowak and May, pp. 17–21 (Perelson, 2001)and Stafford (Stafford *et al.*, 1999). In Equations 2–5, the superscripts *n* and (*n*+1) represent the number of the current and next time steps, respectively.

The variables *u*, ν, ω and *ϕ* are the rate of viral change due to therapy, the rate of CD4+ change with respect to time, the rate of CD8+ change with respect to time, and the overall rate of change in viral load and CD4+ and CD8+ count respectively. The considered initial values (measured in days) are: the rate of viral load production, β = 0.001; the rate of natural viral load elimination, γ = 7.4 × 10^{−4}; the rate of viral load elimination due to therapy κ = 3.1 × 10^{−2}; the uninfected cell production rate ε_{2p} = 0.01; the uninfected cell elimination rate ε_{2e} = 0.007; the average weight of virus effect as time changes is *r* = 0.001; the average rate for a virus to locate a CD4+ cell is η = 1.09 and δ = 0.2, is the rate of production of immature T-cells that will not contribute in the production of new T-cells. The CD8+ count at a new time step is calculated using the CD8+ count at the current time step, the rate of production, τ_{1} = 0.001, and the rate of elimination of CD8+ cells, τ_{2} = 0.018. ε is the change in the number of CD4+ cells at a given time (ν^{(n)} − ν^{(n−1)}).

In the above systems of equations (Equations 2–5), the value of the HIV variables at an advanced time step is estimated according to the existing value of the considered variable plus the extra values that are added to the system due to the considered production rate of that variable minus the eliminated values due to natural death or therapy. We discuss the meaning of individual terms of Equations 2–5 in the appendix.

We considered the following FE trial function (Douglas *et al.*, 2000):

*et al.*, 1996; Chen and Raju, 2003) is used to calculate the following FE weak solution: The approximation value

*ϕ*

_{j}(

*x, y*) is considered to be weak since it is defined at each element of the finite element mesh rather than the entire ℜ

^{2}domain (i.e. the solution of the system of Equation 1 is a combination of local solutions calculated at each finite element). For any variable, η,

*d*

_{j}(η) = ||η−η

_{j}||

_{2}. For example, when η represents the

*x*coordinate at the center of a considered element, η

_{j}represents the

*x*coordinate of nodal point

*j*in that element. In the above equation,

*dm*

_{j}is a support parameter (Douglas

*et al.*, 2000). We consider a constant mesh distance |

*x*-

*x*

_{j}| = |

*y*-

*y*

_{j}| = h. The continuous, infinite dimensional state space solutions given by the original partial differential equation (Equation 1) is replaced with a discrete finite set of solutions associated with the considered weight function,

*ϕ*

_{j}(

*x, y*). This leads to the following formula (where

*c*

_{j}(

*t*),

*j*= 1, 2 , … ,

*n*is an approximation to the solution

*u*

_{j}(

*t*),

*j*= 1, 2 , … ,

*n*, at the

*j*-

*th*node of the FE region):

Substituting the above formula in Equation 1 and integrating in each spatial coordinate yields:

where The forcing source*u*(

*x, y, t*) in Equation 1 is thus distributed on each node of the finite element region and is considered as part of the approximation function

*c*(

*t*). The values of the spline functions

*ϕ*

_{j}(

*x, y*) will be different in the calculation of

*OM*and

*OS*depending on the function

*d*(η) = || η − η

_{j}||

_{2}. The Neumann boundary condition ∇

*ϕ.n*= 0, is satisfied automatically within the Galerkin and variational formulations of Equations 7–10. This means that the integral term associated with boundary points vanishes and we obtain the standard Galerkin finite element formulation. The vectors C

_{v}(

*t*) and C

_{c}(

*t*) depicted in Equations 7 and 8 represent the calculated values of the viral load and T-cells at time,

*t*, in each FE region.

## 4 MODEL EVALUATION

To evaluate our suggested mathematical model, we first compared the analytical datasets obtained from the mathematical model with those recorded in the three considered HIV libraries. Figure 3 depicts the result of the comparison obtained for 500 patients with an on therapy CD4+ count of 300 and on-therapy baseline HIV RNA of 4.1 (measured in Log_{10}) copies/ml.

**Fig. 3.**

**Fig. 3.**

The square boxes in Figure 3 represent the average of the calculated CD4+ cells for a 10-year period. The solid line represents the average values obtained from the three laboratory datasets for the same time period. Therapy at Figure 3 started at the 967-th day of viral infection and ended at the 2817-th day. The figure shows that there was approximately 30 days delay for the therapy to take effect and approximately 63 days for the therapy to lose its effect in restraining the viral progression (this is shown in the figure as a surge in the CD4+ cells after the continuous decline of CD4+ cells during the first 63 days). The analytical datasets and the laboratory datasets for the 500 patient records differ by approximately 2.4 SDs.

Using a confidence interval table for 500 data points, this accounts for about 92% confidence in the proposed finite element model. Given that the three libraries are completely independent in measuring patient datasets, and the results obtained from the ST model (depicted in Fig. 2), we concluded that the derived Finite Element model can be used for further analysis to estimate the effect of spatial viral propagation and drug resistance on the overall HIV progression. Using the considered datasets and in terms of mean SD, we calculated the difference of the average value of the actual laboratory CD8+ count to be 216.413 ± 30.568 and the average CD8+ ST values as 250.586 ± 33.649, which confirms the model's robustness in predicting the change in the T-cells and viral load. Using the datasets in Figure 2, we calculated the difference between the average rate of change of the actual laboratory CD4+ count and the calculated ST CD4+ count to be 5.35% ± 1.3 and the difference between the average rate of change of the actual laboratory CD8+ dataset and the calculated ST datasets to be 4.98% ± 1.93. To test the convergance of the FE model with respect to different mesh sizes, we used ANOVA. Table 1 shows comparison between the analytical data obtained from four finite element mesh sizes of 300, 500, 1000 and 1500. The fact that the *F* value between the four groups is approximately 0.001 indicates that there is no statistically significant difference between the CD4+ values calculated using the four mesh sizes. Figure 4 depicts the half-life average for both actual laboratory and ST change of 14 CD4+ on-therapy baseline values. Figure 4 shows that a reasonable *r*^{2} value of 0.913 is obtained from the polynomial *y* = −0.0058*x*^{3} + 0.2106*x*^{2} − 2.8029*x* + 18.613. This indicates that a polynomial fit can be used with 82% confidence to predict the half-life value of CD4+ count given a considered CD4+ baseline value. This result is consistent with that obtained by Drusano and Daniel (Drusano and Stein, 1998). The average difference between the spatial and overall half-life of the CD4+ count is 1.68% ± 3.381. This shows that there is no significant difference between the spatial and overall models in calculating the half-life for the CD4+ count.

**Fig. 4.**

**Fig. 4.**

**Table 1.**

Groups | Count | Sum | Average | Variance | |
---|---|---|---|---|---|

300 mesh | 33 | 11697 | 354.4545 | 1202.818 | |

500 mesh | 33 | 11641 | 352.7576 | 979.8769 | |

1000 mesh | 33 | 11654 | 353.1515 | 1015.008 | |

1500 mesh | 33 | 11792 | 357.3333 | 790.9792 |

Groups | Count | Sum | Average | Variance | |
---|---|---|---|---|---|

300 mesh | 33 | 11697 | 354.4545 | 1202.818 | |

500 mesh | 33 | 11641 | 352.7576 | 979.8769 | |

1000 mesh | 33 | 11654 | 353.1515 | 1015.008 | |

1500 mesh | 33 | 11792 | 357.3333 | 790.9792 |

ANOVA | |||||
---|---|---|---|---|---|

Source of Variation | SS | df | MS | F | P-Value |

Between groups | 33 | ||||

Within groups | 17976.15 | 3 | 5992.051 | 0.00987 | 0.0547 |

127637.8 | 128 | 997.1705 | F- Crit | ||

Total | 145614 | 131 | 2.675387 |

ANOVA | |||||
---|---|---|---|---|---|

Source of Variation | SS | df | MS | F | P-Value |

Between groups | 33 | ||||

Within groups | 17976.15 | 3 | 5992.051 | 0.00987 | 0.0547 |

127637.8 | 128 | 997.1705 | F- Crit | ||

Total | 145614 | 131 | 2.675387 |

## 5 RESULTS

Figure 5 shows three samples of viral and T-cell distribution obtained from the implemented FE model. The figure helps in understanding viral load progression under the 3TC D4T RTI regimen. The following observations can be inferred from Figure 5: (1) initial viral distribution is important in determining concentration of viral load and T-cells during the HIV progression. The second column shows that in 10 years, the viral load will spread rapidly: 57% faster than the third column's initial state and 73% faster than the distribution shown in the fourth column. (2) Viral load and T-cells tend to accumulate as they expand or shrink during HIV progression. When a therapy manages to break the viral load concentration (as in the third and fourth columns), the effect of HIV progression is reduced by a factor of *T*/*N* (*T* is the total viral load and *N* is the number of regions where viral load start accumulating). (3) Once 86–89% of the viral load becomes accumulated, symmetric expansion tends to dominate the HIV progression. (4) Viral load accumulation occurs when 86–89% viral load concentration is reached at any part of the infected region.

**Fig. 5.**

**Fig. 5.**

## 6 CONCLUSIONS

The objective of this research is to aid in understanding the spatial propagation of HIV dynamics (CD4+ and CD8+ T-cell counts and viral load) during HIV progression. The premise is that if a ST mathematical model can provide a reasonable estimate for the actual HIV progression, more datasets will be available for analysis per individual HIV patient. The distribution of viral load and T-cells during a considered progression time will also help in the drug design process by determining the pattern of viral progression and its continual interaction with T-cells. Treatment of other infectious diseases may also benefit from the proposed model in the spatial study of viral dynamics related to each infectious disease. This study used 1500 patient records from three clinical libraries, selecting datasets with 93.7% common trend in terms of CD4+ and CD8+T-cell counts and viral load change using the 3TC D4T Reverse Transcriptase Inhibitors (RTI) regimen. Based on a 300 on-therapy baseline CD4+ count and 4.1 HIV RNA copies (measured in log_{10}), we show that the CD4+ count is 23% ± 4.56 higher than that of the CD8+ count. The CD4+ count for the considered viral load is 360.206 ± 34.424 while the average value of the ST CD4+ count is 329 ± 38.872. We calculated the average value of the actual laboratory CD8+ count to be 216.413 ± 30.568 and the average Spatial-Temporal CD8+ count to be 250.586 ± 33.649. The above values assert the robustness of the proposed finite element model in predicting the progression of viral load and T-cells. The distribution patterns depicted in Figure 5 show that the finite element model provides insight into the complicated dynamic process of HIV progression. Figure 5 also shows that the initial viral distribution determines the pace of HIV progression. The viral load and T-cells tend to spread in groups during HIV progression once a threshold of 86-89% viral accumulation is reached.

## ACKNOWLEDGEMENTS

We acknowledge the valuable contribution of Dr Robert Shafer of Stanford University for providing us with password access to the Stanford HIV Database (http://hivdb.stanford.edu). We also thank Dr Sorin Draghici of Wayne State University for his valuable comments. Dr Thomas Lengauer and this article's anonymous reviewers made valuable suggestions and comments that significantly improved the content of this article.

This research was funded through a grant by the Maytag Foundation.

*Conflict of Interest*: none declared.

## REFERENCES

*Technical report, European Research Consortium for Informatics and Mathematics*

*Working Papers 99-05-036, Santa Fe Institute*

*In*Proceedings of the Fourth International Conference on Hybrid Intelligent Systems (HIS'04).

*IEEE Computer Society*

### APPENDIX

Equation 2 calculates the number of viral load in the next time step in terms of the values of different system parameters. The first term of the right-hand side of Equation 2 represents the number of viral load at the current time step. The second term measures the increase in viral load due to viral load production rate β. The third and fourth terms calculate the reduction of viral load due to rate of natural death (γ) and rate of therapy effect (κ), respectively. The final two terms in Equation 2 calculate the overall change (*ϕ*) in the system and the new value of CD8+ count (ω), respectively. Both *ϕ* and ω will be discussed in more detail below in the discussion of Equations 4–5. It is worthwhile to mention here that the CD8+ count is added to Equation 2 due to the effect of CD8+ count on the production of new CD4+ cells [some authors simply estimate the number of CD8+ count as 2/3 of the total CD4+ count (Haraba and Dolezal, 1996)]. After multiplying, the first two terms in the ratio of Equation 3 are the current CD4+ count and the new naturally produced CD4+ cells (ε_{2p}_{*} ν ^{(n)}) respectively. The last three terms in Equation (3) represent the number of CD4+ T-cells that will be eliminated from the system due to natural death (ε_{2e}_{*} ν^{(n)}), the number of CD4 -cells that will be eliminated as viral loads detect and infect them at the considered time step (*r*ην^{(n)}), and the number of immature CD4+ cells that will not contribute in the production process of new T-cells δν^{(n)}. The variable ε (the change in number of CD4+ cells in consecutive times) is used to simulate the biological response of the immune system to produce more CD4+ cells when the current number of CD4+ cells reduces and vice versa. As ε → 1, the newly calculated CD4+ count stays the same. However, when ε > 1, the production of the new CD4+ count is restricted by the value of ε. If the value of ε < 1, the production of the new CD4+ cells will decelerate according to the value of ε.

In Equation 4, the overall system change (changes in number of CD4+ cells and viral load) is calculated as a ratio between viral elimination and viral production. The numerator in Equation 4 estimates: (1) accumulative effect of viral elimination and reduced effectiveness of viral load on the system (ψ_{1}). This is calculated using the rate of viral elimination on the system due to natural elimination, γ and rate of reduced effectiveness τ over time, (2) The effect of viral production on the CD4+ cells and viral load (ψ_{2}), and (3) The effect of viral elimination due to therapy (κ) and reduces effectiveness (τ) on the viral load. All the above three numerator terms are affected by the rate (η) at which viral load(s) locate a CD4+ cell. The denominator in Equation 4 simulates an HIV progression fact (Perelson, 2001), that every infected CD4+ cell will on average produce one other infected cell. Hence, like Equation 3, the denominator (*u*^{(n+1)}ν^{(n+1)}) will accelerate or decelerate the production of T-cells and viral loads according to the overall possible viral loads that can be generated by the system. Equation (5) calculates the CD8+ count at next time step using the CD8+ production rate τ_{1} and the CD8+ elimination rate τ_{2}.

## Comments