## Abstract

**Motivation:** Bicoid protein molecules, translated from maternally provided *bicoid* mRNA, establish a concentration gradient in *Drosophila* early embryonic development. There is experimental evidence that the synthesis and subsequent destruction of this protein is regulated at source by precise control of the stability of the maternal mRNA. Can we infer the driving function at the source from noisy observations of the spatio-temporal protein profile? We use non-parametric Gaussian process regression for modelling the propagation of Bicoid in the embryo and infer aspects of source regulation as a posterior function.

**Results:** With synthetic data from a 1D diffusion model with a source simulated to model mRNA stability regulation, our results establish that the Gaussian process method can accurately infer the driving function and capture the spatio-temporal dynamics of embryonic Bicoid propagation. On real data from the

**Contact:**m.niranjan@southampton.ac.uk

**Supplementary information:**Supplementary data are available at *Bioinformatics* online.

## 1 INTRODUCTION

Patterning, during embryonic development, is thought to be regulated by a class of molecules known as morphogens, which propagate spatially and establish concentration gradients. As early as 1952, Turing (1952) hypothesized that a reaction-diffusion mechanism might form the basis of such differential concentrations. Wolpert (1969) further developed this view as a universal mechanism of spatial pattern formation arising from position-dependent cell fate. Subsequently, Driever and Nüsslein-Volhard (1988a, b) discovered the role of the Bicoid morphogen in early embryonic development of the fruit fly *Drosophila melanogaster*. Bicoid sets up a concentration gradient along the anterior-posterior (A–P) axis of the embryo and regulates several downstream gap genes including Hunchback and Krüppel, which are implicated in segmentation along the A–P axis (Driever and Nüsslein-Volhard, 1989; Ephrussi and Johnston, 2004; Houchmandzadeh *et al.*, 2002; Johnston *et al.*, 1989; Struhl *et al.*, 1989). The *bicoid* mRNA is maternal, deposited at the anterior pole of the embryo and translation of this mRNA is thought to begin immediately after egg deposition (Driever and Nüsslein-Volhard, 1989) with the resulting protein propagating towards the posterior end. Most of the computational and experimental work on the Bicoid morphogen is concerned with the establishment of the steady state concentration gradient, and the sensitivity with which a threshold on it may be sensed for downstream gene expression (Gregor *et al.*, 2005, 2007a, b; Hecht *et al.*, 2009; He *et al.*, 2010; Holloway *et al.*, 2006; Houchmandzadeh *et al.*, 2002; Löhr *et al.*, 2010). Systematic computational modelling of downstream gap gene circuits, and how their non-linear interactions result in segment boundary formation have also been published in Ashyraliyev *et al.* (2009); Jaeger *et al.* (2004); Reinitz and Sharp (1995), and include steady state exponential Bicoid expression profiles as a regulating input. In an alternate view of the dynamics of this system, Bergmann *et al.* (2007) suggested that much of the desirable decoding properties of the steady state profile can also be realized during the pre-steady-state stages.

A particularly novel insight into the process of Bicoid translation comes from the experimental work of Surdej and Jacobs-Lorena (1998). These authors suggested that the stability of the maternal mRNA may be systematically regulated; i.e. kept stable for a period of time during which mRNA is translated and morphogen synthesized, and subsequently rapidly killed off by some active processes. This observation, of course, matches our natural expectation as there is no need for the organism to continue to produce Bicoid protein beyond the point in time when it is decoded. However, to our surprise, modelling literature over the 30 years since the discovery of Bicoid ignore this possibility and assume a constant production rate at the anterior pole. We showed recently (Liu and Niranjan, 2009, 2011) that it is possible to model Bicoid production in a manner similar to Surdej and Jacobs-Lorena (1998)'s experimental findings and computationally extract the time at which mRNA decay begins, and the rate at which it is killed off, to match data measured on real fly embryos and archived in the

*et al.*, 2009). We used an explicit model of mRNA stability regulation and a least squares fitting procedure between model output and observed data.

Bayesian inference has been shown to be useful in a range of applications including systems biology and bioinformatics. Successful examples range from the identification of gene regulatory networks by dynamic Bayesian networks (Husmeier, 2003; Peer *et al.*, 2001; Perrin *et al.*, 2003), network inference using informative priors (Mukherjee and Speed, 2008), inference of transcription regulation using a state space model (Sanguinetti *et al.*, 2006a, b), approximate inference using variational methods for the spatio-temporal Bicoid system (Dewar *et al.*, 2010) and parameter estimation using Monte Carlo simulations to understand stochastic dynamics of bacterial gene regulation (Wilkinson, 2011).

In this article, we build on these algorithmic foundations and apply the non-parametric probabilistic approach of Gaussian Process (GP) regression (Rasmussen and Williams, 2006) to address the problem of modelling Bicoid morphogen propagation. The GP model is the tool of choice for regression problems characterized by the need to model uncertainties and deal with unobserved data in a systematic way, both of which are aspects of our problem. The approach has been demonstrated in a wide range of applications in the machine learning literature. Specifically, Gao *et al.* (2008); Lawrence *et al.* (2007) and Honkela *et al.* (2010) studied interesting problems in biological modelling with GPs. The pioneering work on this is due to Lawrence *et al.* (2007) in which GPs were shown to be very effective in inferring target genes regulated by the tumour suppression transcription factor p53, providing an efficient alternative to the Monte Carlo approach advanced earlier by Barenco *et al.* (2006). Later publications have further confirmed the validity of the approach on other datasets of gene expression time series.

Apart from Dewar *et al.* (2010), the work on Bayesian methods mentioned above addressed purely temporal phenomena. Our work, on the other hand, is on a spatio-temporal problem and differs from Dewar *et al.* (2010) in the sense that we are interested in inferring the source driving function that corresponds to the stability regulation observed by Surdej and Jacobs-Lorena (1998). Alvarez *et al.* (2009) describe a similar attempt at combining a data-driven model (GP) with a latent process explaining the known physics of a system. They consider the heat equation, which is a simplified form of a diffusion system, to model the spatio-temporal profile of pollution. In this work, we derive the computational strategy needed to extend Lawrence *et al.* (2007)'s work to deal with spatio-temporal problems, and demonstrate its application to Bicoid regulation modelling using both synthetic data and real dataset from

*et al.*, 2009). The results demonstrate the power of the method on synthetic and its limitations on real data. As such, ours is the first contribution that adapts the powerful algorithmic setting of non-parametric regression to tackle an important spatio-temporal inference problem in developmental biology.

## 2 METHODS

### 2.1 Bicoid spatio-temporal reaction-diffusion system

The partial differential equation, governing diffusion of Bicoid along a 1D anterior–posterior axis, we start from:

where*m*(

*x*,

*t*) is the morphogen concentration as a spatio-temporal function,

*D*, the diffusion constant and τ

_{p}, the half-life of the morphogen protein.

*f*(

*t*), the source, is the mRNA regulation function which we consider unknown and place a prior distribution over. A linear gain term

*S*

_{0}is included to allow scaling of the data to match observations. There is an implicit assumption of a one to one mapping between mRNA regulation and the corresponding protein production, which we believe is justified as there is no evidence available of either post-transcriptional or post-translational regulation of Bicoid in this developmental system.

We consider the embryo as consisting of *N* cubes along the A–P axis in order to derive a linear dynamical system model from the continuous spatial diffusion equation. The basic idea for this stems from the work of Erban *et al.* (2007) (see later). With this discretization into *N* cubes, the chemical reactions involved in the diffusion process are as follows:

*d*, where

*d*is given by

*d*=

*D*/

*h*

^{2}and

*h*is length of each cube. The second process in Equation (3) describes Bicoid protein degradation. Finally, Equation (4) is the translation of Bicoid proteins from the maternal mRNA with

*f*(

*t*) being the latent function that needs to be inferred.

### 2.2 Linear dynamical system for Bicoid profile

We implemented a model in which source production occurs in a smaller first cube with length *h*_{f} (5 μ*m*—the length of a nucleus) and the other *N*−1 cubes (*h*=10μ*m*) equally splitting the remaining A–P axis. Therefore, the rate constants are different between the first cube (*d*_{f}), where mRNA is produced and its stability regulated, and the other cubes (*d*):

In order to develop a linear dynamical system for Bicoid profile, following Erban *et al.* (2007), we rewrite the partial differential equation in Equation (1) as a system of ordinary differential equations for the morphogen concentration in each bin (*i*=1,…,*N*):

Defining **m**(*t*)=[*m*_{1}(*t*),…,*m*_{N}(*t*)]^{T}, the linear dynamical system for Bicoid reaction-diffusion system is then vectorized as:

**A**(

*N*×

*N*) is defined by: and source production rate

**s**=[

*S*

_{0},0,…,0]

^{T}.

The solution to Equation (10), **m**(*t*), giving the Bicoid spatio-temporal profile, is in terms of a matrix exponential and is given by:

**m**(0) is zero at the beginning of the embryo development.

### 2.3 Gaussian process modelling

We treat *bicoid* mRNA as a function drawn from a GP and extend it to the spatio-temporal application of Bicoid dynamical system. The GP prior for the latent mRNA regulation is defined by mean and covariance functions:

*k*

_{f,f}(

*t*,

*t*′) is given by squared exponential covariance function with the length scale

*l*: Because the right-hand side of Equation (11) is linear, as noted by Lawrence

*et al.*(2007) in the content of modelling transcription regulation,

**m**(

*t*) turns out to be a multivariate function drawn from a GP: The corresponding cross covariance function

**K**

_{m,m}(

*t*,

*t*′) is given by: The cross covariance function between

**m**(

*t*) and

*f*(

*t*) becomes These expressions can be derived analytically and the derivation details are shown in Supplementary Material.

### 2.4 Predictive distribution

Let **f**_{*} be a vector of *J*_{*} values of the source function at equally spaced time points, and **m**_{i}^{*} be the corresponding protein profiles at these instances in time, within the *i*-th cube along the A–P axis. Concatenating these, we define **h**_{*}=[**f**_{*},**m**_{1}^{*},…,**m**_{N}^{*}]^{T} corresponding to *J*_{*}+*NJ*_{*} test points. The corresponding training data consists of the morphogen values in the *N* cubes, taken at *J* points in time, and a fixed source value *f*, contained in the vector **h**=[*f*,**m**_{1},…,**m**_{N}]^{T} of dimension 1+*NJ*.

With the above notation, the mean and covariance of the posterior distribution are given by:

Each covariance matrix in Equations (17) and (18) is partitioned across the source and*N*Bicoid intensities in different cubes. Illustrating this for

**K**

_{h*,h}, Therefore, the dimensions of the covariance matrices

**K**

_{h*,h},

**K**

_{h,h}and

**K**

_{h*,h*}are (

*J*

_{*}+

*NJ*

_{*})×(1+

*NJ*), (1+

*NJ*)×(1+

*NJ*) and (

*J*

_{*}+

*NJ*

_{*})×(

*J*

_{*}+

*NJ*

_{*}), respectively. The observations, collected in vector

**y**of dimension 1+

*NJ*, are assumed to be corrupted by additive noise as where

*e*

_{i}(

*t*) are drawn from

*N*(0,σ

_{i}

^{2}(

*t*)). Hence, With the hyperparameter vector

**θ**

_{h}=[

*l*,σ

^{2}

_{f},σ

^{2}

_{11},…,σ

^{2}

_{1J},…,σ

^{2}

_{N1},…,σ

^{2}

_{NJ}], the likelihood is:

The observations are taken from

*l*, we maximized the above likelihood following Rasmussen and Williams (2006). Estimates of these hyperparameters are given in Supplementary Material. When simulating synthetic data, we added zero mean Gaussian noise of SD σ=0.1, and assumed it to be known.

For model parameter estimation, we evaluated least squared error between Bicoid ODE system [Equations (7)–(9)] output (*m*^{model}) and measured intensities from

*m*

^{data}) to estimate the parameters: diffusion constant

*D*(3 μ

*m*

^{2}/

*s*) and protein half-life τ

_{p}(86 min) (Liu and Niranjan, 2011):

Note we have estimated the hyperparameters using maximum likelihood and set the model parameters by least squares fitting. Ideally, one would like to exploit the elegance of the GP framework and estimate all these by maximum likelihood. To achieve this, we need analytical expressions for the covariance matrices and their derivatives. We discuss this further in the Supplementary Material.

Since the production concentration *S*_{0} is independent of *D* and τ_{p} and is linear of the model output, we differentiate the Equation (23) with respect to *S*_{0} and set it to zero. The calculation for *S*_{0} is given by:

## 3 RESULTS AND DISCUSSION

### 3.1 Inference of mRNA regulation function

We first assume that *bicoid* mRNA is localized and its regulation occurs only in the anterior pole, the first cube in our discretized model. Therefore, source production amplitude vector is given by

As in our previous work (Liu and Niranjan, 2009), to generate synthetic data, we implemented mRNA stability regulation by the function

which characterizes mRNA, and hence the production of Bicoid protein, to be stable and constant to time*t*

_{0}, followed by an exponential decay of time constant τ

_{m}. δ is the Kronecker delta function and Θ is the Heaviside step function. The parameter values taken from our previous work, estimated by a least squares fit between model output and

*t*

_{0}=144 min and τ

_{m}=9 min. The dashed lines in Figures 1B, 2B and 3B show the true source function according to our hypothesis on regulation [Equation (26)].

Figure 1A shows the synthetic training data using the ODE system of Equation (7)–(9) with additive noise over the entire developmental period of 0~200 min, with 20 equally spaced time points and 51 cubes along the A–P axis. Figure 1B shows the estimated mRNA regulation function, *S*_{0}*f*(*t*), from the GP approach. We see that the GP is able to recover the regulated source function quite well, though the resulting estimate is smoother. This is to be expected from a GP model, for which a function with a sharp discontinuity will have very low likelihood in the prior. Still, the decay beyond 2 h is very rapid. The posterior mean of the inferred Bicoid concentration profile from the model is shown in Figure 1C. The temporal dynamics of a morphogen gradient being established and then killed off is clearly present in the model output.

In the above, shown in Figure 1, we have used the synthetic data over the full developmental time scale of interest. However, in the

Figure 3 shows the behaviour of the GP model on real data from

As noted, the GP inferred source functions are smoother than hypothesized by our model. A consequence of smooth functions fitting the data is also that the temporal point at which mRNA begins to decay starts earlier. The rapid change between mRNA being translated and killed off is not explicitly modelled in the GP approach. Such rapid changes may well be better modelled in a probabilistic framework that explicitly incorporates switching behaviour, such as the two-state Markov Jump process (also known as a telegraph process) considered in Sanguinetti *et al.* (2009).

Figure 4 shows the predicted temporal profiles of the Bicoid at three adjacent spatial points along the embryo.^{1} The training data are also shown. The three rows in the figure correspond to cases illustrated in Figures 1–3. We see that at the level of the GP model generating the data, reasonably good fits are obtained. With real data, we see high uncertainties in regions where data are not present.

### 3.2 Non-localized maternal mRNA

Additionally, we show in Figure 5 cross sections of the spatio-temporal profiles, taken along the A–P axis at different developmental cycles. Here, we see that the exponential spatial decays of

In the literature, Spirov *et al.* (2009) and Little *et al.* (2011) have discussed the scenario in which maternal *bicoid* mRNA itself has a spatial gradient (i.e. non-localized). While Spirov *et al.* (2009) considered this mRNA spatial profile to be the main determinant of morphogen gradient, Little *et al.* (2011) concluded that >90% of total *bicoid* mRNA is within the anterior 20% EL, and this alone is insufficient to establish the required spatial gradient of the protein in the time available over the length scale of interest. We also simulated this possibility in our GP models, with maternal mRNA being spatially distributed in the first 10 of the 50 cubes (*h*=10μ*m*) with an initial exponentially decaying spatial profile. The corresponding results of the inferred mean source (now a spatio-temporal profile) and the GP mean morphogen profile are shown in Supplementary Figure S4B and C. We note that in this case, the onset of mRNA decay begins slightly earlier than for the point source in the first bin, which is to be expected since the mRNA spatial distribution contributes to the generation of morphogen upto 20% EL, and the destruction has to start earlier to compensate. As seen in Supplementary Figure S5, the fit to the data does improve with spatially distributed mRNA. We include this for completeness, showing that a GP model can be applied in a flexible way in this manner, but do not think that the results can resolve the differences discussed by Spirov *et al.* (2009) and Little *et al.* (2011).

## 4 CONCLUSION

In this contribution, we have shown that the non-parametric GP regression model can be effectively applied to the problem of inferring biologically useful information from the spatio-temporal distribution of the Bicoid morphogen in early Drosophila embryogenesis. Discretization of the spatial domain transforms the spatio-temporal problem into a dynamical system for which, with a GP prior imposed on the source, the solution can be obtained as a matrix exponential. With synthetic data obtained from a linear spatio-temporal dynamical system, our results show that the GP approach is able to accurately recover the driving input and model the Bicoid distribution. On real world data, our results also estimate a smoothed version of the driving input due to the data being available only during part of the developmental process, and yet the part of the source decay is fairly well estimated.

*Conflict of Interest*: none declared.

^{1}These cubes are chosen for illustration because these are locations where much of the variation is happening.

## REFERENCES

*Arxiv preprint arXiv:0704.1908*

## Comments