DifferentialRegulation : a Bayesian hierarchical approach to identify differentially regulated genes

uncertainty via a latent variable approach, where reads are allocated to their gene/transcript of origin, and to the respective splice version. We designed several benchmarks where our approach shows good performance, in terms of sensitivity and error control, versus state-of-the-art competitors. Importantly, our tool is flexible, and works with both bulk and single-cell RNA-sequencing data. Availability and implementation : DifferentialRegulation is distributed as a Bioconductor R package


Introduction
Bulk and single-cell RNA-sequencing (RNA-seq) data enable estimating the abundance of both (mature) spliced (s) and unspliced (u) (or precursor-) mRNA. These splicing dynamics have been previously studied  (Weiler et al., 2022): unspliced mRNA (u), containing both introns and exons, is transcribed from DNA (at rate α); then, splicing (at rate β) leads to spliced mRNA (s), which is eventually degraded (at rate γ).
from bulk data (Zeisel et al., 2011;Gaidatzis et al., 2015b); furthermore, in single-cell RNA-seq (scRNAseq) data, they have been further exploited by RNA velocity tools, that infer the time derivative of the gene expression state of cells (La Manno et al., 2018;Bergen et al., 2020). In these approaches, s and u abundances are compared to their (estimated) equilibrium values. Intuitively, if a gene has a higher relative abundance of u reads than at steady-state, in the near future the s mRNA will increase because we expect a higher abundance of newly spliced mRNA, compared to the amount of s mRNA that is going to be degraded ( Figure 1). Therefore, gene expression (i.e., s) is currently increasing, and we can think of this gene as being up-regulated. Conversely, if the relative abundance of u reads is lower than at its equilibrium, in the near future, the amount of s mRNA will decrease, because the newly spliced mRNA will not fully compensate for the degraded mRNA ( Figure 1). In this case, gene expression is currently decreasing; hence, we can conclude that the gene is being down-regulated. Here, following a similar rationale, we aim at identifying differences in gene regulation between experimental conditions (e.g., treatments), by comparing the relative abundance of u reads, denoted by π U . In particular, for a given gene, if π U is higher in condition A than B, we speculate that the gene is being up-regulated in A, compared to B. Note that this is different from canonical differential gene expression tests, which focus on differences in the overall abundance of s reads. Instead, our goal is to identify differences in the direction that gene expression is currently undergoing. In particular, it was found that unspliced mRNA lags approximately 15 minutes behind spliced mRNA, and can be taken as a proxy for nascent transcription (Hendriks et al., 2014). Furthermore, changes in unspliced mRNA are thought to be indicative of changes in post-transriptional regulation (Gaidatzis et al., 2015a). Therefore, identifying variations in π U can provide valuable insight into gene regulation changes between conditions. From a technical point of view, RNA-seq data is characterized by a large degree of quantification uncertainty due to multi-mapping reads, i.e., reads compatible with multiple transcripts or genes (McDermaid et al., 2018;Dharshini et al., 2020). Furthermore, when analyzing splicing dynamics, we consider both splice versions of each gene/transcript; this doubles the number of transcripts (bulk data) or genes (singlecell data) in the reference, and increases even more mapping ambiguity. Two approaches, namely eisaR (Gaidatzis et al., 2015a) and BRIE2 (Huang and Sanguinetti, 2021), have been proposed to compare splicing dynamics between groups of samples from bulk and scRNA-seq data, respectively. The first approach uses the edgeR (Robinson et al., 2009) differential pipeline, based on a negative binomial distribution, where samples and groups are used as covariates for the mean parameter. The second method, instead, implements a Bayesian regression approach on percent spliced-in values, with samples and groups modelled as covariates; however, this approach was found to be extremely computationally intensive. Additionally, other tools, originally designed to detect differences in alternative splicing patterns, could also be employed to discover changes among s and u reads. Notably, DRIMSeq , satuRn (Gilis et al., 2022), SUPPA2 (Trincado et al., 2018), and DEXSeq (Anders et al., 2012) performed well in recent benchmarks (Love et al., 2018;Tiberi and Robinson, 2020;Gilis et al., 2022). In particular, DEXSeq, for the purpose of our analyses, could be applied to transcript estimated abundance (Love et al., 2018) (referred to throughout as DEXSeq_TECs), or to equivalence classes counts (Cmero et al., 2019) (denoted by DEXSeq_ECs), where equivalence classes (ECs) are collections of reads compatible with the same set of transcripts (including splicing status). Such ECs, and their multiplicities, are typically used to model the variability of multi-mapping reads. The majority of differential methods, in our case, eisaR, DRIMSeq, satuRn, SUPPA2 and DEXSeq_TECs, input estimated counts, and thus fail to account for the noise in those estimates. Conversely, DEXSeq_ECs avoids this issue by performing differential testing directly on equivalence classes. However, while this approach accounts for reads mapping between s and u versions of a transcript, it does not handle reads mapping to multiple transcripts, which are discarded, hence resulting in a loss of data. Moreover, while most methods presented above can test genes or transcripts directly, SUPPA2 and DEXSeq_ECs perform differential testing on exon junctions and ECs, respectively, which results in multiple statistical tests for each transcript, that are then aggregated to the transcript level. When applied to scRNA-seq data, BRIE2, DRIMSeq, satuRn, and DEXSeq_TECs can partially account for the quantification uncertainty, by treating separately ambiguously mapping reads (i.e., those mapping to both s and u versions of a transcript); such reads are denoted by a. However, the ambiguity in multi-gene mapping reads cannot be modelled. To overcome these challenges, we propose a Bayesian approach that accounts for the quantification uncertainty via a latent variable model, and allocates reads to their transcript or gene of origin, and corresponding splice version. Our approach also allows for sharing of information across samples, via a hierarchical framework, and genes, via informative priors. We designed a double analysis framework, based on two distinct ad hoc algorithms to analyze bulk and single-cell RNA-seq data, accounting for the specifics of each type of data. In particular, bulk protocols enable studying transcript-level signals across all cells, while single-cell data offer high cellular resolution but do not allow accurate transcript-level inference. Here, we take advantage of the information that each offers: our bulk approach targets changes at the transcript level (across all cells), while our single-cell method identifies cell-type specific changes (at the gene level), e.g., genes that are differential in a cell type but not in others. Below, we illustrate both approaches (Section 2), describe our benchmarks (Section 3), and discuss results (Section 4).

Model for bulk data
DifferentialRegulation takes as input the equivalence classes counts derived from RNA-seq reads, and recovers the overall abundance of each transcript. Assume that, for a given experimental condition, we collect RNA-seq data for N samples (i.e., biological replicates), with a total of T transcripts; we define by X iU the vector indicating the overall abundance of spliced and unspliced reads coming from transcript t in sample i, with t = 1, . . . , T , and i = 1, . . . , N . Our approach is built around two models. The first one is a multinomial model for the abundance of reads across the T transcripts: iU is the overall abundance (aggregated across both splice versions) of transcript t in sample i, and ρ (t) i indicates the relative abundance for the t-th transcript in the i-th sample, with The second model is a hierarchical beta-binomial distribution for the abundance of reads within the s and u versions of each transcript: for t = 1, . . . , T , and i = 1, . . . , N, where π (t) Si represents the relative abundance of spliced reads for transcript t in the i-th sample; and Beta(a, b) indicates the beta distribution with mean a a+b and variance ab (a+b) 2 (a+b+1) . Note that, for easier interpretation, the hyper-parameters can be reparametrized as δ U , usually referred to as the precision parameter, which indicates the sample-to-sample variability, andπ denoting the group-level relative abundance of s (or u) reads for transcript t, respectively. Note that we chose not to use a hierarchical prior for ρ for two main reasons: i) overall transcript abundances are easier to infer (hence the benefit of sharing information across samples is smaller), and ii) we wanted to limit the model complexity, and its computational cost.
Since the values of X and Y are not observed, they are treated as latent states and are sampled via a data augmentation approach. In particular, we allocate multi-mapping reads among the transcript(s) and respective splice version(s) they are compatible with. For instance, in sample i, consider a read compatible with the s version of transcript w, and the u version of transcript z; this read will be allocated to the former and latter cases with probability proportional to ρ U being the effective lengths of the s version of transcript w and the u version of transcript z, respectively. Normalizing for the transcript effective lengths ensures that the probability of allocating multi-mapping reads does not depend on how long transcripts are, and was previously found to improve model accuracy (Soneson et al., 2015;Tiberi and Robinson, 2020).

Model for single-cell data
Our framework for single-cell data is similar to the bulk approach, but presents four key differences, as summarized below. First, droplet scRNA-seq data have little resolution at the transcript level; therefore, analyses are performed on genes instead of transcripts. Second, cells are typically clustered (usually in cell types): when cell clusters are available, we separately analyze each cluster, and identify cluster-specific changes in regulation. Third, data refer to individual cells: after clustering them, we use a pseudo-bulk approach and, for each sample, compute the total s and u counts across all cells in a given cluster. Fourth, reads ambiguously mapping to both s and u versions of a gene (i.e., a reads) cannot easily be allocated to their splice version of origin. This is because the allocation step requires the estimated probability that an ambiguous read is spliced, which cannot be accurately computed, because it depends on unknown factors (Supplementary details). Therefore, we only use a latent variable approach for reads mapping to multiple genes; instead, a reads are treated separately from s and u. Consider one experimental condition and a single cell cluster, with scRNA-seq data available for N samples and G genes; we denote by X the vector with the overall abundance of spliced, unspliced and ambiguous reads (across all cells in the cluster) coming from gene g in sample i, with g = 1, . . . , G, and i = 1, . . . , N . Again, we use two models; the first one is a multinomial distribution for the abundance of reads across the G genes: iA is the overall abundance (across all splice versions) of gene g in sample i, and ρ (g) i is the relative abundance for gene g in the i-th sample, with G g=1 ρ The second model is a hierarchical Dirichlet-multinomial, which is a generalization of the beta-binomial model in (3)-(6), for the abundance of s, u and a reads within each gene: for t = 1, . . . , T , and i = 1, . . . , N, is the vector with the relative abundance of spliced, unspliced and ambiguous reads for gene g in the i-th sample. Again, from the hyper-parameters, we can obtain the precision parameter δ A , governing the sample-to-sample variability, and the group-level relative abundances of s, u and a reads for transcript t:π As before, the counts in X and Y are not observed and treated as latent variables, which are sampled based on ρ's and π's, by allocating reads to their gene of origin and respective splice version (i.e., s, u or a). As an example, assume that, in sample i, a read is compatible with the s version of gene w, the u version of gene z, and the a version of gene q; this read will be allocated to the one of three cases with probability proportional to ρ Ai , respectively. Note that, unlike in the bulk model, here we do not normalize for the effective lengths of genes; this is primarily due to two reasons. First, the effective length of genes is defined as a weighted average of the effective lengths of transcripts, weighted by transcript relative abundance, which is not known. Second, normalizing for the effective in the bulk model is based on the assumption that, given the same mRNA abundance, longer transcripts will produce more RNA-seq reads; however, this assumption is not valid in scRNA-seq protocols.

Parameter inference
In both models, our hierarchical framework allows sharing of information between samples; we further share information across transcripts (bulk model) and genes (single-cell model) via an empirical Bayes approach. In particular, our hyper-parameters δ, in (3) and (6), are initially estimated from a random selection of 1,000 genes/transcripts, via DRIMSeq ; these estimates are used to formulate informative priors for all hyper-parameters (Supplementary Details). Note that, our empirical Bayes approach is very mild because each gene/transcript contributes in a tiny fraction to the prior formulation. Additionally, we assume a weakly informative conjugate Dirichlet prior distribution for ρ, which results in a conjugate Dirichlet posterior distribution (Supplementary Details). Parameters are inferred via a Metropolis-within-Gibbs Markov chain Monte Carlo (MCMC) scheme, where we alternately sample from the conditional distributions of parameters, and latent states (Supplementary Details). Importantly, although our scheme involves many parameters, the vast majority of them are updated using a Gibbs sampler, which results in better mixing and convergence; only the hyper-parameters are sampled according to a Metropolis step, where values are proposed based on an adaptive random walk (Haario et al., 2001)

(Supplementary Details).
Since the sampling of the latent variables is the most computationally intensive step of our algorithms, we employ an undersampling scheme where latent variables are updated every 10 iterations (users can decrease this parameter). In our benchmarks, this led to a reduction of the runtime of our full pipeline of 74% and 35%, for the bulk and single-cell models, respectively. By default, the MCMC is run for 2,000 iterations, with a burn-in of 500 iterations (parameters can be increased by users). To ensure convergence, a Heidelberger and Welch (HW) stationarity test (Heidelberger and Welch, 1983) is performed on the marginal log-posterior density of the hyper-parameters. If the test fails, the burn-in is automatically increased up to half the chain length; if convergence is still not reached, a new chain is run, with double burn-in and number of iterations.

Comparing groups
Until now, we have shown how we infer model parameters, separately, in each group of samples; in what follows, we will describe how results across conditions are compared. To this aim, we introduce a new parameter,π U : in the bulk model, we setπ U =π U , while in the single-cell approach we also account for 50% of ambiguous reads, and defineπ U =π U + 0.5 * π A . Note that, for simplicity, we have dropped gene and transcript indices from the notation. Given two groups of samples, A and B, to identify differentially regulated genes/transcripts, we compareπ U between A and B, that we call AπU and BπU , respectively. We therefore define the probability that group B is up-regulated, compared to group A, as p = P r ( BπU > AπU ), which can be easily estimated from the posterior chains. In the Results Section, we used this probability to rank genes/transcripts for DifferentialRegulation; in particular, we rank them according to max(p, 1 − p). In other words, results with p close to 0 or 1 are ranked first (i.e.,π U differs between groups), while results with p ≃ 0.5 are ranked last (i.e.,π U similar across groups).
In the single-cell model, we acknowledge that settingπ U asπ U + 0.5 * π A is based on the arbitrary choice of equally assigning 50% of ambiguous reads to s and u. Therefore, we also provide an alternative way to rank genes, which does not require assigning ambiguous reads. In particular, we approximate the posterior distribution of the original (π S , π U ) with a bivariate normal around its posterior mode (Gelman et al., 1995), and perform a bivariate Wald test (Li et al., 1991) to verify if s and u proportions are equivalent across groups (Supplementary details); note that π A is not considered, because it is uniquely defined by π S and π U . Genes are then ranked based on the p-value of this test. Below, results based on the probability p are referred to as DifferentialRegulation, while those based on the Wald test are called DifferentialRegulation_Wald.

Simulation design
We designed several simulation studies to benchmark our method and competing approaches. To generate realistic simulations, we started with real datasets as anchor data. In the bulk simulation, which is an extension of the human simulation framework used in Soneson et al. (2016), we used a sample from Trapnell et al. (2013) (SRR493366) to infer the relative abundance of each transcript and splice version. Estimates of sample-to-sample variability were obtained using data from Cheung et al. (2010) and Pickrell et al. (2010), as previously described (Soneson and Delorenzi, 2013). We then used these parameters to simulate counts for each transcript (and splice version) for 6 samples, which were randomly separated in two groups. We randomly selected 2,000 transcripts as differentially regulated (DR); for each one, we inverted their s and u relative abundances in one of the two groups. In order to introduce quantification uncertainty in our simulation, we provided the vectors with desired transcripts per million values for each transcript to RSEM (Li and Dewey, 2011) to simulate reads, and then mapped these reads with salmon (Patro et al., 2017). We also generated two further simulations, where we added differential gene expression (DGE), with an average fold change of 3 between groups, or differential alternative splicing (DAS) as nuisance effects (Supplementary Details). In the single-cell simulation, we started from the mouse data from Park et al. (2018), consisting of four biological replicates; in this case, however, we did not simulate counts; instead, we used estimated spliced and unspliced counts directly. We annotated cell types via SingleR (Aran et al., 2019), and kept the three most abundant ones. As above, we separated samples in two groups, and introduced a DR effect, separately for each cell-type, in 20% of genes, by inverting s and u counts in one of the two groups (selected at random). We further generated a second simulation with DGE as confounding effect, again with an average fold change of 3 between groups (Supplementary Details). DAS was not simulated in this case, because it requires transcript-level resolution, which is not available in scRNA-seq protocols. In order to generate cell-type-specific changes, we randomly selected distinct differential genes in each cell type. To introduce quantification uncertainty, we provided the count matrices (generated above) to a read-level simulator, minnow (Sarkar et al., 2019), and aligned the simulated reads via alevin-fry (He et al., 2022). In both simulations, we performed basic filtering and analyzed genes/transcripts with at least 10 counts

Bulk simulation study
We benchmarked DifferentialRegulation against eisaR, which was developed to identify changes in splicing dynamics from bulk RNA-seq data, and various competitors that recently displayed good performance in detecting DAS from bulk RNA-seq data: DRIMSeq, satuRn, SUPPA2, and DEXSeq (Anders et al., 2012), which was used on both transcript estimated abundance (i.e., DEXSeq_TECs), and equivalence classes counts (i.e., DEXSeq_ECs). Figure 2 reports the receiving operating characteristic (ROC) curve for the three bulk simulations, and the number of false detections among the top-ranked transcripts, which is particularly relevant because top discoveries are usually selected for subsequent analyses by life scientists. In all simulated scenarios, DifferentialRegulation, eisaR and SUPPA2 display good performance, in terms of sensitivity, specificity, and false positive detections among top-ranked transcripts; however, of the three methods, SUPPA2 seems more affected by DGE and DAS confounding effects. DEXSeq_TECs and satuRn also perform well, yet with lower statistical power. Notably, DRIMSeq and DEXSeq_ECs display a low TPR, because they fail to analyze several transcripts, and return multiple NA's. These results are consistent with what was previously observed in Love et al. (2018), Tiberi and Robinson (2020), and Gilis et al. (2022). From a computational perspective, DifferentialRegulation is the most demanding method, which is unsurprising given the high cost of full MCMC algorithms involving latent states (Figure 3, left panel); nonetheless, the approach ran in about 1 hour on a single thread.
In addition, we investigated how overall transcript abundance affects performance, and stratified results into lowly, medium, and highly abundant transcripts, corresponding to the first, second, and third thirtile of abundance, respectively. In general, higher abundance corresponds to increased statistical power, and brings the performance of all methods closer; in all cases, the relative ranking of methods remains approximately stable ( Supplementary Figures 1-2).

Single-cell simulation study
In the single-cell simulation, we benchmarked DifferentialRegulation against BRIE2, and several approaches originally designed for bulk data: eisaR, DEXSeq_TECs, DRIMSeq, and satuRn. Except BRIE2, which uses single-cell observations, all methods worked with pseudo-bulk counts (i.e., aggregated counts across cells in a cluster). Here, DEXSeq_ECs and SUPPA2 were excluded because they could not be adapted to single-cell data: the former approach is bound to the output structure from salmon, which is a bulk pseudo-aligner, and the latter requires transcript-level abundances, while single-cell aligners (e.g., alevin-fry) return counts at the gene level. While eisaR was used on s and u reads, all remaining methods were run on s, u and a estimated counts, hence accounting for the uncertainty in ambiguous reads. Note that, however, only DifferentialRegulation accounts for the variability in reads mapping to multiple genes. In both simulations, DifferentialRegulation displays good sensitivity and specificity, although its ROC curve is mainly below those of eisaR and DEXSeq_TECs (Figure 4). Nonetheless, our approach has fewer false discoveries than competitors among top-ranked genes, particularly when introducing DGE as a nuissance effect. Our approaches based on the posterior probability p (DifferentialRegulation), and on a Wald test (DifferentialRegulation_Wald ) perform similarly, although the second one leads to (marginally) fewer false discoveries in top ranked genes ( Figure 4). Computationally, coherently with what previously observed, eisaR, satuRn and DEXSeq_TECs emerged as the fastest approaches, while DifferentialRegulation required significantly more time, yet approximately 35 times less than the other Bayesian approach, BRIE2 (Figure 3, right panel), despite BRIE2 using 6 times more cores than any other method. As in the bulk simulation, we also stratified results by overall gene expression, and found a consistent ranking of methods across abundance level; as expected, higher gene expression (i.e., more data) is associated with higher statistical power ( Supplementary Figures 3-4). Finally, note that, although BRIE2 can account for the sample-to-sample variability, we found that including sample information led to a major decrease in power (Supplementary Figure 5); therefore, in the main Figures, we considered the results obtained by fitting BRIE2 with group as the only covariate in the model.

Real data application
To compare methods on a real dataset, we considered the scRNA-seq data from Velasco et al. (2019), containing a total of 21 brain organoids from the human cerebral cortex, which were grown in vitro, for up to 6 months. Here we only considered a subset 6 brain organoids from the PGP1 stem cell line: 3 organoids were observed at three months of development, and 3 were collected at six months of development.
Comparing these two groups of samples should highlight changes that happen during brain development.
After  brain  only  3  3  0  0  0  0  cerebral  cortex  7  7  0  1  0  0  excitatory  neurons  2  1  0  0  0  0  PGP1  2  2  1  0  0  0  overall  14  13  1  1  0  0   Table 1: Number of interesting genes present among the top 200 results, of each cell-type, returned by every method. "DiffReg" refers to DifferentialRegulation; "brain only" denotes the 97 genes which were only detected in human brain; "cerebral cortex" indicates the 180 genes which display high expression in the human cerebral cortex, compared to other regions of the brain; "excitatory neurons" represents the 30 excitatory neurons; "PGP1" refers to the 3 genes associated to PGP1; "overall" gathers all 299 genes belonging to any of the previous lists.
(Supplementary Table 1). We applied differential methods and discovered differences, for each cell type, across development time points. We then used The Human Protein Atlas website (Pontén et al., 2008) to generate the following lists of potentially interesting genes: 97 genes which have only been detected in the human brain; 180 genes that displayed significantly higher abundance in the human cerebral cortex (i.e., the area the organoids are derived from), compared to other regions of the brain; 30 excitatory neurors, which play a key role in the development of the human brain cortes (Costa and Müller, 2015); 3 genes associates to the PGP1 cell line (i.e., the cell line used to generate the data). In absence of a ground truth, for each method, we investigated how often these genes appear in the top 200 ranked genes from each cell type. We found that, in all lists, DifferentialRegulation and DifferentialRegulation_Wald top discoveries contained significantly more potentially interesting genes than competitors (Table 1). This finding is coherent with the fact that, in the simulation studies, our approach, among its top ranked results, led to fewer false positives, than other methods. Note that BRIE2 was excluded from this analysis because, given the size of this dataset, it failed to return results within a week.

Discussion
We have introduced DifferentialRegulation, a Bayesian hierarchical approach to discover differentially regulated genes and transcripts across conditions, by detecting changes in the relative abundance of unspliced reads, which indicate differences in the future mRNA production. Our method works with both bulk and single-cell RNA-seq data, and is based on two distinct models to adapt to the peculiar aspects of the data being analyzed. Similarly, the outputs of the two frameworks differ, and take advantage of the information that each data type provides: in bulk data, we target transcript-level changes (across all cells), while in single-cell data, we aim at cluster (e.g., cell-type) specific changes, yet at the gene level. Importantly, RNA-seq data is typically characterized by a high degree of quantification uncertainty: we account for it via a latent variable approach where reads are allocated to their gene/transcript or origin, and to the corresponding splice version. Starting from real data as anchor data, we designed several benchmarks for bulk and single-cell RNA-seq data, and compared DifferentialRegulation to state-of-the-art tools, our method displays good sensitivity and specificity, and shows fewer false discoveries than competitors among top ranked genes, that are usually chosen by biologists for further investigations. Additionally, our approach appears to be robust with respect to nuisance effects, such as differential gene expression and differential alternative splicing, and shows good performance even in lowly abundant genes/transcripts. We also performed a real data analysis, where our method can detect (among its top ranked genes) more potentially interesting genes than alternative approaches.
We distributed DifferentialRegulation, open-access, as an R package via the Bioconductor project, which facilitates its integration with other bioinformatics tools and pipelines; furthermore, we provided an example usage vignette, and a plotting function, that simplifies the visualization of results. Finally, we would like to acknowledge some limitations of our framework. First, our method is amongst the most computationally demanding tools we tested, although clever coding techniques (such as undersampling, and C++ coding) enabled us to run our approach in our benchmarks in approximately 40-60 minutes using a single core. Furthermore, note that, our single-cell approach can also benefit from parallel coding, which can be particularly useful in large datasets. Second, covariates, such as batch effects, are not modelled; nonetheless, such nuisance effects usually affect overall gene abundance, while our framework focuses on relative abundance, and (as shown) is robust to DGE changes; therefore, such covariates are unlikely to impact results.

Availability
DifferentialRegulation is freely available as a Bioconductor R package at: https://bioconductor.org/ packages/DifferentialRegulation. The scripts used to run all analyses are available on GitHub (https: //github.com/SimoneTiberi/DifferentialRegulation_manuscript). All R scripts were run with R version 4.3.0, and Bioconductor packages from release 3.16. Raw data (fastq files) for the sample used to seed the bulk simulation are available from https://www.ebi.ac.uk/ena/browser/view/SRR493366. The dataset used as anchor for the single-cell simulation can be downloaded from Gene Expression Omnibus (accession number GSE107585). The data used in the real data analysis is available at the Gene Expression Omnibus (accession number GSE129519), as well as and at the Single Cell Portal (https://portals.broadinstitute.org/single_cell/study/reproducible-brain-organoids).