MultiRNAflow: integrated analysis of temporal RNA-seq data with multiple biological conditions

Abstract Motivation The dynamic transcriptional mechanisms that govern eukaryotic cell function can now be analyzed by RNA sequencing. However, the packages currently available for the analysis of raw sequencing data do not provide automatic analysis of complex experimental designs with multiple biological conditions and multiple analysis time-points. Results The MultiRNAflow suite combines several packages in a unified framework allowing exploratory and supervised statistical analyses of temporal data for multiple biological conditions. Availability and implementation The R package MultiRNAflow is freely available on Bioconductor (https://bioconductor.org/packages/MultiRNAflow/), and the latest version of the source code is available on a GitHub repository (https://github.com/loubator/MultiRNAflow).


Introduction
In this supplementary material, we illustrate all the output graphs of the R package MultiRNAflow with the dataset associated to the article of Weger et al. (2021).The function DATAnormalization() uses the vst method (Anders and Huber, 2010) from the R package DESeq2 (Love et al., 2014).The PCAanalysis() function also returns the same graphs describe above but without lines.An option also allows to plot all previous 3D PCA graphs in a rgl window allowing to interactively rotate and zoom.The function PCAanalysis() uses the R package FactoMineR (Lê et al., 2008).The HCPCanalysis() function also returns the same 3D PCA graph as in Figure S3.C in a rgl window allowing to interactively rotate and zoom.The function HCPCanalysis() uses the R package FactoMineR (Lê et al., 2008).The function MFUZZanalysis() uses the R package Mfuzz (Kumar and Futschik, 2007).3 Supervised statistical analysis of the transcriptional response

Temporal statistical analysis
Figure S1: The function DATAnormalization() of MultiRNAflow returns a boxplot showing the distribution of the normalized expression (here using the vst method) of genes for each sample.The color of the boxplots is different for different biological conditions.The x-labels give biological conditions, time and replicate numbers separated by dots of each sample.

Figure S2 :
Figure S2: The PCA is realized with the R function PCAanalysis() of Mul-tiRNAflow.When the experimental design contains several time points and several biological conditions, the function returns : one 3D PCA graph (A) and one 2D PCA graph (B) where samples are colored with different colors for different biological conditions.Furthermore, lines are drawn between each pair of consecutive points for each sample.The function also returns one 3D PCA graph (C) and one 2D PCA graph (D) for each biological condition (here BmKo only), where samples are colored with different colors for different time points.Furthermore, lines are drawn between each pair of consecutive points for each sample.
Figure S3: The R function HCPCanalysis() of MultiRNAflow performs a hierarchical clustering on principal components (HCPC).The function returns: a dendrogram (A); a graph (B) showing for each sample, its cluster (fisrt column), the associated time points (second column) and biological condition (third column) using a color code; one 3D PCA graph (C); one 2D PCA graph (D) where colors correspond to HCPC clusters.

Figure S4 :
Figure S4: Soft clustering of genes is realized with the R function MFUZZanalysis() of MultiRNAflow in order to detect the most common temporal behavior among all genes for each biological condition.The figure corresponds to the results of the soft clustering for the biological condition BmKo.

Figure S5 :
Figure S5: The function DATAplotExpressionGenes() of MultiRNAflow allows to plot for each biological condition: the temporal evolution of the four replicates of the expression of a selection of genes, and the corresponding mean and standard deviation.Colors are different for different biological conditions.Here, the temporal expression pattern of gene ENSMUSG00000031770 is shown and that gene belongs to the cluster 3 of the results of the soft clustering for the biological condition BmKo.

Figure S6 :
Figure S6: For each biological condition, the function returns: an alluvial diagram (A) of DE genes at least at one time, a graph (B) showing the number of DE genes as a function of time for each temporal group, a Venn barplot (C) showing the number of DE genes belonging to each DE temporal pattern.By temporal group, we mean the sets of genes which are first DE at the same time.By temporal pattern, we mean the set of times t i such that the gene is DE between t i and the reference time t 0 .The function DEanalysisGlobal() also returns: an alluvial diagram (D) for DE genes at least at one time for each biological condition and a barplot (E) showing the number of DE genes up-regulated and down-regulated for each time and biological condition.

Figure S8 :Figure S9 :Figure S10 :Figure S11 :
FigureS8: The function DEanalysisGlobal() also returns plots combining the temporal and biological condition analyses: a barplot (A) showing the number of DE genes and signature genes for each time and biological condition and an alluvial graph (B) for DE genes wich are signature at least at one time for each biological condition.A gene is called signature of a given biological condition at a time t i , if the gene is both i) statistically DE at time t i and thus participating in the temporal transcriptional response of this biological condition, and ii) specific of this biological condition at time t i .This set of genes (DE between t i and t 0 and specific for this biological condition at time t i ) constitutes the transcriptional signature of this biological condition.A barplot (C) showing the number of genes which are DE at least at one time, specific at least at one time and signature at least at one time for each biological condition For each time point, the function DEanalysisGlobal() returns a Venn barplot (A) which gives the number of genes for each possible intersection.We say that a set of pairs of biological conditions forms an intersection if there is at least one gene which is DE for each of these pairs of biological conditions, but not for the others.The function also returns a barplot (B) showing the number of specific genes per biological condition for each time and an alluvial diagram (C) of genes which are specific at least at one time for each biological condition.A gene is called specific of a given biological condition at a time t i , if the gene is DE between this condition and any other conditions at time t i , but not DE between any pair of other biological conditions at time t i .