Interpretation of network-based integration from multi-omics longitudinal data

Abstract Multi-omics integration is key to fully understand complex biological processes in an holistic manner. Furthermore, multi-omics combined with new longitudinal experimental design can unreveal dynamic relationships between omics layers and identify key players or interactions in system development or complex phenotypes. However, integration methods have to address various experimental designs and do not guarantee interpretable biological results. The new challenge of multi-omics integration is to solve interpretation and unlock the hidden knowledge within the multi-omics data. In this paper, we go beyond integration and propose a generic approach to face the interpretation problem. From multi-omics longitudinal data, this approach builds and explores hybrid multi-omics networks composed of both inferred and known relationships within and between omics layers. With smart node labelling and propagation analysis, this approach predicts regulation mechanisms and multi-omics functional modules. We applied the method on 3 case studies with various multi-omics designs and identified new multi-layer interactions involved in key biological functions that could not be revealed with single omics analysis. Moreover, we highlighted interplay in the kinetics that could help identify novel biological mechanisms. This method is available as an R package netOmics to readily suit any application.


INTRODUCTION
Cost reductions of DNA sequencing in addition to other high-throughput multi-omics technologies have revolutionized many research fields ranging from personalized medicine (1,2) to systems biology (3,4). These innovations have led to new biological insights and a better understand-ing of living organisms (5)(6)(7). Thus, enabling the assessment of most biological layers, this democratisation of highthroughput technologies has created large datasets representing different biomolecules that necessitate specific processing and statistical methods (8,9). Multi-omics trials typically collect different types of biomolecules (mRNA, proteins, metabolites, etc.) from the same biological samples with the ultimate goal of highlighting the interaction between biological layers that could be responsible for causing complex phenotype or diseases (2). Even more so most biological phenomenon involves complex interactions between layers that vary through time (10). Adapted multiomics time-course methods to integrate and accurately capture interactions among those biological layers are thus now required of and fully capture interactions within and between omic layers.
To describe complex interactions and regulatory mechanisms behind biological systems, mathematical models, such as network, are built to interpret and reverse-engineer cellular functions. Networks are used to represent all relevant interactions taking place in a biological systems (11). In networks, molecules (genes, proteins, metabolites) are reduced to a series of nodes that are connected to each other by edges. Edges represent the pairwise relationships, interactions, between two molecules within the same network. Molecular networks have become extremely popular and have been used in every area of biology to model for example transcriptional regulation mechanisms, physical protein-protein interactions (12,13), or metabolic reactions (14). Networks come with valuable properties and useful topological features such as degree distribution to identify highly connected nodes or shortest paths which determine proximity between two nodes. On a different scale, network modularity defines sub-network units with highly connected nodes in respect to the rest of the network. These sub-networks, also known as modules, often share a similar function. Thus, the 'guilt by association' property assumes that known or unknown highly connected molecules should be functionally related (15).
Inference methods for network construction are often applied to a single omic layer to identify interaction between molecules. However, this does not directly elucidate interaction across multiple omic layers (16). To connect these layers, a first approach require prior knowledge of across omic molecular networks such as publicly-available databases (17). This approach is based on a legacy limited to model organisms and may not reflect the current biological condition. A second method use multivariate data-driven methods that statistically infers correlations between molecules based on multi-omics data. However, this approach may have many possible solutions. A combination of the two methods could improve multi-omics network construction.
The ultimate goal of multi-omics networks is to connect phenotypes to biological mechanisms and their regulators. Analysis of the interactome identifies direct neighbors and modules linked to a phenotype. However, direct neighbors can lead to false discoveries. Phenotypes and molecules may be linked by irrelevant interactions. Furthermore, our knowledge of the interactome is not complete and we can miss true interactions or interactions with more distant molecules (18). Based on the work of Page et al. (19), propagation algorithms recently became the state-of-the-art to investigate gene disease associations and also gene function prediction (18)(19)(20). From the known association, the signal is iteratively propagated through the network. When a steady state is reached, new nodes can be added to the initial association by their propagation score reflecting their proximity to the starting nodes. It thus highlights potential new phenotype-related targets. New advances in randoms walks algorithms allow to propagate the signal in heterogeneous multi-layered networks which improves association prediction (21).
In this paper, we propose to build hybrid multi-omics networks from longitudinal multi-omics data in order to facilitate the interpretation of multi-layers systems ( Figure 1). This methodology is based in the first place on the modelling and clustering of expression profiles with similar behaviours over time. It relies on both accurate network reconstruction methods and knowledge-based reviewed interactions between either molecules of the same or different types. Finally, a random walk algorithm was used to identify and make new hypothesis about links between omics molecules and key biological functions or mechanisms. The main objectives of this method is to provide a versatile framework for multi-omics network-based integration but also to provide interpretation guidelines to explore these networks to further highlight key intra-omics and interomics mechanisms and interactions. We illustrate this approach through three case studies. These studies have different experimental designs with different omics data types, timepoints and organisms to demonstrate that the proposed approach is able to deal with a wide range of situations.

MATERIALS AND METHODS
This approach proposes pre-processing, modelling and clustering steps for multi-omics longitudinal data. It mainly emphasizes on network-based integration and multi-layered network exploration ( Figure 2) using network propagation algorithms in order to provide new biologi- Figure 1. Overview of the proposed approach. (A) Description of the experimental design: the same biological material is sampled at several time points across several omic layers indicated in different colors. Each omic data is normalised using both platform-specific and time-specific normalisation steps. (B) Multi-Omics Network is built using both inferencebased and knowledge-based methods to connect intra-and cross-layered biological features or molecules (mRNA, proteins, metabolites). Measured molecules are clustered into groups of similar expression profiles over time and corresponding nodes formed kinetic sub-networks. Overrepresentation analysis is performed to add an extra layer of functional annotation. Propagation analysis is performed on specific nodes of interest, called seeds (biological function, gene, protein, metabolite, etc.) to identify closely related molecules. cal insights. We developed the R package netOmics which wraps the method proposed below. It was developed to simplify and reproduce the integration and interpretation steps. It provides documentation and practical guidelines to build and explore (longitudinal) multi-omics networks.

Multi-Omics longitudinal design
We define longitudinal multi-omics designs as follow. From the same biological sample, omics data are produced (RNA, proteins, etc.) at different timepoints. Raw data are processed to get (NxP) abundance tables by omics data type with samples in rows and molecules or biological features (RNA, proteins, metabolites, etc.) in columns. We call these tables blocks. In this framework, there is no need to have matching timepoints between blocks because we use a modelling step to interpolate missing timepoints and even out uneven designs.

Pre-processing of longitudinal multi-omics data
We assume each omics data is a raw count table resulting from bioinformatics quantification pipelines (22,23). Low counts are filtered and data are normalized according to PAGE 3 OF 21 Nucleic Acids Research, 2022, Vol. 50, No. 5 e27  the type of data in each table. We also applied a filter on time profiles and kept only molecules with the highest expression fold change between the lowest and highest point over the entire time course, as described in (24). For each case study, we adapted these filters to take into account platform-specific dynamic range of values.

timeOmics: modelling and clustering of longitudinal multiomics data
The timeOmics approach (24) was used to cluster multiomics molecules with similar expression profiles over time. The framework is based on two main steps: (1) The first step uses a Linear Mixed Model Spline framework (25) to model every molecule over the time-course by taking into account the inter-individual variation. This framework tests different models and assigns the best model to each molecule according to a goodness of fit test. One of its benefits is to allow interpolation of missing timepoints and thus accommodate non-regular experimental designs with missing data.
(2) The second step clusters the modelled expression profiles in groups of similar expressions over time. This is performed using various multivariate projection-based methods implemented in mixOmics (26). With a 3blocks omic design, we used multi-block Projection on Latent Structures (block PLS) to cluster time profiles from multi-omics datasets. Optimal number of clusters is determined by maximising the average silhouette coefficient.
The objective of these preliminary steps is to summarise each molecule with an expression pattern. This tag will then be used in the multi-omics network reconstruction to build cluster-specific sub-networks.

Network reconstruction
In order to build a multi-omics network, we started by building a map for each layer (genes, proteins, metabolites, etc) using a combination of both data-driven and knowledge-driven building methods. The method used is specific to the type of data and are described below. As mentioned in section 2.3, we kept the clustering information by e27 Nucleic Acids Research, 2022, Vol. 50, No. 5 PAGE 4 OF 21 building several sub-networks per kinetic cluster. We also built an entire network without the cluster labels.
Data-driven network reconstruction. For gene expression data, we relied on Gene Regulatory Network inference. This class of method tries to reverse engineer complex regulatory mechanisms in the organisms and infer relationship between genes. ARACNe (27) is a co-expression-based inference algorithm which identify most likely TF-targeted genes interaction by estimating mutual information, a similarity distance between pairs of transcript expression profile. We used ARACNe algorithm on gene expression profile to infer potential TF-targeted genes interaction from the gene expression dataset.
Knowledge-driven network. Some kind of interactions cannot be revealed by inference methods and has to be experimentally determined (e.g. protein-protein interactions, ChIP). We then relied on reviewed interactions found in specialized databases to connect molecules of the same type and also get cross-layered interactions (binding, enzyme, regulation) From the measured molecules in the dataset, we collected all possible interactions through targeted databases. To maximize cross-layered connectivity, we also included non-measured proteins or metabolites which were directly connected to measured molecules.

Protein interactions.
Physical or functional proteinprotein interaction (PPI) is one kind of interaction that is difficult to predict and PPI network inference algorithms for MS data are still in their infancy (28). For human proteomic data, we relied on the BioGRID database (29) which records >1.8 million proteins and genetics interactions from major model organisms. This database collects experimentally determined physical protein-protein interactions and also connects transcriptomics and proteomics layers with regulatory relationship (TF-gene interactions). For other model organisations, we can rely on more specialised or custom databases (30).

Metabolite interactions.
We used the KEGG Pathway database (31) which records a collection of manually drawn metabolic pathways representing molecular interaction, reaction and relation networks for human and other model organisms. We used KEGG to link metabolite compounds involved in the same reactions. We also connect metabolites to genes and/or proteins if they are involved in the same biochemical enzymatic reaction thanks to KEGG Orthology database that links genes to high-level functions.
The objective of this building step is to provide an entire multi-omics network composed of three main layers and several sub-networks specific to kinetic clusters. The next steps will focus on the analysis of these multi-omics networks.

Enrichment analysis
Over representation analysis (ORA) helps to find enriched and meaningful biological insights from interacting biomolecules. This task was achieved using gProfiler2 (32), first on each kinetic cluster and then on all molecules from the entire network. We focused on the three Gene Ontology (GO) terms: Biological Process (BP), Molecular Function (MF) and Cellular Component (CC). P-values were corrected with gProfiler2 custom multiple testing correction algorithm (g:SCS) (33) and only significant terms were considered (g: SCS < 0.05). Size and significance of P-values distributions were compared between both clusters and entire network approaches. We also used Fisher's combined probability test (34) for multi-omics P-values comparison.

Random walk
As described in (21), in an undirected graph G = (V, E), the random walk (RW) starts from a node (v 0 ), called seed, and simulate a particle that randomly moves from one node v t to another v t + 1 following the probability distribution: ∀x, y ∈ V, ∀t ∈ N where d(x) is the degree of the node x in the graph G. Valdeolivas et al. (21) also added the possibility to restart at the initial node to avoid dead ends in multi-layered networks. When a steady state is reached, the algorithm gives a probability score to each node of the network which represents the proximity of that node and the seed. We then used the R package RandomWalkRestartMH (21) to apply random walk with restart algorithm on multiomics network with three main purposes to guide interpretation. (i) RW can be used to identify multi-omics nodes and their interactions linked to mechanisms of interest (e.g. GO:BP). Therefore, a GO term node can be turned into a seed and RW can be performed from that starting point. Then, a sub-network with the top 25 closest nodes to that seed can be built. Naively, all significant GO term nodes were iteratively turned into seeds. We then screened subnetworks containing different types of molecules to highlight the multi-omics aspects of the integration. We applied this analysis on both kinetic cluster sub-networks and entire network. (ii) RW can be used for nodes function prediction. Similar as above, unlabelled nodes can be turned into seeds. We relied on gProfiler2 annotations to identify nodes without any known functions. For an unlabelled seed, a list of ranked nodes was produced and the closest GO term node was assigned to that seed. We repeated this for the three GO ontologies (BP, MF, CC) on the entire network. (iii) Combined to kinetic clusters, RW can locate regulatory mechanisms and find interacting nodes with different expression profiles in the entire network. Once again, each node can be turned into seed and sub-networks were built using the 10 closest nodes. Then we screened sub-networks with different cluster labels from the seeds that might reveal underlying regulatory mechanisms.

Data
In the following section, two published multi-omics case studies are presented. These applications have longitudinal multi-omics designs, but each block was analysed separately. We modelled these dataset with multi-layer networks to highlight the multi-omics interactions. Specific analysis PAGE 5 OF 21 Nucleic Acids Research, 2022, Vol. 50, No. 5 e27 steps for each example are described, along with specific databases used for each case study.
Case study 1: HeLa cell cycling study. Understanding the complex relationship between gene expression, translation product and protein abundance is the key to decrypt biological mechanisms. Genes undergo several steps of regulation before turning into proteins including transcription, translation, folding, post-translational modification and eventually degradation (35) studied the poor correlation between mRNA and related protein levels during cell cycling regulation using triplicate measurement of mRNA expression (microarray), translation product (PUNCH-P) and protein quantification (MS) from synchronized HeLa S3 cells. Authors sampled cells during phases G1, S and G2 ( Figure 3).
The authors were able to find clusters of patterns in the expression of genes and their products related to key functions of the cell cycle. These functions were up-or down-regulated at different stages. For example, Cell division, Cytokinesis, Spindle, Chromosome segregation and Microtubule based movement biological processes shared similar patterns and were down-regulated in G1/S and upregulated in S/G2 transitions. Interestingly, multi-omics showed that a gene and its derivatives can have different expression patterns during a given time course, which highlights underlying molecular mechanisms, such as mRNA or protein degradation.
In this first case study, we intended to highlight the multiomics interactions involved in the control of HeLa cell cycle during G1, S and G2 phases.
To do so, we filtered the RMA-normalized mRNA with a 2-Fold-Change filter. iBAQ normalized translatome products and proteins were filtered with a different 3-Fold-Change threshold because of the differences in platformspecific dynamic ranges.
We modelled the expression of every molecule with LMMS including the variations of the three replicates for each of the three timepoints. We built multi-omics clusters of expression profiles based on the direction of variation between each step. mRNA and proteins networks were built with ARACNe and BioGrid interaction databases, respectively. Protein coding and TF regulated information were used to connect these layers where UniProtID were converted into Gene Symbols (https://www.uniprot. org/uploadlists/) to ensure proper matching. We also included metabolite reactions from KEGG connected to protein enzymes to metabolites. We performed ORA with mRNA, translatome products and proteins against GO:BP, GO:MF and GO:CC terms for both clusters and entire set of molecules. Finally, we performed RW on this multi-omics network.
Case study 2: dynamic maize responses to aphid feeding. Maize (Zea mays) is one of the most productive cereal crops in the world. However, the plant is subject to numerous biotic attacks caused by herbivorous insects and it is therefore critical to understand the maize defense mechanisms in order to improve its productivity. Aviner et al. (35) studied the dynamic of maize response to aphid feeding and they found that mutants in benzoxazinoid biosynthesis and terpene synthases genes do affect aphid proliferation.
To measure gene expression changes over time, authors exposed five two-weeks maize plants (B73) to corn leaf aphid (Rhopalosiphum maidis) during four days (Figure 7). They also include five control maize plants with the same growing conditions minus the exposure to aphids. During this time course, they sampled the five exposed and five controls plants at six timepoints (i.e. 2, 4, 8, 24, 48, 96 h) and conducted gene expression profiling with RNA-seq, LC-TOF-MS nontargeted metabolite quantification as well as amino-acids, phospholipids and terpenes targeted metabolite quantification.
In the paper, the authors focused on the genes and metabolites involved in the maize response to aphids. We intended to go a little further by addressing the complex regulatory relationships that may exist between these multiomic actors over time.
For this example, we focused on the first five timepoints (i.e. 2, 4, 8, 24, 48 h). We discarded genes which were not differentially expressed between exposed and control groups and those having an expression difference <2-fold over the entire time course. We split TF-coding genes from other transcripts to get a 3-omics-like experimental design. Finally, we discarded nontargeted metabolite as they were not annotated and performed longitudinal clustering with multiple block PLS. For network reconstruction, we used ARACNe on genes and TF. We used the Protein-Protein Interaction database for Maize (PPIM) (36) to identify protein-coding genes and extracted direct neighbor interactions. PPIM contains more than 2.7 millions interactions between protein, TF and gene interactions. These interactions are either predicted or experimentally determined from public databases such as UniProt (37), Bi-oGrid (29), DIP (38), IntAct (39) and MINT (40) or using to text mining. We mainly focused on validated interactions and predicted ones qualified as high-confidence interactions that represents 155 845 interactions with top 5% highest decision scores (36). Then we connected the measured metabolites and those involved in the same reaction to the genes/proteins using KEGG. We performed ORA with genes, TF and proteins on GO:BP, GO:MF and GO:CC terms. Finally, we performed RW on this multi-omics network.
Case study 3: diabetes seasonal study. Diabetes is the seventh leading cause of death in the world according to the WHO (41) and its prevalence is constantly increasing (42). The role of the integrative Human Microbiome Project (iHMP) is to study the impact of host-microbiome relationships on diabetes mechanisms of appearance and progression (43).
In the study from Integrative (44), 105 subjects with diabetes (Insulin Sensitive and Insulin Resistant) were followed over a period of more than 4 years. Each patient was sampled every 3 months and every 3-4 days during stress periods. This resulted in an average of 27 samples per patient. At each visit, 51 clinical tests were performed. From the blood samples, transcriptomics, proteomics, metabolomics, cytological and microbiome data (oral and gut) were produced.
Thanks to this complex design, Sailani et al. (45) identified several molecules linked to diabetes and important biological changes related to annual seasonality. They found  Table 2. (C) Multi-omics network layout: represents the connection between entities and their different types of interactions. The network was composed of four layers: a gene layer build from mRNA expression, a PPI layer build from measured proteins and BioGRID known interactions, a metabolite layer from KEGG pathways and GO term layer from enrichment analysis. different shift in expression in all omics molecules and they group them into two main expression profile clusters. We expect that network integration will provide a deeper understanding of the diabetes process and the role of hostmicrobiome interaction in that disease.
Since microbiome data is highly variable we decided to perform the modelling and integration on only one individual. We also reduced the time period to one year (7 timepoints) for the same reason. We performed the integration of transcriptomics, proteomics, metabolomics, cytokines, gut microbiota data and clinical variables. We applied a Fold-Change filter for each omics block, except the clinical variables, with a specific threshold. We modelled the data with LMMS and used a multiple block PLS to cluster the data. We applied the sparse multi-block PLS to identify a key signature per cluster. For network reconstruction, we applied the ARACNe algorithm (27) separately on transcriptomics, unlabelled metabolomics and clinical variables. We used the BioGRID interaction database (29) to build a PPI network from proteins and cytokines and used the information of TTRUST (46) and TF2DNA (47) to connect these molecules to the transcriptomics layer. We applied the SparCC algorithm (| | ≥ 0.3) to build a microbiome network as recommended in (48). With limited knowledge about host-microbiota interactions, microbiota network were connected to each of the layers listed above by computing the spearman rank correlation on the expression data and kept only the highest interactions between microbiota and other layer (| | ≥ 0.99). We followed the same procedure to connect the unlabelled metabolomics data and clinical variables to the other layers. Lastly, we performed an enrichment analysis with the transcriptomics, proteomics and cytokines molecules against Gene Ontology and we added the significant GO terms as a GO layer by connecting the significant GO terms to the corresponding RNA, proteins or cytokines. In addition, we performed a gene-related disease enrichment analysis with MedlineRanker (49) a data mining tool, which searches for publication abstracts in which genes were linked to diseases (Med-lineRanker parameters: Min number of citations = 5; min number of genes significantly associated with a disease = 2; FDR:0.05). Like GO terms, disease terms were added to the network and connected to the related genes. Finally, we performed RW on this multi-omics network and the cluster specific sub-networks.

Case study 1: HeLa cell cycling study
In this example, we studied the HeLa cell cycling from Aviner et al. (35). This dataset was composed of three omic layers and three timepoints.
Pre-processing, modelling and clustering of time profile. HeLa cell dataset was assembled into a single table focused on proteins. After missing value removal, this dataset was composed of 6785 mRNA, 4102 translation products and Time profile clustering. Once all the molecules were modelled over time and noisy profiles were removed, remaining expression profiles were clustered according to their differential expression value between two timepoints. This clustering resulted in 4 clusters ( Table 2) with a silhouette coefficient of s sil = 0.75. We compared this clustering with the timeOmics approach with four clusters for similar parameters. timeOmics resulted in a lower silhouette coefficient (s sil = 0.63). In the first approach, Cluster 1 included the largest number of molecules (n = 4443). It is characterized by a decrease between the G1 and S phases and then an increase between S and G2/M. This seemed to be the main kinetic pattern for proteins since it contained the majority of them (95%). Cluster 2 (n = 1444), included the largest amount of translation products (54%) and it was characterized by an increase in expression from the first to the last step. Cluster 3 (n = 958) showed an opposite pattern compared to the cluster 2 with a decrease across the overall time course. Finally Cluster 4 (n = 156) included the least number of molecules and no protein appeared to follow an increase and decrease pattern.
Multi-layered network reconstruction. The first layer to be reconstructed was the gene inference network from the mRNA. We used the ARACNe algorithm to build a network by kinetic cluster but also to build a entire network composed of all the mRNA. Supplementary Table S1 shows statistics about the sub-networks such as the number of connected / disconnected nodes and edges. The second layer was the PPI network. Proteins were connected to each other using the BioGRID interaction database. As for the genes, PPI sub-networks were built for each kinetic cluster but we also built a entire PPI network composed of all the proteins. In addition, we included Bi-oGRID proteins which were directly connected to the measured ones (first degree neighbours). Number of nodes and edges are detailed in Supplementary Table S1. These first two layers were combined thanks to two types of links. First, protein-coding information linked 126 genes to their corresponding proteins. Second, TF-regulated information from TF2DNA (47) and TTRUST databases linked 57 proteins to 403 genes (16 846 interactions). In addition, KEGG pathway database was also used to link protein enzymes to metabolite reactions. This new layer was composed of 1595 metabolites connected to 2213 proteins (12 694 interactions).  Table S2. With the largest number of molecules, proteins gathered the most significant GO terms. Table 3 details the number of enriched GO terms by omic types. Enriched terms were splited by ontologies. mRNA collected the fewest enriched terms in the three ontologies, followed by translation products and proteins. The first degree neighbours (extended proteins) collected the most terms, as they have the highest number of nodes in the network.
In Figure 4, we compared the number of unique GO terms and the distribution of P-value between the global or cluster approach. With the exception of translation products which had more enriched term in clusters than the entire set, both approaches seemed to have a similar number of unique GO terms but some terms DNA replication (GO:0006260), mitotic cell cycle (GO:0000278) were shared between clusters. In addition the clustering approach gave much more significant terms than the global approach with the extended proteins. There was no significant difference in the P-value distributions however the clustering approach results in smaller p-values. In the Figure 4C   Finally, significant GO terms were added in the multilayer network as another new layer and were connected to genes, proteins and metabolites. This last building step resulted in a multi-omics network composed of four layers. Table 4 describe network composition with the different type of node for each layer.

Random walk.
Molecules involved in specific mechanisms. The first purpose of RW applied to multi-omics network was to iden-tify molecules involved in a specific mechanisms. Naively, each of the 2279 significantly enriched GO terms, (union of GO terms found in cluster and the entire network), can be a mechanism or a function of interest. Each of these GO terms was then turned into a seed. RW from each seed were used to generate sub-networks the 25 top closest nodes. In order to focus on multi-omics integration, we only focused on seeds that reached both genes and proteins. In a second step, we also included seeds that reached both genes, proteins and metabolites. Table  5 describe the number of GO terms (BP, MF, CC) where these conditions were met. 16 unique terms were linked to metabolite nodes (12 BP, 4 MF). Some terms were shared between clusters. These terms encompassed functions related to cell development such as histone lysine methylation GO:0034968 or regulation of mitotic nuclear division (GO:0007088). In Figure 5, we illustrate in detail the multiomics interactions leading to the activation of this last biological process (GO:0007088). More general terms were also found: positive regulation of cellular metabolic process (GO:0031325), mRNA catabolic process (GO:0006402). Some MF terms seemed also related to cell development anaphase-promoting complex binding (GO:0010997),

PAGE 9 OF 21
Nucleic Acids Research, 2022, Vol. 50, No. 5 e27 Figure 5. HeLa cell cycling study: Random walk with restart result from the seed GO:0007088 . The top 25 results are displayed. From the seed, the algorithm is able to associate genes, proteins and go terms. Thanks to the addition of knowledge-based layers, it can also access other proteins and metabolites which were not produced in the original study (Ext. neighbours). S-adenosylmethionine-dependent methyltransferase activity (GO:0008757). We also found such terms as response to ionizing radiation (GO:0010212) which could be an artifact of mass-spectrometry analysis. On the other hand, besides the term histone methyltransferase activity (H3-K27 specific) (GO:0046976) other shared terms seemed to be rather generic : methyltransferase activity (GO:0008168) or regulation of signal transduction (GO:0009966).
Function prediction. The second purpose of RW was to predict the function of unannotated nodes (genes, proteins). In this perspective, 6 genes and 24 proteins were unannotated according to gProfiler. Each of these nodes was then turned into a seed. We then assigned to each node the closest GO term. Results of function prediction annotation are detailed in the Supplementary  files. All the nodes representing a measured molecule were transformed into a seed. From the 2362 seeds, 438 were connected to molecules from other clusters We kept 47 seeds that can also reach GO term nodes. These results are detailed in Supplementary Table S3. We identified many biological processes terms related to the cell development. For example, the node labelled PIMREG was a protein coding gene belonging to Cluster 1. With RW, it was linked to Cell Division (GO:0051301), Cell Cycle (GO:0007049) and other GO:BP terms. PIMREG is a mitotic regulator which takes part in the control of metaphase-to-anaphase transition. It was linked to SETD1B (Cluster 2), an histone methyltransferase that tri-methylates histone H3 at Lysine 4 and contributes to epigenetic control of chromatin structure and gene expression. This is an active marker of upregulated transcription and this may explain why the expression of the PIMREG gene is increasing between phases S and G2/M. In Figure 6PIMREG is also directly connected to the CDK5RAP2 gene, which has the same expression profile as PIMREG and plays a role in the formation and stability of microtubules, essential to the cell development.

Case study 2: dynamic maize responses to aphid feeding
Pre-processing, modelling and clustering of time profile. The maize dataset included a gene expression assay composed of 41 716 transcripts identified using the B73 maize reference genome (AGPv3.20), with 2254 transcripts identified as TF-coding genes and a targeted metabolite assay with 45 molecules measured. 1606 maize genes, 123 TF and all metabolites were differentially expressed between aphid and control groups for at least 1 time points. 331 genes, 36 TF and 29 metabolites were filtered after the 2 Fold-Change filtering step. Once all the molecules were modelled and noisy profiles were removed 1208 genes, 31 TF and 29 metabolites remained in the exposed group (Table 6).
Time profile clustering. Next step was the clustering with timeOmics using genes, TF-coding genes and targeted metabolites. This resulted in optimally 4 clusters (s sil = 0.82). Table 7 gives the cluster composition. Clusters 1 and 2 recorded most of the molecules (42%; 40%). These were mainly composed of linear models representing an increase (cluster 1) or a decrease (cluster 2) in the time course. Cluster 3 (7%) was characterised by a rapid decrease followed by a slight increase from t = 24 h. Cluster 4 (11%) showed the opposite of cluster 3 with a rapid increase followed by a relatively steady state.
Multi-layered network reconstruction. The first layer to be reconstructed was the gene inference network from the mRNA and TF using ARACNe. We used unmodelled filtered data from aphid and control samples to build this first layer. As the first case study, we build sub-network for each cinetic cluster and a global network composed of all the mRNA/TF. Networks composition is detailed in Supplementary Table S4. The second layer was the PPI network. Filtered PPIM database was used to build protein-protein interaction network from measured protein-coding genes. We also included proteins which were directly connected to the measured protein coding genes (first degree neighbours). Gene kinetic cluster sub-networks and global network were extended with PPI interactions.
The third network contained metabolic reactions. Metabolites converted to KEGG Compound ID were linked to each other with KEGG Pathways. Metabolites involved in the same metabolic reactions were also included in this layer. As for genes and proteins, kinetic cluster  sub-networks and entire network were extended to connect genes and proteins to metabolites. Composition of multi-omics networks is detailed in Table 8.
Enrichment analysis. ORA was performed on genes and proteins against GO:BP, GO:MF and GO:CC. This approach was achieved both for each cluster and combined network. We first performed ORA on measured molecules only and with PPIM's first-degree neighbours. All the results of ORA analysis are detailed in Supplementary Table S5 and Table 9 summarise the number of unique GO terms by ontology and cluster. Cluster 1 with more genes gathered the largest number of significant terms. Terms specific to this cluster were molecular functions related to nucleotide bindings (GO:0035639, GO:0032555, GO:0017076, GO:0032553). Other clusters grouped terms related to bindings or kinase activity (GO:0043546, GO:0043424). Figure 8 represents the number of significant enriched GO terms and their p-value distribution. Measured molecules tended to produce more significant terms when ORA was performed on the entire set of genes rather than with the clustering. Nevertheless, with PPIM's first-degree neighbours, there were more significant term in the clusters than in the entire set. Neighbours also gave smaller p-values. Lastly, ORA identified unique terms per cluster, not found in the entire set, and vice versa. Therefore, we found the terms mentioned for Cluster 1. Concerning the unique terms in the entire set, they included terms corresponding to defence response to other organism (GO:0098542) or other biological processes that can be linked to defence mechanisms such as lignin metabolic process (GO:0009808), interspecies interaction between organisms (GO:0044419), cellular response to chemical stimulus (GO:0070887) or transmembrane transport (GO:0055085). Stranger, defense response to oomycetes (GO:0002229) was also significantly enriched in the entire set.
Random walk. Like the HeLa cell cycling study example, RW was used on the multi-omics network with three purposes: Identify multi-omics molecules linked to mechanisms (GO term nodes as seed), function prediction (unla-     stress including salinity, drought, heavy metals, pesticides, ultraviolet radiation and temperature stress (50). In the entire network, this node reached the metabolites L-Tyrosine (C00082) and L-Phenylalanine (C00079) which were both aromatic amino acids measured in the initial study. In addition, it reached non-measured metabolites such as trans-Cinnamate (C00423) and 4-Coumarate (C00811) which were connected to Tyrosine and Phenylalanine respectively. These connections were enabled by the L-phenylalanine ammonia-lyase and L-tyrosine ammonia-lyase reactions in which both measured and non-measured compounds were involved. In this same sub-network, we obtained several isoenzymes involved in these reactions such as pal1,5,6,9. Interestingly, several gene models without functional annotation are connected to the network (GRMZM2G063679, GRMZM2G087259). RW from Phenypropanoid metabolic process on clusters 2 and 4 networks also returned Tyrosine and Tryptophan since they were classified in clusters 2 and 4 respectively and were involved in the same metabolic reaction. It also reached genes connected to similar pathways in both clusters: dhurrin biosynthesis acting as a plant defense compound and also suberin monomers, cuticular wax and cutin biosynthesis for plant development. Specificaly to cluster 4, one protein (GRMZM2G164036) was linked to DIBOA-glucoside biosynthesis, a benzoxazinoids dervived from indole that is a volatile compound used in parasitic defence (51).

Function prediction.
Unlabelled gene nodes (n = 96) were turned into seeds and the closest GO terms were assigned to these nodes. 347 unlabelled nodes were disconnected from the network therefore it was not possible to apply the algorithm for these ones. These results are detailed in Supplementary Table S6. Many seeds were connected to various biosynthesis process such as cinnamic acid (GO:0009800), ceramide (GO:0046513), galactolipid (GO:0019375). We found unlabelled genes linked to defense response processes (GO:0006952) or jasmonic acid metabolic process (GO:0009694) which is an hormone that can be produce during biotic or abiotic stress. Other nodes were connected to various molecular functions such as hydrolase, chitinase, methyltransferase or bindings.
Cluster intersection. All measured nodes were turned into seeds and sub-neworks were build with the to 10 highest scored nodes from the entire network. 678 seeds were able to reach both nodes with different assigned kinetic cluster and GO term nodes. Results are detailed in Supplementary Table S6. We found molecular functions corresponding to different enzymatic activities (ATP binding, kinase, oxidoreductase ). Less frequently, we observed a toxin activity (GO:0090729) for the rip2 gene (GRMZM2G119705), also connected to defense response (GO:0006952). In Figure 9, gene AC209374.4 FG002, assigned to cluster 2, were indirectly connected to nodes from both clusters 1, 3 and 4. They were linked to DNA-binding transcription factor activity (GO:0003700). This regulatory mechanisms may be explained by the opposite expression profile as illustrated in Figure 9B.

Case study 3: diabetes seasonal study
Pre-processing, modelling and clustering of time profile. The diabetes seasonal study dataset ( Figure 10  The modelled time profiles from the 6 blocks were clustered using the multi-block PLS and resulted in two clusters (s sil = 0.26). The first cluster (Cluster 1) was composed of 293 molecules and characterised by an increase until the middle of the year (t = 75) then a decrease. The second cluster (Cluster 2) was composed of 730 molecules and had an opposite pattern to Cluster 1 with a decrease until the middle of the year then an increase. We used the sparse multiblock PLS to identify a signature of the cluster and the number of selected molecules by sparse cluster is displayed in Table 10.
Multi-layered network reconstruction. The first layer of the multi-omics network was the gene inference network from mRNA expression using ARACNe algorithm. As the first 2 case studies, we build sub-networks with cluster-specific mRNA and a general network with all the molecules. We also applied ARACNe to build networks of clinical variables and unlabelled metabolites.
As in the HeLa cell cycling case study, we build a PPI network with the BioGRID (29) interaction database with proteins and cytokines. This layer was connected to the mRNA layer using TFome information described above (TF2DNA (47), TTRUST (46)).
Specific to this case study, we build a microbiome correlation network from gut OTU using SparCC algorithm and cluster-specific sub-networks. We connected this layer to the other layer by computing the Spearman rank correlation and kept only the highest correlation between gut OTU and other molecules (| | ≥ 0.99). With a correlation above the threshold, 58 OTU were linked to five RNA, one metabolite and two clinical variables (AG, EGFR) with a total of 132 interactions.
Unlike the other two case studies, the metabolites were not annotated, therefore we could not use the KEGG reaction database to connect this layer to genes and proteins. Instead, we applied the same method as the microbiome layer to connect the metabolite layer to the network. The 105 metabolites were linked to 11 mRNA, 8 OTU and 3 clinical variables (EGFR, LYM, ALKP), with a total of 231 interactions. Finally, we applied the same method for the clinical variable layer. The 39 clinical variables were linked to 775 mRNA, 16 cytokines, 30 proteins, 105 metabolites and 58 OTU, with a total of 1026 interactions.
The network composition is detailed in Table 11.
Enrichment analysis. ORA was performed on mRNA, proteins and cytokines data against Gene Ontology (BP, MF and CC). It was performed on both cluster-specific terms and the combined list of molecules. Unsurprisingly, with more molecules than Cluster 1, Cluster 2 gathered more significant GO terms (Table 12, Supplementary Table S7). Figure 11 represents the number of significant enriched GO terms and their P-value distribution. Measured molecules tended to produce more significant terms when ORA was performed on the entire set of genes rather than with the clustering. In Cluster 2, we found a lot of GO terms linked to cell organization and biogenesis such as cellular component organization or biogenesis GO:0071840, regulation of organelle organization GO:0033043, positive regulation of neurogenesis GO:0050769, regulation of organelle organization GO:0033043, cellular component biogenesis GO:0044085, membrane organization GO:0061024, actin cytoskeleton organization GO:0030036, cell-cell junction GO:0005911. These enriched terms suggest important changes in the mechanical properties of tissues such as skin.
In winter, cold and low humidity increase skin permeability and epidermal thickening (45,52). This is consistent with the kinetic of cluster 2 which displayed a decrease in expression until in the winter and spring seasons and an increase in summer. In Cluster 1, we found more generic terms linked to cell activities such as translational initiation (GO:0006413), biological regulation (GO:0065007) or positive regulation of cellular process (GO:0048522). We also found the term viral infection (GO:0016032) which can also be explained by the seasonal nature of the data and can be increased in patients with diabetes (53). In addition to GO enrichment analysis, we performed a disease-related gene enrichment analysis using MedlineRanker in this case study (49). We found significantly enriched diseases such as Hemoglobinopathies (P-value = 1.7 10e-3) or Thalassemia (P-value = 3.3 10e-3) a bloodrelated genetic disorder. People with this hemoglobinopathy are likely to develop autoimmunity, insulin resistance and disease such as type 1 or 2 diabetes (54). We also found Renal Tubular Acidosis (P-value = 2.10e-2) (Supplementary Table S7) which is also known to be a diabetes-related disease (55,56).
Finally, we used this enrichment information to add two more layers to the network. Each GO or disease term was linked to the corresponding RNA, protein or cytokines used for the enrichment analysis. The final network was composed of 8 layers of molecules/terms and the final composition of the entire network and the kinetic cluster specific subnetworks is available in Table 11.
Random walk. In this case study, random walk was used to identify multi-omics interaction from diseases and biological functions (GO terms). It was also used to identify molecules between kinetic clusters. Unlike the other case studies, any gene or protein nodes were unannotated and we did not predict any function.
To identify molecules linked to biological mechanisms, the random walk starting points were significant GO terms and Disease nodes. After the random walk from each of those seeds, we ranked the nodes and extracted the top 15 first nodes related to the seed. We then had a list of 274 initial seeds (96 GO, 178 diseases). X seeds were able to reach at least two omic layers within the 15 closest nodes and the seeds were linked to a maximum of three different layers. For example, the disease node labelled Renal Tubular Acidosis was linked to one cytokine, three proteins and 12 RNA despite that this disease was only related to two known RNA (SLC4A1, CA2) in this network (Figure 12). In addition, in this sub-network, we detected the RNA MEIS1 which was found in the signature. The gene is a protein-coding gene from the family of homeobox genes (HOX). It is associated with restless legs syndrome (57). We did not find any relevant literature about a potential link between this gene and diabetes mellitus but associations between restless leg syndrome and diabetes mellitus are well documented (58)(59)(60).
RWR was also applied to identify seeds linked to both kinetic clusters. The disease node Renal Tubular Acidosis was connected to molecules which belong to both 2 kinetic clusters. We therefore have a clear demonstration of the benefits of building and annotating longitudinal multi-omics networks for the identification of potential mechanisms and revealing new interactions in diabetes mellitus.

DISCUSSION
Improvements and cost reductions in high-throughput omics technologies have resulted in the emergence of increasingly complex experimental designs that combine both multi-omics data and longitudinal sampling from the same biological samples. These complex designs are implemented to provide a comprehensive view of biological systems in order to identify complex relationships between omics layers and answer various biological questions such as disease subtyping, biomarker discovery, models and predictions (2,(61)(62)(63). Appropriate integration of the multiple omics profiled must be performed to fully take advantage of the interactions between omics and their interdependencies. The challenge of multi-omics integration is to offer interpretation guidelines adapted to the multiplicity and heterogeneity of designs (data types, number of conditions, samples, timepoints, etc.). In order to identify cellular mechanisms with multi-omics regulation, we propose to use a combination of both data-driven and knowledge-driven integration methods from longitudinal multi-omic data to highlight interactions not only between molecules of the same biological type but also between omic layers linked to cellular mechanisms.

Multi-omics network integration in literature
Multi-omics networks are the key to illustrate topological relationships between different molecular species (64). The HetioNet network is currently one of the most impressive multi-omics networks. Composed of 18 different types of nodes (genes, SNPs, proteins, compounds, tissues, diseases) and 19 different types of edges between these layers (65). Its main objective is to provide a natural representation of the human system hierarchy and to propose genetic-disease association prediction. Although very comprehensive, some relationships evolve in time and the addition of these dynamic interactions as well as specific expression data should be very valuable to study complex disease mechanisms. On a smaller scale, MetPriCNet (66) is a 3level network composed of genes, metabolites, diseases and associations between each cross-layer pair. Its objective is to prioritize candidate metabolites for known diseases or phenotypes. Although very useful for therapeutic target prediction, these networks should be connected to expression data to further address the needs of personalized medicine. In another paradigm, OmicsNet (67) offers to build customized multi-omics networks from a list of molecules of interest PAGE 17 OF 21 Nucleic Acids Research, 2022, Vol. 50, No. 5 e27  (genes, proteins, metabolites, TF, miRNA). It does not provide association prediction but the connection between the nodes is ensured by several databases (STRING, miRBase, KEGG, ENCODE, etc.). Its main advantage is to propose in a web browser a user friendly 3D representation of the resulting network. Like other networks, it is specifically designed for the human organism and still too few integration networks target other model organisms.
Like OmicsNet, MiBioMics (68) also offers a web service and an easy-to-use user interface. The method offers the user to directly upload their data, and performs pre-processing and multi-omics integration with multiple co-inertia analysis. It also offers network inference using WGCNA (69). Although the interface is dynamic, network exploration is limited to module detection.
A recent contribution to multi-omics networks with time series data was proposed in (70). From time series expression data, the authors defined time events with Minardo-Model. These events have significant shifts in expression on a specific time window. They identified groups of molecules with similar time events. The analysis of the successive time events resulted in an expected causal directed graph between the molecules. This was applied on a multiomics dataset (phosphoproteomics, transcriptomics, and proteomics) where each block was analyzed independently and the results were combined during the final event ordering step. Complementary to our approach and if the experimental design allows it (Kaur et al. (70) used 7+ evenly spaced timepoints), one can easily imagine that the order of events can be embedded into another layer of the graph and provide unique interpretation results.

Interpretation guidelines for multi-omics networks: advantages and limitations
Ranging from expression modelling over time, multivariate clustering to network reconstruction, in the following section we discuss the pros and cons of the choices made at each stage of this methodology.
Data requirement and experimental design. Each experimental design is different, in terms of type of data, numbers of replicates, conditions and timepoints. As illustrated in the different case study, the framework can handle a large variety of data. Special care must be taken in designing the study. To improve the integration, interpretation and reduce false positive discoveries, one should study events with large variation over the same time period and when the variation between omics blocks is expected to happen on the same time scale (71). Normalisation can also impact the net- work building process, we recommend to using gold standard normalisation and preprocessing approaches appropriate for each omic layer. Longitudinal data and Multivariate clustering: our previous timeOmics framework (modeling and clustering step) was designed to identify highly associated temporal profiles between multiple and heterogeneous datasets. In order to maximise the method performance, we made recommendations about experimental design, including the number of data blocks (3) and evenly spaced time points (5-10) (24). In addition, the interpolation procedure which infer missing timepoints for uneven designs may have a large impact on the clustering result as demonstrated in (24). We also used cases when they were numerous timepoints or high inter-individual variability. A filtering step was implemented to increase interpretability, based on highly variable profiles within the time period above a set fold-change. This naive threshold can be refined with the addition of condition specific differential expression or other models according to the experimental design (25,45).
Network and kinetic clusters. In this framework, we have proposed to use multi-layer networks to help with the interpretation and the visualization of multi-omics interactions. Although we built a global gene network, we also used the inference method on each of the clusters separately. We first tried to extract cluster-specific sub-networks from the general network. The second approach is clustering first, and then building a network per cluster. We kept the second approach because it gave a higher number of connected nodes and edges than the network-first approach. Furthermore, with the addition of knowledge data for which we do not have temporal profiles, some nodes will be shared between clusters.
Network inference. We relied on ARACNe algorithm for gene inference network. More mathematically advanced alternatives exist (72) recently applied Gaussian graphical model with graphical lasso to infer gene interactions in 15 specific types of human cancer using RNA-Seq expression data from The Cancer Genome Atlas. Nevertheless, ARACNe is a well-cited, reliable coexpression-based inference method and it provides an easy to use implementation. As mentioned by (73), the inference problem is NPcomplete and only the combination of the results of several methods might help uncover true interactions. Moreover, of omics data should influence the choice of the inference method. With microbiome abundance data, co-occurrence network inference is often applied to identify synergism or competition relationships between species and their environment (74,75). In terms of multi-omics interaction, the recent tool OmicsAnalyst (76) proposes an interesting improvement. This method used two filters to control correlation strengths for intra-omics and for inter-omics interactions.
Knowledge-based interaction. The proposed hybrid multiomics networks were composed of expression data and knowledge links. Therefore, known links can be limited by incomplete databases. In the first case study, we merged Bi-oGRID, TTRUST and TF2DNA to connect proteins to genes. However, more specialised databases could also be added to include more links. In the second case study, since maize is not as studied as humans, the PPIM database covers all known to date interactions for this organism and some unknown interactions may be missing. In addition, these links represent potential interactions that may append in the studies but not measured interactions and these networks are referred to as maps. An additional integration step would consist in constructing a dynamic signal flow where the directionality and intensity of the interactions would be represented in the form of flows or pathways. Unfortunately, this type of representation requires a time-series design using precise multiple doses of stimulation under the same condition (16). On the contrary, the use of erroneous databases can lead to false discoveries and thus we believe that only knowledge databases with trusted interactions should be used. Alternatively, overlapping interaction between several databases can be selected to improve predictions (77).
Multi-omics interpretation. The use of clusters enables kinetic-specific enrichment analyses to be performed. These results can be compared with those obtained from the entire network. Both approaches tended to provide the same number of significant terms are instead complementary. It was possible to identify terms unique to each approach and terms only found in a particular cluster. Addition of non-measured molecules in ORA provided more significant terms with lower p-values. However, these extra molecules were highly shared between cluster or were linked to similar functions. Thus it did not bring cluster specific enrichment. The sparse multi-block PLS allows the identification of a signature of kinetic clusters. This signature is often composed of a small list of molecules and does not always allow a positive biological enrichment. However, the use of this sparse signature adds additional information on the network and allows the identification of nodes of interest. A good interpretation strategy is to use these signature molecules as seeds in the propagation analysis or if any seed can reach these nodes of interest.
Propagation analysis. Although random walk propagation analysis is the state of the art for association prediction, its application on heterogeneous multi-layered networks is still in its early stages. Multi-layer network describes a network composed of several layers. The layers contain the same nodes but the edges connecting these nodes are different. For example, a gene network may have coexpression links on a first layer and PPI interactions on a second layer.
In addition, to depict the relationships between different species of molecules, it is recommended to use heterogeneous networks with a bipartite relationship that describe all possible interaction between two sets of nodes. With more links and more possibilities to fine-tune the transition between layers, more complex network structures would improve the prediction of the random walk (21). As complex biological networks have several types of molecules and several types of interactions, it becomes essential to use heterogeneous multi-layer networks to describe all the relations of the system. However, tools to build such networks are limited by the number of layers or the num-ber of heterogeneous relations. Therefore, we were encouraged to use monoplex networks for each application. Although monoplex networks highlighted relevant interactions, comparison with heterogeneous multi-layer networks would have been worthwhile. Finally, merging data from different sources must be performed with care. Depending on the source of the data, the quality of the interactions can be heterogeneous and can lead to wrong predictions. To overcome this issue, it would be possible to use weighted edges according to the level of confidence granted to the inference algorithms or the given databases. However, using data with very heterogeneous quality scores can result in the highlighting of the same interactions and potential discoveries can be missed. It is recommended to mix inference approaches and databases to improve interaction predictions (78,79).

Contribution of network-based integration on public datasets
The reanalysis of multi-omics data with new methods can lead to a better understanding of the system. Both cases proposed in this paper are no exception. With the study of HeLa cells during the cell cycle, the authors were able to highlight functional groups, expressed by proteins, which may be over or under-regulated at different phases of the cycle. Combined with the other omics layers, they were able to compare the dynamics of an mRNA, its translation product and the associated protein and thus highlight the regulation mechanism during the cycle. Nevertheless, our methodology, based on network-based multi-omics integration could, in addition to answering the same questions, offer the advantage of going even further.
From a key function (Cell division, Chromosome segmentation, Telomere maintenance) it is possible to predict the molecules involved but also to know the events, interactions, regulation, which have allowed its expression. Moreover, by adding other molecular layers (e.g. metabolites) it is possible to observe the implications of this function at the system level.
The second case examining the response of maize to an aphid attack highlighted the dynamics of certain genes involved in the defense mechanisms, associated with the evolution of known metabolites. Although the initial analyses identified genes and metabolites that are highly variable with the presence of aphids, the authors did not take advantage of all the multi-layer interactions. Indeed, with a very targeted list of metabolites, having roles in defense mechanisms but also in other pathways, it is difficult to identify relevant multi-omic interactions. By adding other layers but also directly connected molecules, we have been able to considerably increase the diversity of the interactions that could be detected and thus increase the chances of identifying other potential targets to better understand maize defense mechanisms against aphids and possibly orient the development of future biological pesticides.
Finally with the third case study, we studied Type 2 diabetes with a more complex experimental design than the other case study with microbiome and clinical variables. This case study investigates the seasonality of omics expressions throughout the year. Aside from the GO terms, we enriched the network with a disease layer. This layer adds informative interactions regarding possible diabetes-related complications as well as the prediction of key genes in these mechanisms.
The widespread use of high-throughput technologies has enabled the profiling of multiple omics layers across multiple time points. This has opened the door to new study designs that can now address questions and identify novel biological mechanisms regulated by different biomolecular layers. However, one of the biggest challenges in this era of multi-omics is the integration and interpretation of the diverse large-scale omics data in a way that provides new and more complete biological insights.
To facilitate the interpretation of longitudinal multiomics data we have proposed an analytic strategy that consists of multi-omics kinetic clustering and multi-layer network. The use of multi-layer networks can evolve and be enriched by the addition of new layers (for example miRNA, disease, drugs, environmental variables, microbiome) and their respective cross-layered interactions (gene-disease association, host-microbe interaction, miRNA-mRNA regulation) and also intra-layered interactions that could defined different kinds of interactions for the same layer (coexpression, functional relationship, expression delays).
The analysis of three multi-omics longitudinal studies in three previously published datasets demonstrates that new multi-layered interactions can be unravelled. This approach can be applied to many fields such as pharmacology or dermocosmetics where the addition of information and the exploration of multi-omics networks can lead to the discovery of new therapeutic targets. Eventually, and with highly controlled experimental designs, we expect that dynamic networks can be built to model an entire biological system. Therefore, this methodology will open new avenues for exploration and interpretation from multi-omic studies.

DATA AVAILABILITY
The netOmics R package is available at https://github.com/ abodein/netOmics. HeLa Cell raw data for case study 1 can be found in (35). Maize raw data for case study 2 can be found in (80). Diabetes raw data for the case study 3 can be found in (45). Processed data and in-house scripts are available in a Github public repository: https://github.com/ abodein/netOmics-case-studies.