Dipanka Tanu Sarmah, Nandadulal Bairagi, Samrat Chatterjee, Tracing the footsteps of autophagy in computational biology, Briefings in Bioinformatics, Volume 22, Issue 4, July 2021, bbaa286, https://doi.org/10.1093/bib/bbaa286
Abstract
Autophagy plays a crucial role in maintaining cellular homeostasis through the degradation of unwanted materials like damaged mitochondria and misfolded proteins. However, the contribution of autophagy toward a healthy cell environment is not only limited to the cleaning process. It also assists in protein synthesis when the system lacks the amino acids’ inflow from the extracellular environment due to diet consumptions. Reduction in the autophagy process is associated with diseases like cancer, diabetes, non-alcoholic steatohepatitis, etc., while uncontrolled autophagy may facilitate cell death. We need a better understanding of the autophagy processes and their regulatory mechanisms at various levels (molecules, cells, tissues). This demands a thorough understanding of the system with the help of mathematical and computational tools. The present review illuminates how systems biology approaches are being used for the study of the autophagy process. A comprehensive insight is provided on the application of computational methods involving mathematical modeling and network analysis in the autophagy process. Various mathematical models based on the system of differential equations for studying autophagy are covered here. We have also highlighted the significance of network analysis and machine learning in capturing the core regulatory machinery governing the autophagy process. We explored the available autophagic databases and related resources along with their attributes that are useful in investigating autophagy through computational methods. We conclude the article addressing the potential future perspective in this area, which might provide a more in-depth insight into the dynamics of autophagy.
Background
The importance of autophagy in maintaining cellular homeostasis can hardly be overstated. In Greek, the term autophagy means self-eating. The term was coined by Christian de Duve, and the nomenclature was based solely on the observed degradation of mitochondria and other intracellular structures within the lysosomes of rat liver perfused with glucagon [1]. Degradation of unwanted materials like damaged mitochondria and misfolded proteins is essential for the maintenance of cellular homeostasis. Autophagy engulfs these unwanted materials and facilitates the process of cellular filtration. However, the contribution of autophagy toward a healthy cell environment is not limited to the cleaning process. The degradation of proteins in the lysosome yields amino acids, which support protein synthesis even when diet fails to supply an inflow of amino acids from the extracellular environment [2]. Therefore, under stressed conditions like starvation, autophagy is rapidly upregulated and initiates cellular degradation to provide nutrients to the cell [3–6].
The continual discovery of new autophagy-related genes (ATG) and pathways has made autophagy a growing field. Deterioration of this catabolic process is associated with various diseases (Figure 1). Autophagy is Janus-faced in cancer, acting both as a tumor suppressor and as a tumor promoter. Autophagy-related proteins are associated with the prevention of cancer cell growth in various cancers, including colon, gastric, breast and prostate cancer [7–11]. However, autophagy also helps in tumorigenesis by promoting cancer cell proliferation and tumor growth [12–14]. Again, abnormal lipid metabolism and the excessive accumulation of triglycerides stored in lipid droplets trigger non-alcoholic fatty liver disease, which may eventually lead to non-alcoholic steatohepatitis (NASH) [15,16]. In vitro and in vivo studies have revealed that autophagy plays a protective role in NASH by selective degradation of these lipid droplets [17]. Hence, the autophagy pathway can be a potential target in the treatment of NASH. In various neurodegenerative diseases, including Parkinson’s disease, Alzheimer’s disease and amyotrophic lateral sclerosis [18], misfolded protein accumulation is considered a pathological hallmark. Since the accumulation of misfolded proteins is directly affected by a decrease in the neuronal autophagy level, autophagy is considered a target pathway in neurodegenerative diseases. The importance of autophagy also extends to insulin resistance and type 2 diabetes, as it plays an indispensable role in the physiology of beta cells. Autophagy takes part in the regulation of insulin homeostasis and is necessary for normal beta cell homeostasis [19–21]. Disrupted autophagic activity has been reported in the beta cells of type 2 diabetes mellitus (T2DM) patients [22].
Metformin has been widely used in type 2 diabetes clinical therapy and protects pancreas beta cells from injury through autophagy activation by the AMP-activated protein kinase (AMPK) pathway [23]. Due to its crucial role in cellular housekeeping, autophagy also plays a role in anti-aging mechanisms [24,25]. It also plays an essential role in cell remodeling during development [26] and in cellular defense against pathogens [27].

Figure 1. The process of autophagy is completed in five steps: initiation, elongation, autophagosome formation, fusion and degradation. Each step is monitored and modulated by a group of genes called ATG. Autophagy-related diseases can be categorized into two parts: organ-specific (shown in the inner circle) and multisystemic (shown in the outer circle). This figure shows the autophagy-disease interplay. The association with most human diseases, including varieties of cancer and immune disorders, has proved that autophagy is a quintessential process, and its manipulation can be targeted as a therapeutic strategy.
Nevertheless, despite playing a protective role in various diseases, uncontrolled autophagy may lead to excessive degradation of the cellular constituents and may cause cell death [28–31]. Hence, although important, the autophagy process needs to be strictly monitored for the smooth functioning of the cellular homeostasis [32,33]. To understand the significance of autophagy in cellular processes and its association with different diseases, we recommend reading two excellent reviews on autophagy [3,34].
There are three defined types of autophagy, viz. macro-autophagy, micro-autophagy and chaperone-mediated autophagy [34]. In this study, by autophagy, we mean macro-autophagy only (Figure 1). In macro-autophagy, the degradable constituent is first enclosed by a double-membrane vesicle, which gradually extends to form an autophagosome. The autophagosome then fuses with a lysosome and forms an autolysosome, where the degradation of the cytosolic cargo occurs. The mammalian target of rapamycin complex 1 (mTORC1) and the lysosome play a crucial role in macro-autophagy. A nutrient-rich environment within the cell enables mTORC1 to be located at the lysosome, where it promotes growth processes. Transcription factor EB (TFEB) is a protein that dictates the transcription levels of lysosomal and autophagy genes. mTORC1 suppresses autophagy by modulating the nuclear export of TFEB [35] and by inhibiting the autophagy initiation complex [36]. This suppressive effect of mTORC1 ceases when starvation triggers it to dissociate from the lysosome, which leads to autophagy induction. The process of autophagosome–lysosome fusion, i.e. the formation of autolysosomes, is facilitated by several proteins. Cooperative activity of soluble N-ethylmaleimide-sensitive factor attachment protein receptor (SNARE) proteins embedded in either membrane helps to overcome the high energy barrier of membrane fusion [37]. Again, during fusion, the two vesicles must be kept close, for which the HOPS complex, PLEKHM1 and EPG5 simultaneously interact with proteins present on both the autophagosomal membrane and autolysosomal membrane [38–40]. Once the autolysosome forms, the inner autophagosomal membrane degrades, and more than 60 lysosomal hydrolases work simultaneously to digest the confined material [41].
This evolutionarily conserved process has always been an important research topic. A PubMed survey using the keyword ‘autophag*’ in the Title category revealed that research interest in autophagy has increased exponentially in the last few years. Although the literature dates back to 1956, the field flourished only after 2004–05. The autophagic mechanism is highly sensitive to perturbations of the intracellular or extracellular microenvironment. An increase or decrease in its level may play a lethal or a survival role in biological functions. Hence, a better knowledge of the autophagy pathway and its role is essential for improving future therapy and might contribute to recognizing alternative cell death and cell survival mechanisms in the presence or absence of some apoptotic pathways.
Biological processes are governed by specific rules, and systems biology acts as a hatchet to unveil these underlying principles [42,43]. Every system possesses a hierarchical structure and a systematic study of it helps to find how components are organized, viz. what lies in the core, and what remains on the periphery of the system. Again, these structures are interlinked together, where each lower level in the hierarchy creates the level immediately above (for example, cell to tissue, tissue to organ, organ to the organ system and so on) by means of some linkages. Systems biology is nothing but the study of both these structures and the linkages. By implementing various algorithms to analyze a disease network, the core modulators of the disease can be detected, and the dynamics of these core sets of modulators can be studied through mathematical modeling. The enrichment analysis can help to identify the core pathways involved in a disease in the protein–protein interaction (PPI) network. In other words, mathematical modeling helps to study the system in parts while system biological approaches (network analysis and enrichment analysis) help to explore the entire system. Various systems biology methods have been applied to delineate the process of autophagy (we have considered macro-autophagy only) using mathematical modeling and network analysis. Mathematical models have been combined with experimental studies to decipher the complex dynamics of macro autophagy. An application of mathematical modeling is to study the effect of starvation-induced autophagy in cell (yeast) population dynamics [44]. Again, these models study the dynamics of the proteins that govern and modulate macro-autophagy at different phases. 
In various diseases like cancer, Alzheimer’s, etc., the core set of proteins or pathways has been studied mathematically to capture the mechanism of disease progression and to extract the influential parameters that restore cell homeostasis in disease conditions [45–51]. Network analysis approaches have been applied to observe the topological behavior of autophagy-related proteins in various diseases. Network analysis of differentially expressed genes (DEG) in multiple diseases such as leukemia, pancreatic cancer, etc. has shown the abundance of autophagy-related proteins and autophagic pathways in the disease [52,53]. Again, network centrality measures such as degree, betweenness centrality, clustering, etc. help to identify the core set of proteins in the network [54,55]. Although these two approaches are often studied separately, they may complement each other. For example, for a better understanding of the mechanism underlying the association of autophagy proteins with disease, mathematical modeling can be done on the core set of proteins obtained from PPI network analysis. It may identify potential parameters that could not otherwise be explained by network analysis alone. This methodology is shown in Figure 2. Hence, alongside the in vivo and in vitro studies of autophagy that have led to many novel discoveries, systems biology, with its potential for decrypting the system’s complexity both as a whole and in part, has emerged and made significant contributions to the field of autophagy.
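The centrality measures mentioned above can be illustrated with a minimal sketch. The network below is a hypothetical toy graph (the node labels P1 to P5 and the edges are assumptions made for illustration, not interactions taken from any PPI database), and only degree centrality and the local clustering coefficient are computed; betweenness centrality is omitted for brevity.

```python
from itertools import combinations

# Hypothetical toy PPI network; node labels and edges are illustrative only.
EDGES = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P5")]


def build_adjacency(edges):
    """Return an undirected adjacency map {node: set(neighbours)}."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def degree_centrality(adjacency):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(adjacency)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adjacency.items()}


def clustering_coefficient(adjacency):
    """Fraction of neighbour pairs of each node that are themselves linked."""
    result = {}
    for node, nbrs in adjacency.items():
        k = len(nbrs)
        if k < 2:
            result[node] = 0.0  # undefined for k < 2; reported as 0 by convention
            continue
        linked = sum(1 for a, b in combinations(nbrs, 2) if b in adjacency[a])
        result[node] = linked / (k * (k - 1) / 2)
    return result


adjacency = build_adjacency(EDGES)
degree = degree_centrality(adjacency)
clustering = clustering_coefficient(adjacency)
hub = max(degree, key=degree.get)  # node with the highest degree centrality
print(hub, degree[hub], clustering[hub])
```

In this toy graph the highest-degree node would be the natural first candidate for a "core" protein, although in practice several centrality measures are usually compared before such a call is made.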

Figure 2. The journey of proteins from being inside the system to the arms of mathematical modeling. Panel (A) shows the processing of data. The raw gene expression data used to study a particular disease are first corrected, including steps like dealing with null values and outliers. The data are then normalized, and the DEG are calculated. Panel (B) shows the autophagy-specific study of the disease. The autophagic genes are first obtained from an autophagy database, from which the differentially expressed autophagy genes are selected. Using PPI databases, a PPI network of the DEGs (which may be entirely autophagic DEGs or a mixture of autophagic and non-autophagic DEGs) is constructed. In the figure, green denotes the autophagic genes and orange the non-autophagic genes. Using ML approaches, graph theoretical approaches or enrichment analysis (pathway analysis, disease analysis or gene ontology analysis), the significant modules or target proteins are extracted from the network. In the first case, the proteins driving the module can further be identified. Finally, mathematical modeling approaches can explore the dynamics and underlying mechanism of the target proteins or the module.
In this review, we give an overview of autophagy in computational biology as explored via mathematical modeling and network analysis, along with comprehensive insights into these approaches and their applications in the exploration of the autophagy process at various levels (molecules, cells, tissues). We have delineated several well-established methods such as mathematical models based on different types of differential equations, Petri nets, agent-based models (ABM), enrichment analysis and centrality analysis to capture the dynamical behavior or the collective influencers in the network. Further, we have listed the available autophagy databases and related resources along with their features, and summarized some conventional software and tools used for visualization and analysis in computational biology. We believe that this review provides an in-depth and improved understanding of autophagy in systems biology and will help researchers to select the appropriate method for new studies to find potential targets in a plethora of diseases, including cancer, Alzheimer’s, metabolic and immune-related disorders.
Mathematical modeling for autophagy
Significance of mathematical models in studying biological systems
$\tau_i$, $i = 1, \dots, n$, are time delays. They are measurable and may be constant. Sometimes, the initial or boundary conditions may not be sufficient to predict the future state of a system. For such a scenario, it is indispensable to know how the system behaved in the early stages, and hence, DDEs play a vital role in understanding a biological system, where the current state of some variables depends on the past states.
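To make the dependence on past states concrete, the following is a minimal sketch of a single-delay equation integrated with a fixed-step Euler scheme. The delayed logistic equation, the parameter names (r, K, tau) and the constant history are assumptions chosen for illustration; they are not a model from the autophagy literature reviewed here.

```python
def simulate_delay_logistic(r=1.0, K=1.0, tau=1.0, x0=0.5, dt=0.01, t_end=50.0):
    """Euler integration of dx/dt = r * x(t) * (1 - x(t - tau) / K).

    The history for t <= 0 is taken to be the constant x0 (an assumption).
    """
    lag = int(round(tau / dt))        # the delay expressed in time steps
    n_steps = int(round(t_end / dt))
    xs = [x0]
    for k in range(n_steps):
        x_now = xs[k]
        # Look up the delayed state; before t = tau, fall back on the history.
        x_past = xs[k - lag] if k >= lag else x0
        xs.append(x_now + dt * r * x_now * (1.0 - x_past / K))
    return xs


short_delay = simulate_delay_logistic(tau=1.0)  # settles toward K
long_delay = simulate_delay_logistic(tau=2.0)   # keeps oscillating around K
print(short_delay[-1], min(long_delay[-2000:]), max(long_delay[-2000:]))
```

With the same rate constants, changing only the delay switches the trajectory from settling at the carrying capacity to sustained oscillation, which is the kind of behavior an ODE without memory of past states cannot reproduce in one dimension.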
Mathematical modeling rests on four crucial pillars. The first pillar is a literature survey of the system. The second is the construction of the model, where the relationships between the model variables are established using model parameters. This step is followed by the analysis of the model, and the last pillar is validation, where the results of the model are validated either against the literature or by an experimental approach. If the model fails to deliver the appropriate output, the necessary changes are made to the model until the desired outcome appears. Mathematical models combine complexity and simplicity: their equations are complex enough to replicate the properties of the system and, at the same time, simple enough to capture its underlying phenomena. Theoretically, these models can drive a system anywhere, but in systems biology they must respect certain constraints. For example, a negative concentration of a protein makes no sense, nor does a negative population of a species. Similarly, there must always be an upper bound, be it on the concentration of a protein inside a cell or on the population of a species. Although restricted by biological constraints, a mathematical model can help to find the crucial parameters responsible for deciding the fate of the system. In other words, for a specific cellular process full of many regulatory patterns, mathematical modeling paves the way to pick the right one.
Different tools and packages have been built across multiple platforms (MATLAB, Python, R, etc.) to support mathematical modeling [57–63]. A structural diagram editor, Cell Designer [64], has also been developed to draw gene-regulatory and biochemical networks to make mathematical modeling a feasible approach in systems biology. CellML is an XML-based language designed to describe mathematical models in a machine-independent form suitable for sharing between different authors and archiving in a model repository [65].
Why model autophagy?
The process of autophagy consists of five steps, and all these stages are easily observable [66]. Different steps in the autophagy pathway may exert a different effect on the system. However, the biochemical reactions in autophagy are mostly nonlinear, i.e. a minute change in any of its stages will not necessarily exert a proportional effect throughout the system. Mathematical modeling endorses simplified abstractions and approximations to identify the steps of autophagy that are responsible for a particular behavior in the system. Moreover, the constant shift in the behavior of the system exerts randomness in the autophagy process. Mathematical modeling of autophagy keeps track of these factors and allows the researchers to investigate the dynamics of the system following any environmental conditions that may arise due to various external or internal perturbations or signals. Autophagy is a bridge between cell survival and cell death. Depending on certain extracellular or intracellular signaling, the process of autophagy may decide cell fate. At the single-cell level, these events may be mutually exclusive, indicating that cell death and cell survival events are different attractors of the system. Mathematical modeling can be done to understand the crosstalk between these two events using attractors, fixed points and limit cycle concepts.
Cell types differ in their response to autophagy stimuli. Addressing this cell to cell variability, various therapies have targeted autophagy manipulation in cancer therapy [67–69]. Mathematical modeling can help in planning and predicting the parameters that could be targeted and its outcome on the cell population. For example, autophagy helps in tumor cell survival under various stress conditions [70]. On the contrary, increased autophagy may lead to excessive cellular degradation and thus, may initiate cell death [71]. A mathematical model can perfectly utilize these conditions to identify the biological parameters that increase the autophagy process in disease conditions, so that the tumor cell gets less benefit from the basal level of autophagy and cell death initiates.
Thus, mathematical modeling can be used in a very effective way to decipher the process of autophagy and its role in various diseases or conditions. We have discussed below some of the modeling work done in the autophagy process to get an idea of the applicability of mathematical modeling in understanding the autophagy process.
Differential equations based models for autophagy
In 1975, Deter et al. [72] formulated the first mathematical model to delineate glucagon-induced autophagy in rat liver. This early study was based on experimental observations, collision theory and chemical kinetics, and mainly focused on the populations of telolysosomes, autophagosomes and autolysosomes in rat liver. Thereafter, various studies on autophagy have incorporated different types of mathematical models, viz. ODE, DDE and SDE. The widely used ODE-based models are the simplest way to study the process of autophagy. These models rest on the assumption that the system considered is well-mixed and that there are sufficient numbers of components for their numbers to be treated as continuous quantities. For the benefit of the readers, we have explained some terminologies associated with the model analysis in Table 1. Understanding the steady-state, stability and other qualitative behavior of a model will unveil the system’s underlying mechanism. For example, the response to cellular starvation is an intrinsic property of autophagy and was mathematically addressed by Jin et al. [44]. They classified the cells into a normal phase and an autophagic phase and, taking nutrition as the third variable, constructed and analyzed a logistic-type (3D) model of a yeast cell population. The model considered in this example has one unstable trivial equilibrium point when the nutrient concentration in the input flux and the nutrient loss rate by output flux are constant, and a locally asymptotically stable positive equilibrium point when the system is considered without autophagy. The model analysis concluded that an efficient autophagy level might be adequate to sustain a population during a long period of starvation. However, the authors did not incorporate any molecular regulation in their study. A hybrid model consisting of cell population dynamics and molecular regulation could have provided a better insight into cell fate regulation by autophagy.
Addressing this issue, the same group later developed a hybrid model [49] to understand the molecular regulation and population dynamics of yeast by incorporating molecular level interactions, the amino acid exchange between cells, and cell behavior.
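A minimal sketch of a three-variable population model of this kind is given below. It is not the system of equations of Jin et al. [44]: the functional forms, the parameter names and all numerical values are assumptions chosen only to illustrate how normal cells, autophagic cells and a nutrient can be coupled in an ODE model. In particular, the sketch assumes that autophagic cells die more slowly than starving normal cells.

```python
def starvation_model(alpha, t_end=50.0, dt=0.01, s_in=0.0):
    """Euler integration of a hypothetical 3D model (N, A, S).

    N: normal cells, A: autophagic cells, S: nutrient.
    alpha: rate at which starving normal cells switch to the autophagic phase.
    All parameter values below are illustrative assumptions.
    """
    mu, K, c = 0.5, 1.0, 1.0          # growth rate, half-saturation, nutrient cost
    beta = 0.5                        # return rate from autophagic to normal phase
    d_n, d_a, delta = 0.2, 0.02, 0.1  # death rates and nutrient loss rate
    N, A, S = 1.0, 0.0, 0.5
    for _ in range(int(round(t_end / dt))):
        f = S / (K + S)               # saturating nutrient availability in [0, 1)
        dN = mu * f * N - alpha * (1 - f) * N + beta * f * A - d_n * (1 - f) * N
        dA = alpha * (1 - f) * N - beta * f * A - d_a * A
        dS = s_in - delta * S - c * mu * f * N
        N, A, S = N + dt * dN, A + dt * dA, S + dt * dS
    return N, A, S


with_autophagy = starvation_model(alpha=0.5)
without_autophagy = starvation_model(alpha=0.0)
print(sum(with_autophagy[:2]), sum(without_autophagy[:2]))
```

Under these assumptions, the total population N + A after prolonged starvation is larger when the switch to the autophagic phase is available, which mirrors the qualitative conclusion quoted above without reproducing the published analysis.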
Table 1. Terminologies associated with the model analysis
Equilibrium point or steady-state: An equilibrium point or a steady-state of a system of differential equations is the value of the state variables where the state variables do not change with time. In other words, it is a time-independent solution of the system.
Stability and instability of an equilibrium: An equilibrium is said to be stable if solutions starting close to the equilibrium point remain close, otherwise it is said to be unstable.
Bifurcation: Bifurcation in a system occurs when a small perturbation made to the parameter values of the system results in a sudden qualitative or topological change in its behavior. Such parameters are called bifurcation parameters.
Bistability: In a dynamical system, bistability means the system under consideration possesses two stable states.
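The terms defined above can be made concrete with a one-variable toy system, dx/dt = x(1 - x)(x - a), which has equilibria at 0, a and 1. The equation and the threshold value a = 0.3 are assumptions chosen for illustration and do not come from any autophagy model; the sketch classifies each equilibrium by the sign of the slope of the rate function and shows that trajectories starting on either side of the unstable point settle at different stable states.

```python
def rate(x, a=0.3):
    """Right-hand side of the toy system dx/dt = x * (1 - x) * (x - a)."""
    return x * (1.0 - x) * (x - a)


def classify(x_eq, a=0.3, h=1e-6):
    """Label an equilibrium by the sign of d(rate)/dx (central difference)."""
    slope = (rate(x_eq + h, a) - rate(x_eq - h, a)) / (2.0 * h)
    return "stable" if slope < 0 else "unstable"


def settle(x0, a=0.3, dt=0.01, t_end=100.0):
    """Integrate with Euler steps and return the state reached at t_end."""
    x = x0
    for _ in range(int(round(t_end / dt))):
        x += dt * rate(x, a)
    return x


for x_eq in (0.0, 0.3, 1.0):
    print(x_eq, classify(x_eq))
print(settle(0.25), settle(0.35))  # start just below and just above a
```

Two stable equilibria separated by an unstable one is the simplest form of bistability; treating a as a bifurcation parameter and moving it outside the interval between 0 and 1 would change the ordering and stability of the equilibria.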
ODE models are also built to guide the pharmacological control of autophagy. Shirin et al. [73] formulated a nonlinear ODE model to predict optimal drug schedules to control autophagy. Focusing on four autophagosome production influencers and their specific inhibitors, the model identified various drug pairs that are more effective when taken together. Mathematical models can also qualitatively estimate the protein levels that are capable of deregulating homeostasis. For example, Ouzounoglou et al. [74] formulated a model to understand the dynamics of alpha-synuclein (ASYN) in Parkinson’s disease.
Autophagy and apoptosis pathways are closely regulated, and some proteins, which regulate autophagy, can also regulate apoptosis [75–78]. Hence, proper knowledge of autophagy and apoptosis interconnections may help to stop or promote fatal cell decisions. Kapuy et al. [79] studied beclin1-mediated autophagy and caspases-mediated apoptosis by forming an ODE model. The model was built to address the B-cell lymphoma 2 (BCL2)-Beclin1-caspases minimal network. They have also considered the effect of stress on autophagy by taking it as a bifurcation input. Based on the observation, it was suggested that the autophagy apoptosis transition is adjudicated by a bistable switch and, depending upon the intensity and duration of stress levels, sequential activation of cellular response can be initiated by a combination of BCL2-dependent regulation and feedback loops between Beclin1 and caspases. Various other models have also been built to understand the autophagy–apoptosis interplay [80–82]. A key feature of autophagy is that it also plays a role in unfolded protein response (UPR). A literature study has revealed that the complex interactions between UPR, autophagy and apoptosis may determine cellular fate in response to drug treatment [83,84]. Cyto-protective or cyto-destructive UPR gets activated by anti-oestrogens or other drug therapies. Autophagy assists in the cyto-protective role of UPR, while the cyto-destructive role contributes to apoptosis [85]. Addressing these, a mathematical model of autophagy, apoptosis and UPR was proposed to understand the interactions that accomplish anti-estrogen resistance and the effects of GRP78 on both sensitive and resistant breast cancer cells [85]. The model provides a clear picture of interactions of autophagy, apoptosis and UPR to produce both sensitivity and resistance to antioestrogen therapy under various conditions.
The time delays associated with biological processes are not captured by ODE-based models. They are mainly addressed by DDEs, which account for the time lags between biological processes and thus offer a better portrayal of biological systems. Time lag plays a vital role in autophagy, as in many biological processes. For example, the formation of autolysosomes follows autophagosome formation, indicating a time delay. Various studies have implemented DDE-based mathematical modeling to understand the hidden mechanisms in the autophagy process. Han et al. [47] formulated an 8D model with delay to study the behavior of both resident (normal) and abnormal proteins along with the formation of autophagosomes and autolysosomes, the intracellular concentration of adenosine triphosphate (ATP), and amino acids. The study showed that intracellular levels of autophagosomes and autolysosomes display an oscillatory behavior. The same group later formed another mathematical model to explore the role of autophagy in protein/organelle quality control under different physiological perturbations [86] and further extended their study to Alzheimer’s disease [87].
ODE-based models do not consider the effect of noise, which is an inherent property of many dynamical systems. This property is addressed by SDE models, as done by Martin et al. [46], who studied autophagy vesicle dynamics in a single cell. They used live-cell fluorescent microscopy to measure the synthesis and lysosomal turnover of autophagic vesicles (AV). The data were used to build a 4D ODE model, followed by a 23D SDE model for the accurate prediction of AV dynamics in a cell. The SDE model implemented a sequence of biochemical and physiological steps in the autophagic pathway, from PtdIns3KC3 activation through LC3 conjugation, that comprises the nucleation of the phagophore, maturation of the AV and lysosomal degradation. The mechanistic model was a better portrayal of the autophagy dynamics in a cell. For example, in agreement with the experimental data, the SDE model captured a time lag in the production of AV in response to treatment initiation, whereas no such behavior could be achieved with the deterministic model. The SDE model was also capable of accurately predicting that an 80% decrease in ATG9 content would result in a corresponding reduction in vesicle synthesis rate. It also captured the correlation between AV size and LC3 levels across single cells. The study illustrates that although ODE models are less complicated and can portray biological behavior, SDE models can be better illustrators of biological phenomena.
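The difference between a deterministic and a stochastic description can be sketched with the Euler–Maruyama scheme applied to a single hypothetical vesicle pool with constant synthesis and first-order turnover. This is not the 23D model of Martin et al. [46]; the equation dX = (k_syn - k_deg X) dt + sigma dW, the parameter values and the truncation at zero are assumptions made for illustration.

```python
import math
import random


def euler_maruyama(k_syn=2.0, k_deg=0.1, sigma=1.0, x0=0.0,
                   dt=0.01, t_end=200.0, seed=0):
    """Simulate dX = (k_syn - k_deg * X) dt + sigma dW with Euler-Maruyama.

    Setting sigma = 0 recovers the deterministic Euler solution of the ODE.
    """
    rng = random.Random(seed)         # local generator for reproducibility
    x = x0
    path = [x]
    for _ in range(int(round(t_end / dt))):
        drift = k_syn - k_deg * x
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # Crude truncation at zero so the vesicle level never becomes negative.
        x = max(x + drift * dt + noise, 0.0)
        path.append(x)
    return path


deterministic = euler_maruyama(sigma=0.0)
stochastic = euler_maruyama(sigma=1.0, seed=1)
tail = stochastic[5000:]              # discard the initial transient (t < 50)
print(deterministic[-1], sum(tail) / len(tail))
```

The deterministic run converges to the steady state k_syn / k_deg, while the stochastic run fluctuates around it; additive noise with truncation is the simplest possible choice and a mechanistic model would normally derive the noise term from the underlying reactions.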
Agent-based models for autophagy
ABM is an alternative approach that relies on a predefined logical programming language. In ABM, the system consists of a collection of interacting autonomous decision-making bodies known as agents. ODE modeling presupposes a homogeneous environment, while ABM is capable of simulating the temporal and spatial evolution of a system in which each participant is represented as an individual agent following its own rules. One of the fundamental aspects of ABM is the emergence of complex behavior from a set of simple rules. It simulates the interactions between multiple independent agents and evaluates their effect on the overall system. It captures the emergent phenomena of a complex system from the perspective of its constituent components, making ABM a bottom-up approach [88]. The benefits of ABM include its flexibility, its natural way of describing the system and its capability of capturing the emergent phenomena that arise from the interactions of individual entities [88]. ABM accommodates both discrete and continuum mathematical modeling approaches. The study of tumor cell density, nutrient distribution, etc. falls within the scope of the continuum modeling approach, whereas cellular automata are a representation of discrete mathematical modeling.
ABMs have been used extensively to explain biological phenomena in various biological systems. For example, a 3D agent-based Voronoi–Delaunay hybrid model was developed by Schaller and Meyer-Hermann [89], where reaction–diffusion equations depicted the spatiotemporal distribution of oxygen and glucose. Their study was an effort to test the hypothesized functional dependence of the absorption rates of glucose and oxygen, and to determine suitable mechanisms for necrosis induction. Another ABM was built by Engelberg et al. [90], where different spaces for tumor cells, oxygen, nutrient and toxic inhibitors were considered. The goal of the study was to create a model consisting of separate cells that fairly represents the behavior of an in vitro multicellular tumor spheroid.
However, applications of ABM to the study of autophagy are very few. The creation, movement, fusion and deterioration of autophagy pathway vesicles are dynamic both temporally and spatially. To delineate the spatio-temporal aspects of autophagy regulation and its dynamic behavior, Borlin et al. [91] constructed an ABM using the NetLogo ABM platform. The first agent is the phagophore, which grows and matures to form the second agent, the autophagosome; this then fuses with the third agent, the lysosome, to generate the last agent, the autolysosome. The newly formed autolysosomes can then grow by fusing with lysosomes, autophagosomes or other autolysosomes. The authors assigned spontaneous motion to phagophores and autolysosomes to simulate organelle movement, whereas autophagosomes and lysosomes travel directly toward or directly away from the nucleus, replicating their active transport along the cytoskeleton at a pace that is independent of their size. The key parameters of the model were fitted iteratively using a genetic algorithm and a predefined fitness function. The model, integrated with high-resolution fluorescence microscopy data, successfully reproduced the short-term and long-term behavior as well as the cell-to-cell variability.
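The agent rules described above can be illustrated with a deliberately small sketch. This is not the NetLogo model of Borlin et al. [91]: the class `Vesicle`, the functions `step` and `simulate`, the placement of the nucleus at the origin and every parameter value (`maturation_age`, `fusion_radius`, `speed`) are assumptions made for illustration only. The sketch also omits phagophore growth, movement away from the nucleus and the secondary fusion events of autolysosomes.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Vesicle:
    kind: str   # "phagophore", "autophagosome", "lysosome" or "autolysosome"
    x: float
    y: float
    age: int = 0


def step(vesicles, rng, maturation_age=5, fusion_radius=1.0, speed=0.5):
    """Advance all agents by one time step (all parameter values are hypothetical)."""
    # 1. Movement: random walk for phagophores and autolysosomes,
    #    directed motion toward the nucleus (placed at the origin) for the others.
    for v in vesicles:
        if v.kind in ("phagophore", "autolysosome"):
            angle = rng.uniform(0.0, 2.0 * math.pi)
            v.x += speed * math.cos(angle)
            v.y += speed * math.sin(angle)
        else:
            r = math.hypot(v.x, v.y)
            if r > speed:
                v.x -= speed * v.x / r
                v.y -= speed * v.y / r
        v.age += 1

    # 2. Maturation: a phagophore becomes an autophagosome after a fixed age.
    for v in vesicles:
        if v.kind == "phagophore" and v.age >= maturation_age:
            v.kind = "autophagosome"

    # 3. Fusion: an autophagosome close to a lysosome becomes an autolysosome,
    #    and the lysosome is removed from the population.
    lysosomes = [v for v in vesicles if v.kind == "lysosome"]
    consumed = set()
    for a in vesicles:
        if a.kind != "autophagosome":
            continue
        for lys in lysosomes:
            if id(lys) in consumed:
                continue
            if math.hypot(a.x - lys.x, a.y - lys.y) <= fusion_radius:
                a.kind = "autolysosome"
                consumed.add(id(lys))
                break
    return [v for v in vesicles if id(v) not in consumed]


def simulate(n_phagophores=20, n_lysosomes=20, n_steps=50, seed=0):
    """Run the toy model and return the number of agents of each kind."""
    rng = random.Random(seed)
    vesicles = [Vesicle("phagophore", rng.uniform(-10, 10), rng.uniform(-10, 10))
                for _ in range(n_phagophores)]
    vesicles += [Vesicle("lysosome", rng.uniform(-10, 10), rng.uniform(-10, 10))
                 for _ in range(n_lysosomes)]
    for _ in range(n_steps):
        vesicles = step(vesicles, rng)
    counts = {}
    for v in vesicles:
        counts[v.kind] = counts.get(v.kind, 0) + 1
    return counts


counts = simulate()
```

Even in this reduced form, the population-level counts of each vesicle type are not written down anywhere in the rules; they emerge from local movement and proximity, which is the bottom-up character of ABM noted earlier.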
Biological processes like autophagy involve complex mechanisms with many pathways and molecules that change over time and space, and ABM could play a vital role in understanding such systems. These models can also help to describe mathematically biological phenomena such as the spatial and temporal requirement of autophagy-related proteins to bacteria [92]. However, ABM has certain drawbacks. For instance, it demands more detail about the system of interest, which may not always be reported in the literature. Another disadvantage of ABM is that it is computationally more expensive than models based on partial differential equations (PDEs) or ordinary differential equations.
Petri net
The Petri net was introduced by Carl Adam Petri in his doctoral dissertation [93]. It is constructed using two types of nodes, viz. places, depicted as circles, and transitions, represented as narrow black rectangles. In systems biology, places refer to chemical species such as metabolites, proteins, enzymes, DNA, RNA, etc. and transitions refer to chemical reactions such as activation, inhibition, phosphorylation, etc. Nodes are connected by arcs, which may only be directed from a place to a transition (input arcs) or from a transition to a place (output arcs). A Petri net is therefore always bipartite. The stoichiometry of a reaction is indicated by the weight of the arc. Although initially designed to model only discrete processes, Petri nets have since been extended to deal with continuous processes [94,95].
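The firing rule implied by this definition can be sketched in a few lines. The representation below (a marking as a dictionary of token counts, a transition as two dictionaries of arc weights) and the example reaction `binding` with its species names are illustrative assumptions, not taken from any model cited here.

```python
def is_enabled(marking, transition):
    """A transition is enabled if every input place holds at least as many
    tokens as the weight of the arc leading from it to the transition."""
    return all(marking.get(place, 0) >= weight
               for place, weight in transition["inputs"].items())


def fire(marking, transition):
    """Return the new marking after firing; the input marking is not modified."""
    if not is_enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new_marking = dict(marking)
    for place, weight in transition["inputs"].items():
        new_marking[place] = new_marking.get(place, 0) - weight   # consume tokens
    for place, weight in transition["outputs"].items():
        new_marking[place] = new_marking.get(place, 0) + weight   # produce tokens
    return new_marking


# Hypothetical reaction: 1 receptor + 2 ligand -> 1 complex.
# The arc weights encode the stoichiometry.
binding = {"inputs": {"receptor": 1, "ligand": 2},
           "outputs": {"complex": 1}}

m0 = {"receptor": 1, "ligand": 3, "complex": 0}
m1 = fire(m0, binding)
print(m1)                       # {'receptor': 0, 'ligand': 1, 'complex': 1}
print(is_enabled(m1, binding))  # False: no receptor token is left
```

Because arcs only ever connect a place to a transition or a transition to a place, the two dictionaries are sufficient to describe the bipartite structure.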
The literature contains many applications of Petri nets to different biochemical systems. For example, Koch et al. [96] built a metabolic Petri net (where the places represent metabolites and the transitions represent the biochemical reactions between metabolites) consisting of 17 places and 27 transitions that qualitatively modeled the carbon metabolism in the potato tuber. Using this Petri net model as an example, the authors provided a method for validating models of metabolic networks with Petri nets. Signal transduction pathways are commonly modeled with a set of ordinary differential equations, but the estimation of unknown parameters is a problem inherent in ODE modeling. To deal with this problem, Sackmann et al. [97] applied Petri net theory to model and analyze signal transduction pathways. The authors put forward a systematic model validation method for signal transduction pathways that depends only on the network structure. This method was then illustrated using the mating pheromone response pathway in Saccharomyces cerevisiae.
Little literature is available on the use of Petri nets in the study of autophagy. Jennifer et al. [98] studied Salmonella xenophagy in epithelial cells by designing a Petri net model. The model includes all biochemically proven and published processes of Salmonella xenophagy in epithelial cells and comprises 61 places (proteins/macromolecular complexes/organisms/signals) and 184 arcs. The model has 16 T-invariants, which describe biological subpathways at steady state and represent the fundamental dynamics of the system. The authors implemented in silico knockouts of specific proteins to investigate the model behavior and the corresponding biological effect. The Petri net model thus helps to combine different molecular processes like ubiquitination, binding of the autophagy receptor, etc., which can then be used for various analyses such as in silico knockout experiments. This type of model is advantageous in the absence of quantitative data. In a field like autophagy, where many pathways are involved, we believe that Petri net models would play a vital role. However, a Petri net has the limitation that it does not capture the mechanism, which one can obtain with the help of differential equation-based models.
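For readers unfamiliar with T-invariants: in standard Petri net theory, a T-invariant is a non-zero vector x of non-negative integer firing counts that satisfies C x = 0, where C is the place-by-transition incidence matrix, so firing each transition x_j times returns the net to its starting marking. The sketch below checks this condition on a four-transition toy network and mimics a knockout by discarding invariants that use the removed transition. The network, its place and transition names, and the helper `surviving_invariants` are hypothetical; this is not the Salmonella xenophagy model of [98], nor necessarily the knockout procedure used there.

```python
def incidence_matrix(places, transitions):
    """C[i][j] = net change of tokens in place i when transition j fires once."""
    return [[t["outputs"].get(p, 0) - t["inputs"].get(p, 0) for t in transitions]
            for p in places]


def is_t_invariant(C, x):
    """True if x is non-negative, non-zero and satisfies C x = 0."""
    if any(count < 0 for count in x) or not any(x):
        return False
    return all(sum(c * count for c, count in zip(row, x)) == 0 for row in C)


def surviving_invariants(invariants, names, knocked_out):
    """Keep only the invariants that do not use any knocked-out transition."""
    blocked = {names.index(name) for name in knocked_out}
    return [x for x in invariants if all(x[j] == 0 for j in blocked)]


# Hypothetical toy network: a cargo enters the system, is either tagged and
# then degraded, or escapes untagged.
places = ["free", "tagged"]
transitions = [
    {"name": "uptake",  "inputs": {},            "outputs": {"free": 1}},
    {"name": "tag",     "inputs": {"free": 1},   "outputs": {"tagged": 1}},
    {"name": "degrade", "inputs": {"tagged": 1}, "outputs": {}},
    {"name": "escape",  "inputs": {"free": 1},   "outputs": {}},
]
names = [t["name"] for t in transitions]
C = incidence_matrix(places, transitions)

candidates = [(1, 1, 1, 0), (1, 0, 0, 1), (1, 1, 0, 0)]
invariants = [x for x in candidates if is_t_invariant(C, x)]
print(invariants)                                         # [(1, 1, 1, 0), (1, 0, 0, 1)]
print(surviving_invariants(invariants, names, ["tag"]))   # [(1, 0, 0, 1)]
```

The check needs only the network structure and no kinetic parameters, which is why this style of analysis remains usable when quantitative data are missing.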
Limitations of mathematical modeling
Despite being an excellent approach to study the dynamics of biological systems, mathematical modeling has certain limitations and difficulties. These limitations must be taken into account when capturing the characteristics of a biological process with a mathematical model. The equations in a mathematical model contain parameters, and the behavior of the model is driven by these parameters. They can be determined by experimental studies. However, many parameters remain unknown, either because the relevant experimental data are not available or because the parameter values reported in the literature do not come from the system addressed by the model. For example, the rate of degradation of beclin1 may be a parameter in a lung cancer model, while the literature reports its value only for pancreatic cancer. Another difficulty in mathematical modeling is the different time scales on which the components of a pathway operate. For example, genetic regulatory processes are caused by metabolic reactions, but while metabolic reactions take seconds or minutes, regulatory processes can last for several hours or days. A mathematical model of a biological system should always abide by biological constraints. The findings of the model need to be validated according to the objectives of the model. Hence, a qualitative or quantitative association between model output and biological data is essential, but quantitative experimental data on the time course of the interactions between model variables are often very limited. Biological systems possess hierarchical layers (cells, tissues, organs, etc.). To understand a system, it is necessary to understand the dynamics of each layer. However, it is hard to model the entire system as a whole, as the resulting model would be non-computable. Hence, modeling is limited to studying the system in parts, which necessitates systems biology approaches like network analysis that can study the entire system as a whole.
Systems biology approach for autophagy
Importance of systems biology and network analysis
Biological systems can be portrayed as networks, and these networks depict the physical and spatial organization of the organism. Systems biology employs a pragmatic approach to elucidate the emergent properties of such networks, with the aim of explaining quantitatively and predicting the biological processes occurring at the molecular, cellular, tissue, organ and whole-body levels. It focuses on a holistic analysis of the biological networks of various processes, including autophagy. Autophagy is governed by a large number of proteins; being a protective and life-sustaining process, it is associated with various physiological processes and pathological conditions. Systems biology seeks to understand the extent to which intermodular connectivity modulates the autophagy process. Systems biology approaches, especially network analysis, are necessary to decipher the crosstalk between autophagy and various diseases.
Network analysis investigates the entire system as a whole. It is like a snapshot of the entire system at a particular time, in which we can see all the nodes and their interactors. In systems biology, there are various types of networks depending on the nodes studied, such as the protein–protein interaction (PPI) network, where nodes are proteins and the edges are the interactions between them; the metabolic network, where the nodes are metabolites and the edges are the reactions between them; gene regulatory networks, where nodes are genes and edges are the physical and/or regulatory relationships between the genes; and ecological networks, where species are nodes and edges are interactions that can be either trophic or symbiotic. In this review, we have focused on PPI networks. They are scale-free in nature, i.e. the majority of proteins possess only a few interactions with other proteins, whereas some proteins are connected to many other proteins in the network and are termed hub proteins. The degree distribution of a PPI network follows a power law. The diameter of a PPI network (the longest of the shortest paths between any two nodes) is small, which indicates that these networks show a small-world effect. Another essential property of PPI networks is transitivity, which is the tendency of proteins to form clusters.
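The three properties named above (degree, diameter and clustering) can be computed directly from an edge list. The sketch below uses only the Python standard library on a hypothetical five-node undirected graph; the node labels are placeholders rather than real proteins, and the `diameter` function assumes that the graph is connected.

```python
from collections import deque
from itertools import combinations


def build_graph(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degrees(adj):
    """Number of interaction partners of each node."""
    return {node: len(neighbors) for node, neighbors in adj.items()}


def clustering(adj, node):
    """Local clustering coefficient: fraction of neighbor pairs that are linked."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


def shortest_path_lengths(adj, source):
    """Breadth-first search distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter(adj):
    """Longest shortest path between any two nodes (graph assumed connected)."""
    return max(max(shortest_path_lengths(adj, s).values()) for s in adj)


# Hypothetical toy network: node H acts as a hub.
edges = [("H", "A"), ("H", "B"), ("H", "C"), ("H", "D"), ("A", "B")]
adj = build_graph(edges)

print(degrees(adj)["H"])     # 4
print(clustering(adj, "A"))  # 1.0
print(diameter(adj))         # 2
```

In this toy graph the hub keeps every pair of nodes within two steps of each other, which is the intuition behind the small diameter of hub-dominated networks.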
In systems biology, network analysis can be divided into two categories. Enrichment analysis belongs to the first category. Some genes share biological attributes such as involvement in a disease, common pathways or localization. Enrichment analysis helps to classify these genes and points out essential processes, pathways or diseases in a gene list generated from genome-scale experiments. The second category comprises algorithm-based work where the goal is to find potential targets, i.e. to find important nodes in a network. Since the word importance is vague, it gives rise to centrality analysis, in which many methods have been used to find the important nodes of the system. Some of these methods include degree centrality, betweenness centrality, radiality, clustering coefficient, etc. Novel algorithms have also been developed to find important nodes in a PPI network [99–101]. Although distinct, these two categories are not mutually exclusive, and studies that combine both have also been reported [99,102].
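As an illustration of the first category, one widely used way to score over-representation is the one-sided hypergeometric test; this choice is an assumption made here for illustration, since individual enrichment tools may use other statistics. In the sketch, `N` is the number of genes in the background, `K` the number of background genes annotated with a term, `n` the size of the gene list and `k` the number of annotated genes in the list; the function name is hypothetical.

```python
from math import comb


def enrichment_p_value(N, K, n, k):
    """Probability of observing at least k annotated genes in a list of n genes
    drawn without replacement from a background of N genes, K of them annotated."""
    upper = min(K, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


# Hypothetical numbers: a background of 10 genes, 3 of them annotated with a
# pathway, and a list of 3 genes that are all annotated.
p = enrichment_p_value(N=10, K=3, n=3, k=3)
print(p)   # 1/120, i.e. about 0.0083
```

In practice the test is repeated for many terms, so the resulting p-values are usually corrected for multiple testing before any term is reported as enriched.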
Many packages across various platforms have been used to perform network-based studies. We have listed a few useful and widely used packages in Table 2. Proper visualization of data is crucial for understanding a biological network. Various visualization software exists [103–111], but four of the most widely used visualization and analysis tools are Cytoscape [112], Gephi [113], Tulip [114] and Pajek [115]. Web tools like CellNetVis [116] and Biographer [117] are also used for network visualization. A detailed review of the different types of visualization software for biological networks can be found in the study of Pavlopoulos et al. [118]. Table 3 contains some useful Cytoscape plugins used in network analysis, visualization and enrichment analysis. A detailed review of the Cytoscape plugins can be found in [119]. The next section provides a concise description of network analysis studies done on autophagy (in humans only).
Table 2. List of useful packages and software for the network-based study of systems biology
S. no. | Package/software and link | Platform | Description | Ref. |
---|---|---|---|---|
1 | dplyr https://CRAN.R-project.org/package=dplyr | R | dplyr is a powerful R package that facilitates the manipulation, cleaning and summarizing of unstructured data. It comes with many functions that perform widely used data manipulation operations such as selecting specific columns, applying a filter, sorting data, adding or deleting columns, and aggregating data. The functions of dplyr are user friendly and easy to learn. | [155] |
2 | ggplot2 https://CRAN.R-project.org/package=ggplot2 | R | ggplot2 is an excellent data visualization package based on the grammar of graphics. It is a flexible, mature and complete graphics system, but it has some drawbacks, e.g. it is not capable of 3D graphics. | [156] |
3 | Bioconductor https://www.bioconductor.org/ | R | The Bioconductor project is a collaborative effort to create extensible packages and software for computational biology and bioinformatics. It uses the R programming platform and is open source and open development. A total of 1903 software packages are available in the latest release of Bioconductor (3.11). | [157] |
4 | mlr https://CRAN.R-project.org/package=mlr | R | mlr is an R package to perform ML tasks. This package offers a standardized, object-oriented and extensible framework for classification, regression, survival analysis and clustering. | [158] |
5 | linear models for microarray data (limma) https://bioconductor.org/packages/release/bioc/html/limma.html | R | This R package is used for the analysis of gene expression data coming from microarray or RNA-seq technologies. | [159] |
6 | Weighted correlation network analysis (WGCNA) https://CRAN.R-project.org/package=WGCNA | R | WGCNA is a popular R analytical package that constructs a gene co-expression network and identifies gene modules. Biologically or clinically significant modules are then determined and topological properties of the network are evaluated. | [160] |
7 | biomartr https://CRAN.R-project.org/package=biomartr | R | This open-source R package provides easy and user-friendly functions to retrieve all genomic data, or data for selected proteomes, genomes, coding sequences and annotation files, contained in the databases hosted by the National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EMBL-EBI). | [161] |
8 | DESeq2 https://bioconductor.org/packages/DESeq2/ | R | DESeq2 is a widely used method for differential expression analysis of count data and is freely available through Bioconductor. The differential expression tests of DESeq2 are based on a negative binomial generalized linear model. DESeq2 has several new features that allow for more quantitative analysis of comparative RNA-seq data, with shrinkage estimators for dispersion and fold change. | [162] |
9 | CFinder http://www.cfinder.org/ | – | CFinder is a platform-independent, stand-alone application that locates overlapping groups of densely interconnected nodes in a network with the aid of the clique percolation method. It can predict the function(s) of a protein and can also be used to discover novel modules. | [163] |
10 | Enrichr https://amp.pharm.mssm.edu/Enrichr/ | – | This is a useful web-based and mobile software application to perform gene enrichment analysis, supported by various interactive visualization approaches to display enrichment results. Enrichr uses different databases for the enrichment analysis and displays the results obtained from each database separately. | [164] |
11 | PyPathway https://pypi.org/project/pypathway/#files | Python | PyPathway is a free and open-source Python package that performs functional enrichment analysis, network modeling and network visualization. | [165] |
Table 3. List of useful Cytoscape plugins. Among these, BiNGO, MCODE, Agilent Literature Search and jActiveModules are the four most downloaded Cytoscape plugins, in the order in which they are mentioned [119]. All these plugins are freely available at the Cytoscape app store (https://apps.cytoscape.org/)
S. no. | Plugin | Description | References |
---|---|---|---|
1 | BiNGO | Quantifies GO terms that are overrepresented in the network and portrays them as a network of relevant GO terms. | [166] |
2 | Mosaic and Cerebral | These two visualization plugins for Cytoscape can compartmentalize the genes/proteins in a network according to their subcellular localization. | [167,168] |
3 | PathLinker | Reconstructs signaling pathways from PPI networks. | [169] |
4 | CytoNCA | Performs centrality analysis of weighted and unweighted networks. | [170] |
5 | ClueGO | Helps to create and visualize a functionally grouped network of terms/pathways. | [171] |
6 | GeneMANIA | Uses public databases to import interaction networks from a list of genes with their annotations and putative functions. | [172] |
7 | BiNoM | Helps to access and analyze pathways. | [173] |
8 | PiNGO | Helps to locate candidate genes in a network that are linked with user-defined target GO terms. | [174] |
9 | MCODE | Creates clusters in a given network based on the topology to identify densely connected regions. | [175] |
10 | ConsensusPathDBplugin | Retrieves interaction evidence for a given pair of genes or proteins. | [176] |
11 | AgilentLiteratureSearch | Mines scientific literature to find publications associated with the search term and creates an interaction network based on the search result. | [177] |
12 | jActiveModules | Detects clusters where nodes show significant changes in expression levels. | [178] |
13 | cytoHubba | Uses various topological algorithms to predict and find important nodes and subnetworks in a given network. | [179] |
S. no. . | Plugin . | Description . | References . |
---|---|---|---|
1 | BiNGO | Quantifies GO terms that have been overrepresented in the network and portrays them as a network of relevant GO terms. | [166] |
2 | Mosaic and cerebral | These two are visualization plugins for Cytoscape and can compartmentalize the genes/proteins in a network according to their subcellular localization. | [167,168] |
3 | PathLinker | This package reconstructs signaling pathways from PPI networks. | [169] |
4 | CytoNCA | Perform centrality analysis of weighted and unweighted networks. | [170] |
5 | ClueGO | It helps to create and visualize a functionally grouped network of terms/pathways. | [171] |
6 | GeneMANIA | Uses public databases to import interaction networks from a list of genes with their annotations and putative functions. | [172] |
7 | BiNoM | It helps to access and analyze pathways. | [173] |
8 | PiNGO | Helps to locate candidate genes in a network that are linked with user-defined target GO terms. | [174] |
9 | MCODE | Create clusters in a given network based on the topology to identify densely connected regions. | [175] |
10 | ConsensusPathDBplugin | Retrieves interaction evidence for a given pair of genes or proteins. | [176] |
11 | AgilentLiteratureSearch | Mines scientific literature to find publications associated with the search term and to create an interaction network based on the search result. | [177] |
12 | jActiveModules | Detects clusters where nodes show significant changes in expression levels. | [178] |
13 | cytoHubba | By using various topological algorithms, this Cytoscape plugin can predict and find important nodes and subnetworks in a given network. | [179] |
List of useful Cytoscape plugins. Among these, BINGO, MCODE, Agilent Literature Search and jActiveModules are the four most downloaded Cytoscape plugins in the order they are mentioned [119]. All these plugins are freely available at Cytoscape app store (https://apps.cytoscape.org/)
S. no. . | Plugin . | Description . | References . |
---|---|---|---|
1 | BiNGO | Quantifies GO terms that have been overrepresented in the network and portrays them as a network of relevant GO terms. | [166] |
2 | Mosaic and cerebral | These two are visualization plugins for Cytoscape and can compartmentalize the genes/proteins in a network according to their subcellular localization. | [167,168] |
3 | PathLinker | This package reconstructs signaling pathways from PPI networks. | [169] |
4 | CytoNCA | Perform centrality analysis of weighted and unweighted networks. | [170] |
5 | ClueGO | It helps to create and visualize a functionally grouped network of terms/pathways. | [171] |
6 | GeneMANIA | Uses public databases to import interaction networks from a list of genes with their annotations and putative functions. | [172] |
7 | BiNoM | It helps to access and analyze pathways. | [173] |
8 | PiNGO | Helps to locate candidate genes in a network that are linked with user-defined target GO terms. | [174] |
9 | MCODE | Create clusters in a given network based on the topology to identify densely connected regions. | [175] |
10 | ConsensusPathDBplugin | Retrieves interaction evidence for a given pair of genes or proteins. | [176] |
11 | AgilentLiteratureSearch | Mines scientific literature to find publications associated with the search term and to create an interaction network based on the search result. | [177] |
12 | jActiveModules | Detects clusters where nodes show significant changes in expression levels. | [178] |
13 | cytoHubba | By using various topological algorithms, this Cytoscape plugin can predict and find important nodes and subnetworks in a given network. | [179] |
Omics technologies and autophagy
The emergence of omics technologies such as genomics, epigenomics, transcriptomics, proteomics, metabolomics and microbiomics has opened new possibilities to study a biological system at an extraordinary level of detail. Genomics is the study of the genome of an organism; epigenomics explores global epigenetic changes that offer crucial insights into the mechanisms and function of gene regulation across several genes in a cell or organism; transcriptomics relies on the qualitative and quantitative genome-wide study of RNA levels; and proteomics facilitates the study of the whole proteome of an organism [120]. Mass spectrometry-based proteomics is an indispensable approach to delineate protein expression, PPI, subcellular localization and post-translational modifications. Similarly, metabolomics is the large-scale study of metabolites within cells, biofluids, tissues or organisms [121]. The study of the microorganisms in a given community is the focus of microbiomics [120]. Over time, new dimensions such as lipidomics and nutrigenomics have been added to omics. Omics methods produce data that provide biological understanding based on methodological inferences from (in most cases) large datasets. These technologies allow investigations of the system at different molecular levels, and in a highly parallelized manner. For example, microarray analysis measures the expression of almost all the genes in the system at a time. This parallelization encourages scientists to track not only an organism’s anticipated responses but also its unexpected ones.
Integrating these omics strategies with PPI networks would enable a better understanding of the autophagy process. Several studies have used a large-scale multi-omics approach to study the broad framework of autophagy and its association with other biological processes. These studies have deciphered the role of autophagy in host–pathogen interactions, tumor growth, various cancers, the nervous system, etc. [122–125]. Because ‘omics’-based studies are a pivotal area of current research that provides a more systematic view of biological processes, these approaches have driven our insights into the regulation of autophagy.
Network analysis for autophagy
Over the decades, the advancement of high-performing data collection technology has resulted in a large amount of autophagy-related data. Network analysis approaches have been applied to these data to delineate the association of autophagy with various diseases and biological processes. Network analysis also helps to uncover the organizing principles of a disease and identifies the potential targets accountable for the pathogenesis of the disease.
Network analysis is well supported by autophagic databases, which play a crucial role in delineating the role of autophagy in various diseases. Various studies have used the autophagy-specific information obtained from these databases [123,126,127]. Lin et al. [127] carried out a comprehensive study of ATG and associated noncoding RNAs and transcription factors to investigate the association of autophagy with digestive system tumors (DST). The Cancer Genome Atlas database was used to obtain the digestive tumor transcription details. The autophagy genes were extracted from the Human Autophagy Modulator Database. The study, facilitated by WGCNA, crosstalk connection, pivot analysis and functional analysis, revealed that the autophagic genes control the pathogenesis of DST and highlighted the potential role of autophagy in the treatment of DST. Wang et al. [123] constructed a disease–autophagy network in which disease genes were taken from Online Mendelian Inheritance in Man (OMIM) [128] and autophagic genes were extracted from the Human Autophagy Database (http://www.autophagy.lu/), the Autophagy database [129] and the Autophagy Regulatory Network database [130]. The autophagy genes were observed to act as a bridge between diseases and were found to be topologically important in the disease–autophagy network.
Network-based studies often facilitate the identification of hubs and modules. Modularity is an essential property of a network. It refers to the organization of nodes in clusters. Module-based analyses can contribute to a deeper understanding of biological systems. Hub proteins are also crucial in maintaining the global network structure. Durocher and co-researchers [55] carried out a study to elucidate the gene network in the peripheral blood transcriptome associated with human intracerebral hemorrhage. Using the WGCNA package in R, they identified the hubs and the modules in the network, and used ingenuity pathway analysis (IPA) and the DAVID Bioinformatics Database [131] to find the associated pathways and processes.
Various studies [53,54,132] have performed a network-based analysis of autophagy using datasets obtained from the Gene Expression Omnibus (GEO) repository [133]. After a preliminary analysis, the WGCNA package in R was used to identify significant modules and hubs in the network. Databases like OMIM and The Cancer Genome Atlas (https://www.cancer.gov/tcga) have also been used to facilitate the network-based study of autophagy processes.
Although network analysis approaches have been applied extensively to study autophagy, methods like network stability, control theory, percolation, etc., are yet to be integrated into the study of the autophagy process. Given the importance of these methods, their application will surely help to identify novel targets and pathways related to autophagy. The lack of sufficient temporal data to understand disease progression has also limited the network-based study of autophagy processes. Nonetheless, the data are growing with time, and we believe that in the coming years we will have enough data to make better and more accurate predictions.
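The notions of hub and module can be made concrete with a small sketch. The example below is only an illustration on a hypothetical toy network (the node names and edges are invented), and it is not the WGCNA procedure used in the cited studies: it ranks nodes by degree to find hubs and computes the Newman modularity score Q of a user-supplied partition, where a higher Q indicates that the proposed clusters are more densely connected internally than expected by chance.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Return an undirected adjacency map {node: set(neighbors)}."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def find_hubs(adj, top_n=1):
    """Rank nodes by degree (ties broken by name) and return the top_n."""
    ranked = sorted(adj, key=lambda n: (-len(adj[n]), n))
    return [(n, len(adj[n])) for n in ranked[:top_n]]


def modularity(adj, modules):
    """Newman modularity Q of a partition given as a list of node sets."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # total number of edges
    q = 0.0
    for module in modules:
        # Edges with both endpoints inside the module (each is seen twice).
        internal = sum(1 for u in module for v in adj[u] if v in module) / 2
        degree_sum = sum(len(adj[u]) for u in module)
        q += internal / m - (degree_sum / (2 * m)) ** 2
    return q


# Hypothetical toy network: two triangles joined by a single bridging edge.
edges = [("A", "B"), ("A", "C"), ("B", "C"),
         ("C", "D"),
         ("D", "E"), ("D", "F"), ("E", "F")]
adj = build_adjacency(edges)

hubs = find_hubs(adj, top_n=2)
q = modularity(adj, [{"A", "B", "C"}, {"D", "E", "F"}])
print("hubs:", hubs)
print("modularity of the two-triangle partition:", round(q, 4))
```

In this toy case the two nodes carrying the bridging edge come out as hubs, and splitting the network into its two triangles gives a positive Q, whereas treating the whole network as one module would give Q = 0.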
Artificial intelligence associated research of autophagy
With the recent explosion of data, artificial intelligence (AI) is receiving considerable attention and has emerged as a promising field in systems biology. It creates algorithms that help in classification, pattern recognition and prediction using available data. As in many other biological processes, AI-based approaches have also been incorporated in the field of autophagy. In a recent study, Zhaoyue et al. [134] applied machine learning (ML) techniques to classify renal cell carcinoma (RCC) subtypes using autophagy proteins. The expression data of the key autophagy proteins in RCC were measured from immunohistochemical images. The data were then normalized with the mean and standard deviation. The K-Nearest Neighbor (KNN) algorithm was applied to the normalized data for classification. Their study identified the basal level of autophagy as a potential measurement for discrimination of RCC. In an earlier work by Janos and co-researchers [135], an image analysis pipeline was developed using the support vector machine for the determination of novel selective pharmacological inducers of autophagy in human cancer cell lines. A variety of software incorporating a broad range of ML algorithms has been developed recently. For example, Serrano et al. [136] used the software Scikit-learn [137] to study the effect of mRNA alterations of some autophagic genes, one proapoptotic gene and one antiapoptotic gene in HIV-infected patients effectively treated with combined antiretroviral therapy.
In the past two decades, the pharmacological modulation of autophagy has gathered a great deal of attention. The process of autophagy is manipulated by various autophagy modulators. ML methods can be used to study the mechanisms of action of these autophagy modulators to gain knowledge on factors that include side effects, drug repurposing and the development of novel polypharmacological strategies [138]. AI approaches are powerful tools that associate important molecular changes with an observed phenomenon. However, these approaches remain silent on the underlying mechanism for such observations. To capture the possible mechanism, we need to take help from differential equation-based models.
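The two steps described above, normalization with the mean and standard deviation followed by KNN classification, can be sketched in a few lines. This is a minimal illustration under stated assumptions and not the pipeline of the cited study: the feature values, the two features themselves and the labels `subtype_A` and `subtype_B` are hypothetical placeholders, and the classifier is a plain majority vote over Euclidean nearest neighbors.

```python
import math
from collections import Counter


def fit_zscore(rows):
    """Compute the per-feature mean and (population) standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds


def apply_zscore(row, means, stds):
    """Scale one sample using previously fitted means and deviations."""
    return [(x - m) / s for x, m, s in zip(row, means, stds)]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training samples closest to the query."""
    by_distance = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical expression scores for two proteins (not real measurements).
train_raw = [[1.0, 5.0], [1.2, 4.8], [0.9, 5.2],
             [3.0, 1.0], [3.2, 1.1], [2.9, 0.8]]
train_labels = ["subtype_A"] * 3 + ["subtype_B"] * 3

means, stds = fit_zscore(train_raw)
train_scaled = [apply_zscore(r, means, stds) for r in train_raw]

# New samples are scaled with the statistics fitted on the training data.
pred_a = knn_predict(train_scaled, train_labels,
                     apply_zscore([1.1, 4.9], means, stds))
pred_b = knn_predict(train_scaled, train_labels,
                     apply_zscore([3.1, 0.9], means, stds))
print(pred_a, pred_b)
```

Scaling before computing distances matters for KNN because a feature with a larger numeric range would otherwise dominate the Euclidean distance.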
Databases with the information related to autophagy
Biological databases play a central role in systems biology studies. They offer the opportunity to access a wide variety of biologically relevant data, which include PPI information, disease–protein association information, microarray and next-generation sequencing data, protein localization, post-translational modification, the structural details of a protein or compound and pathways associated with proteins, etc. However, databases containing exclusively autophagic information are very few. In Table 4, we have listed the 11 most used databases in autophagy. These databases contain various information like disease associations, pathways, the specific effect on autophagy, etc. In Figure 3, we have compared the features of these databases.
Table 4. Some of the most used databases in autophagy. The features of these databases are shown in Figure 3.
S. no. . | Name . | Full form . | URL . | Ref. . |
---|---|---|---|---|
1 | HAMDb | Human Autophagy Modulator Database | http://hamdb.scbdd.com | [180] |
2 | ARN | Autophagy Regulatory Network | http://autophagyregulation.org/ | [130] |
3 | Autophagy database | Autophagy database | http://www.tanpaku.org/autophagy | [129] |
4 | ncRDeathDB | The noncoding RNA (ncRNA)-associated cell death database | http://www.rna-society.org/ncrdeathdb | [181] |
5 | ACDB | Autophagic compound database | http://www.acdbliulab.com | [182] |
6 | THANATOS | Autophagy, Necrosis, Apoptosis OrchestratorS database | http://thanatos.biocuckoo.org | [183] |
7 | HADb | Human Autophagy Database | http://www.autophagy.lu/ | – |
8 | AutophagySMDB | Autophagy Small Molecule Database | http://www.autophagysmdb.org | [184] |
9 | ATD | Autophagy to disease | http://auto2disease.nwsuaflmz.com | [185] |
10 | iLIR | In silico identification of functional LC3 interacting region motifs database | https://ilir.warwick.ac.uk | [186] |
11 | ATdb | Autophagy and Tumor Database | http://www.bigzju.com/ATdb/#/ | [187] |

Figure 3. Comparison of some of the well-explored autophagy databases in the literature. The columns contain the names of the databases, and the features are placed in the rows. The orange color means that the particular feature is present in the database, and the blue color means it is absent. Here, the agent feature includes drugs, chemicals and small molecules. Only autophagy-related proteins are considered for the ARN database. (##autophagy-related genes extracted using text mining, #human autophagy-related proteins).
Data repositories like GEO [133] and ArrayExpress [139] can also be used for mining information such as microarray, next-generation sequencing and other forms of high-throughput functional genomic data. Currently, 510 and 181 studies on autophagy are available in GEO and ArrayExpress, respectively. Information regarding biological pathways and diseases can be derived from databases such as the Kyoto Encyclopedia of Genes and Genomes (KEGG) [140] and Reactome [141].
One single process and various computational approaches: which door to choose?
Mathematical modeling and network analysis approaches can grasp the underlying dynamics and topology of any biological system. We have summarized the applications of mathematical and computational biology tools to study autophagy under different environmental conditions (Figure 4). Nevertheless, the complexity and the choice of the approach can vary from system to system, depending on the perspective of the study. A mathematical model, as we have already stated, unveils the fundamental nature of a biological phenomenon. From initiation to degradation, the process of autophagy comes under the influence of many proteins and stresses. Taking a few or all of them together, a mathematical model helps to understand how the dynamics of these sets of proteins influence the progression of autophagy by taking a deterministic approach. These models can predict cellular fate through autophagy by using a suitable set of parameters and a core set of autophagy modulators. They can also be used to study the randomness in the process of autophagy that arises from the variability of the stress and frequent changes in the cell’s energy requirements. ABM can range from continuous to discrete based on the requirement. Petri nets support both qualitative and quantitative models and hence can be used to model the involvement of autophagy in cellular biochemical reactions.

Figure 4. Application of mathematical and computational biology tools to study autophagy in different environmental conditions. Abbreviations: DEBM, differential equation-based mathematical models; ABM, agent-based model; PN, Petri net; NA, network analysis and ML, machine learning.
On the other hand, network theory can be used to identify crucial autophagy-related proteins responsible for the progression of diseases. Different sets of targets will be obtained for the same disease depending on the method applied, and these targets will therefore require biological validation. For example, if the intention is to select only the most connected proteins, the proper method is to measure the degree centrality. But if the goal is to find the proteins that can disperse information very effectively, the closeness centrality would be the best approach to consider. In contrast to the analysis of the topology of the system by network analysis, enrichment analysis focuses on extracting the pathways, localization and functions of the proteins present in the disease network. These pathways can then be studied further by constructing an autophagy-specific PPI network to detect influential proteins in that pathway.
To conclude this section, the choice of a computational approach depends on the perspective of the study. Although modeling methods always possess limitations in terms of the realistic portrayal of the biological phenomenon, they can still postulate conditions, parameters or factors that are necessary to govern the system.
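The difference between the two centrality measures mentioned above can be seen on a small example. The sketch below assumes a hypothetical, connected, unweighted toy network (all node names are invented and do not refer to real proteins). Degree centrality is the fraction of other nodes a node is directly linked to, and closeness centrality is the reciprocal of the average shortest-path distance to all other nodes, computed here with breadth-first search.

```python
from collections import defaultdict, deque


def build_adjacency(edges):
    """Return an undirected adjacency map {node: set(neighbors)}."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(adj)
    return {u: len(adj[u]) / (n - 1) for u in adj}


def closeness_centrality(adj):
    """(reachable nodes) / (sum of shortest-path distances), per node.

    Assumes a connected, unweighted network, so breadth-first search
    yields the shortest-path distances.
    """
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


# Hypothetical network: two star-shaped clusters linked through one bridge node.
edges = [("hubA", "a1"), ("hubA", "a2"), ("hubA", "a3"),
         ("hubB", "b1"), ("hubB", "b2"), ("hubB", "b3"),
         ("hubA", "bridge"), ("bridge", "hubB")]
adj = build_adjacency(edges)

deg = degree_centrality(adj)
clo = closeness_centrality(adj)

# Iterate over sorted names so that ties are broken reproducibly.
top_degree = max(sorted(deg), key=deg.get)
top_closeness = max(sorted(clo), key=clo.get)
print("highest degree:", top_degree)
print("highest closeness:", top_closeness)
```

Here the most connected node and the node closest to all others are different: the bridge node has only two links but sits between the two clusters, which illustrates why the target list depends on the measure chosen.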
Future direction
The number of papers published on autophagy is growing year after year, showing its increasing importance. But the ratio of computational studies to experimental studies is meager. For a better insight into the dynamics of the autophagy pathways, this gap needs to be narrowed. Computational methods are rapidly evolving. Although the literature has reported the application of various computational methods in systems biology, many methods are yet to be applied to the autophagy domain, like models based on PDE and image-based modeling (IBM).
PDE-based modeling is well established in many biological fields. Crucial insights into biological systems can be obtained by the implementation of PDE, as they are used to describe the spatiotemporal evolution of biological systems. PDE, unlike ODE, investigate spatial patterns of inherently heterogeneous systems with regionally varying fields. A literature study has revealed the application of these models in various fields, including bone remodeling [142], dengue [143] and cancer [144]. Studies combining PDE with ML approaches have also opened a new window for disease treatment [145,146]. The same goes for IBM, which relies on interpreting images as quantitative measurements. It acts as a bridge between systematic quantitative image data collection and spatiotemporal systems modeling. The first step in IBM is to collect images and to quantify objects, shapes, intensities or motion trajectories [147]. This step is followed by model formulation. The domain of the models can be divided into four categories, viz. discrete/stochastic, discrete/deterministic, continuous/stochastic and continuous/deterministic [148]. Next comes a simulation phase. Different types of models require different forms of simulation, which can be either custom made or based on predefined methods/software packages [149]. The last phase of IBM is parameter estimation and validation, which checks whether the output of the formulated model correlates with the desired output. Many studies have been done on various diseases with IBM [150–152].
The process of autophagy has been a prominent focus of research, as it is still a puzzle with various missing pieces due to its complex mechanism in numerous biological processes and diseases. AI-based approaches could be the ideal tools to locate these missing pieces. Such methods can be used to capture the association of autophagy with various other diseases to gain a more comprehensive insight. For example, the complex role of autophagy in cancer is yet to be fully elucidated. Integrating ML into the study of autophagy in cancer can provide detailed insight into the process as well as into cancer prognosis and personalized medicine approaches. ML methods can also help to identify autophagy-related genes and pathways that have not been addressed before.
From a network analysis perspective, temporal analysis of genes has been done in diseases like obesity [100], but none has been reported in autophagy. Such a study will help to decipher the change in behavior of ATG with respect to time and identify potential drug targets. A structure-based study can be done to determine the possible binding sites of the targets. The identification of binding sites of a protein can help in the rational design of therapeutic agents [153,154]. This study can also be facilitated by the mathematical modeling of the identified targets. Such a pipeline-based study in autophagy is lacking and will surely help to provide fruitful insights into the autophagy process.
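To illustrate what a PDE adds over an ODE, the following sketch simulates a generic one-dimensional reaction-diffusion equation, du/dt = D d²u/dx² − k·u, with an explicit finite-difference scheme. It is a toy example and not a published autophagy model: the quantity `u`, the parameter values and the grid are all assumptions chosen for illustration. The corresponding ODE (du/dt = −k·u) would track only the total amount, whereas the PDE also resolves how the quantity spreads in space.

```python
def step_reaction_diffusion(u, D, k, dx, dt):
    """One explicit Euler step of du/dt = D * d2u/dx2 - k * u.

    Zero-flux boundaries are imposed by copying the edge value into a
    ghost cell, so diffusion itself never adds or removes material.
    """
    n = len(u)
    new = [0.0] * n
    for i in range(n):
        left = u[i - 1] if i > 0 else u[i]
        right = u[i + 1] if i < n - 1 else u[i]
        laplacian = (left - 2.0 * u[i] + right) / dx ** 2
        new[i] = u[i] + dt * (D * laplacian - k * u[i])
    return new


# Assumed illustrative parameters; D * dt / dx**2 = 0.2 keeps the scheme stable
# (the explicit scheme requires this ratio to be at most 0.5).
D, k, dx, dt = 1.0, 0.1, 1.0, 0.2
steps = 50

# Initial condition: all material concentrated in the central cell.
u = [0.0] * 21
u[10] = 1.0

for _ in range(steps):
    u = step_reaction_diffusion(u, D, k, dx, dt)

total = sum(u)
print("peak value after simulation:", round(max(u), 4))
print("total amount remaining:", round(total, 4))
```

With these boundary conditions the total amount decays by exactly the factor (1 − k·dt) per step, which is what the space-free ODE discretized in the same way would predict, while the spatial profile broadens from a single peak into a bell-shaped curve.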
Conclusion
Autophagy is a quintessential biological process that breaks down unwanted cellular constituents and thus plays a crucial role in maintaining cellular homeostasis. Due to its immense importance in the modulation of cellular fate, the process of autophagy remains at the crossroads of various cellular processes and signaling pathways. The tracking of signals that modulate autophagy, and of genes that have a role in the autophagy process, has encouraged detection and control of the autophagy pathway, and any hindrance to either of them may lead to various diseases. In the last decade, there has been notable progress in understanding the role of autophagy in different processes, but the underlying mechanism that leads to the observed phenomenon is still far from being captured. This is because autophagy is a very complex process and exhibits different behaviors depending upon the situation. To understand the underlying complexity, comprehensive knowledge of autophagy is important, necessitating a joint effort of experimental and theoretical biologists.
Decades of research on autophagy have resulted in a considerable accumulation of experimental data. Comprehensive information on the core set of proteins that govern the process of autophagy and their underlying dynamics can be obtained by applying systems biology approaches to these data. In this review, we attempted to elucidate how these approaches are being used for the study of the autophagy process, both from modeling and network perspectives. For the benefit of the readers, we have provided multiple examples of different types of mathematical models and computational tools that are applied to understand the autophagy process. We have shown that mathematical modeling is a way to illustrate the underlying complex dynamics of the autophagic process emanating from the interaction of the individual components of the system. However, modeling approaches possess various limitations like complexity, parameter values and lack of experimental data for model validation. We have also shown that to get a global perspective of the process, we need to take the help of systems biology tools such as network analysis that will decipher the crosstalk between autophagy and various diseases. Despite various systems biology tools, the retrieval of valuable information from omics data is still a challenging task in computational biology, and AI-associated research has exhibited an unprecedented performance in doing so. These approaches can detect and predict the association of autophagy genes with various diseases, leading to a more detailed understanding of the disease. By the identification of crucial autophagy-related proteins, AI can provide researchers with a more holistic insight into the complex autophagy process. We have discussed the different environmental conditions in autophagy and illustrated that the choice of analysis depends upon the question that a researcher wants to address.
Our present review is aimed at reaching a broader audience and supporting researchers with and without prior expertise on mathematical and computational tools. We have, therefore, provided some essential tools, plugins and software that are being used by researchers to study and visualize biological systems. The hope is that, by such means, this review will help researchers working in the field of autophagy to analyze their generated data in a better way. We have also drawn attention to some of the possible and helpful applications of computational methods that are yet to be integrated to study autophagy. In other words, our study limits autophagy with computational tasks that have been done before and can be performed in the near future, and thus serves as the footsteps of autophagy in computational biology.
Autophagy is a quintessential evolutionarily conserved process and plays a crucial role in maintaining cellular homeostasis through the degradation of unwanted cellular materials.
Computational biology approaches like mathematical modeling and network analysis have been applied to delineate the underlying dynamics of complex autophagy mechanisms.
These approaches are bound up with constraints such as complexity, realistic description of the biological system, etc., and thus the method to be implemented should be chosen very carefully.
Although there is a plethora of PPI databases available in the literature, databases containing exclusively autophagic information are very few.
With the rapid development of the field, current challenges lie in how to choose the appropriate computational method that will address the problem one is studying.
Mathematical modeling enhances the mechanistic understanding of a system but works better in parts, while a network studies the changes associated with the whole system. For a better understanding of the process of autophagy, it is of utmost importance to link these two approaches and study the pipeline connecting them.
Acknowledgement
The authors thank the anonymous reviewers for their useful suggestions and comments that have improved the content and presentation of the manuscript.
Funding
The research work is supported by the Department of Biotechnology (Government of India), ref. no. BT/PR15426/BRB/10/1459/2015.
Conflicts of interest
No potential conflicts of interest were disclosed.
Dipanka Tanu Sarmah is a senior research scholar at the complex analysis group, Translational Health Science and Technology Institute, Faridabad, India, with a mathematics background. His current research focuses on the delineation of underlying principles of complex biological systems with the aid of the mathematical modeling and protein–protein interaction network analysis.
Nandadulal Bairagi is a professor at the Centre for Mathematical Biology and Ecology, Department of Mathematics, Jadavpur University, Kolkata, India. His current research area includes Ecology, Epidemiology, Physiology, Systems Biology and Nonlinear Dynamics.
Samrat Chatterjee is an associate professor at the complex analysis group, Translational Health Science and Technology Institute, Faridabad, India. His current research area is mathematical biology, including population dynamics and cellular behavior. The theme of his research is to understand the mechanism involved in a biological process and how it is perturbed under disease conditions using mathematical and computational tools.