Kavita Sharma, Aastha Mishra, Himanshu N Singh, Deepak Parashar, Perwez Alam, Tashi Thinlas, Ghulam Mohammad, Ritushree Kukreti, Mansoor Ali Syed, M A Qadar Pasha, High-altitude pulmonary edema is aggravated by risk loci and associated transcription factors in HIF-prolyl hydroxylases, Human Molecular Genetics, Volume 30, Issue 18, 15 September 2021, Pages 1734–1749, https://doi.org/10.1093/hmg/ddab139
Abstract
High-altitude (HA, >2500 m) hypoxic exposure evokes several physiological processes that may be abetted by differential genetic distribution in sojourners, who are susceptible to various HA disorders, such as high-altitude pulmonary edema (HAPE). The genetic variants in hypoxia-sensing genes influence the transcriptional output; however, their functional role has not been investigated in HAPE. This study explored the two hypoxia-sensing genes, prolyl hydroxylase domain protein 2 (EGLN1) and factor inhibiting HIF-1α (HIF1AN), in HA adaptation and maladaptation in three well-characterized groups: highland natives, HAPE-free controls and HAPE-patients. The two genes were sequenced and subsequently validated through genotyping of significant single nucleotide polymorphisms (SNPs), haplotyping and multifactor dimensionality reduction. Three EGLN1 SNPs, rs1538664, rs479200 and rs480902, and their haplotypes emerged as significant in HAPE. Blood gene expression and protein levels also differed significantly (P < 0.05) and correlated with clinical parameters and respective alleles. The RegulomeDB annotation of the loci corroborated a regulatory role. Allele-specific differential expression was evidenced by luciferase assay followed by electrophoretic mobility shift assay, liquid chromatography with tandem mass spectrometry and supershift assays, which confirmed allele-specific transcription factor (TF) binding of FUS RNA-binding protein (FUS) with rs1538664A, Rho GDP dissociation inhibitor 1 (ARHGDIA) with rs479200T and hypoxia upregulated protein 1 (HYOU1) with rs480902C. Docking simulation studies were consistent with the DNA-TF structural variations. There was strong networking among the TFs that revealed physiological consequences through relevant pathways. The two hydroxylases appear crucial in the regulation of hypoxia-inducible responses.
Background
Mountains have attracted humans for millennia, despite the adverse effects of the hypobaric hypoxic environment (1–3). Sojourners suffer from various mountain disorders, and among these, high-altitude pulmonary edema (HAPE) has attracted great attention (4,5). Over time, humans have learned the clinical, physiological and genetic implications for survival under that extreme environment (6,7). The last decade has focused on hypoxia-associated signaling pathways. Several loci have been found to associate with high-altitude (HA) adaptation or maladaptation (6,8–10). Under hypoxia, hypoxia-inducible factor-1α (HIF-1α) is a master regulator that plays crucial roles in cellular responses; however, equally crucial in the regulation of HIF-1α signaling are the two oxygen sensors, prolyl hydroxylase domain protein 2 (EGLN1) and factor inhibiting HIF-1α (HIF1AN) (11,12). The hydroxylation of specific prolyl residues, Pro-402 and Pro-564, of HIF-1α by EGLN1 generates a binding site for the pVHL tumor suppressor protein that leads to polyubiquitination and destruction of HIF-1α under the normoxic condition. On the other hand, HIF1AN hydroxylates a specific asparagine residue, Asn-803, of HIF-1α, which disrupts the interaction between HIF and its P300/CBP coactivators, resulting in transcriptional inactivation. Genetic variations in the EGLN1 and HIF1AN genes can govern the development of complex phenotypes in response to hypoxia (6,7,10). In addition to the genetic loci, however, the transcription factors (TFs) contribute to the physiological responses, although it remains to be seen how and how much each contributes (13,14). In our prior work, EGLN1 polymorphisms emerged as significant, and we could predict a few TFs at these loci (8).
Most of the genome is regulatory, and TFs bind preferentially at a given locus; a change of allele/variant may displace that TF and, in the process, attract another TF (15,16). It is here that a vital role is hypothesized for TFs in the regulation of a physiological function and thus in diseases. Altered TF binding results in differential gene expression that brings out phenotypic differences in associated diseases (13,14). The Encyclopedia of DNA elements (ENCODE) project consortium has mapped active transcription sites in the human genome, which has helped in identifying their functions (ENCODE 2012). Based on these understandings, we hypothesize that the two hydroxylase genes, EGLN1 and HIF1AN, may regulate the physiological processes through their locus-specific TFs. We note a substantial gap in the existing reports on the variants of these two genes regarding how the variants contribute to disease susceptibility through transcriptional regulation (17–19).
This study was hence aimed at investigating the effect of genetic alterations in EGLN1 and HIF1AN in the three well-defined study groups and to functionally characterize the impact of these loci on the regulation of expression. Based on the genotyping data of several single nucleotide polymorphisms (SNPs), we narrowed down to seven EGLN1 and three HIF1AN SNPs. Further, based on the investigation of functionality of these variants, the three EGLN1 regulatory SNPs emerged relevant in the regulation of both genes and proteins. Through numerous experimental assays such as luciferase, electrophoretic mobility shift assay (EMSA), liquid chromatography with tandem mass spectrometry (LC–MS/MS) and super shift, we evidenced allele-specific differential binding of TFs and thereby differential regulatory effects on EGLN1 expression. The in silico analyses, especially annotation and docking simulation studies defined the structural variations at and around each locus and its plausible impact on TF binding. Thus, our study underlines a functional association between EGLN1 genetic variation and expression under hypobaric hypoxic environment.

Figure 1. Clinical characteristics associated with the three study groups. (A) Chest X-ray of a HAPE-p immediately upon admission and after recovery on the third day. (B) A color Doppler showing the tricuspid regurgitation jet and a pulsed Doppler measurement of the jet showing pulmonary arterial hypertension, with a right ventricular systolic pressure of 56.5 mmHg in a patient. (C) Levels of SaO2, %. (D) Levels of MAP, mmHg. (E) PASP, mmHg. SaO2 levels were lower in the HAPE-p compared with the HAPE-f and HLs (P < 0.0001). HAPE-p had higher MAP and PASP values compared with the HAPE-f (P = 0.003 and <0.0001, respectively) and HLs (P ≤ 0.0001 and <0.0001, respectively). Each box plot in C, D and E shows the minimum, 25th percentile, median, 75th percentile and maximum values.
Results
Clinical profiling reveals disease susceptibility pattern in a hypobaric hypoxic environment
Chest imaging of HAPE-patients (HAPE-p) showed patchy infiltration on the day of admission, which improved significantly upon treatment (Fig. 1A i, ii). Doppler echocardiography (Fig. 1B i, ii) confirmed the elevated pulmonary arterial pressure compared with HAPE-free sojourners (HAPE-f) (50.1 ± 9.0 versus 29.2 ± 3.77 mm Hg). Pulmonary arterial systolic pressure (PASP), systolic blood pressure (SBP) and arterial oxygen saturation (SaO2) differed significantly in HAPE-p compared with HAPE-f (P < 0.05; Fig. 1C–E). Compared with the HAPE-f, HAPE-p showed an average elevation of 3 mm Hg in mean arterial pressure (MAP), ~20% in pulse rate (PR) and 20 mm Hg in PASP, and a decrease of >30% in SaO2 (Supplementary Material, Table S1). It is apparent from these clinical characteristics that in susceptible subjects the vascular system was stressed, contributing to vasoconstriction and the disorder.
The two hydroxylases, EGLN1 and HIF1AN, are upregulated in HAPE
EGLN1 and HIF1AN expression was significantly higher in HAPE-p compared with HAPE-f and highlanders (HLs) (P < 0.05; Fig. 2A and B). The mRNA expression of EGLN1 and HIF1AN was elevated by ~4.5- and 1.5-fold, respectively, in HAPE-p compared with HAPE-f. The plasma levels of EGLN1 and HIF1AN further complemented the mRNA findings and differed significantly among the three groups (P < 0.0001; Fig. 2C and D), the levels being highest in HAPE-p. Interestingly, the levels of both biomarkers were higher in HLs compared with HAPE-f (P = 4.0E−5; Fig. 2C and D), suggesting a physiological adaptive advantage in this group.
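This section does not state how the fold changes were derived from the real-time PCR data. As a hedged illustration only, the sketch below uses the widely used 2^−ΔΔCt convention, which is an assumption here and not confirmed as the authors' method. The function name `fold_change` and all Ct values are hypothetical.

```python
def fold_change(ct_target_case, ct_ref_case, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression by the 2^-ddCt convention (assumed, not from the paper).

    Each Ct is a threshold cycle; 'ref' is a housekeeping gene used for
    normalisation, 'case' and 'ctrl' are the two groups being compared.
    """
    delta_case = ct_target_case - ct_ref_case   # normalised Ct in the case group
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl   # normalised Ct in the control group
    delta_delta = delta_case - delta_ctrl       # difference between the groups
    return 2.0 ** -delta_delta

# Hypothetical Ct values, chosen only to make the arithmetic easy to follow.
fc = fold_change(ct_target_case=24.0, ct_ref_case=18.0,
                 ct_target_ctrl=26.0, ct_ref_ctrl=18.0)
print(f"fold change = {fc:.2f}")
```

Under this convention, a fold change of about 4.5 would correspond to a ΔΔCt of roughly −2.17, and a fold change of 1.5 to roughly −0.58.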
Potential protection and risk loci are revealed in EGLN1 and HIF1AN
The Sequenom mass spectrometry-based genotype assay showed a significant difference between HAPE-p and HAPE-f in seven EGLN1 SNPs, rs1538664G/A, rs479200C/T, rs2486729G/A, rs2790879T/G, rs480902T/C, rs2486736G/A and rs973252A/G, after adjustment for age, gender and body mass index (BMI) (P between 0.049 and 2.02E−5; Table 1). Multivariate logistic regression analysis revealed a significantly higher risk of HAPE for the seven minor alleles rs1538664A, rs479200T, rs2486729A, rs2790879G, rs480902C, rs2486736A and rs973252G, the adjusted risk-odds being above 1.0; as a consequence, these alleles were recognized as risk alleles in HAPE-p (P < 0.05). The HLs, in contrast, revealed a significant selection of the major alleles rs1538664G, rs479200C, rs2486729G, rs2790879T, rs480902T, rs2486736G and rs973252A against two healthy control populations, HAPE-f and HapMap-CHB (P between 4.60E−4 and 7.63E−33; Table 1). Of note, the HapMap-CHB data were retrieved from the NCBI database. A multivariant analysis with the multifactor dimensionality reduction (MDR) software, version 1.2.2, highlighted that the variants of the seven EGLN1 SNPs rs1538664G/A, rs479200C/T, rs2486729G/A, rs2790879T/G, rs480902T/C, rs2486736G/A and rs973252A/G were in coherence (P < 0.05; Supplementary Material, Fig. S1A a–f, B a–f and C a–f). Overall, rs1538664G/A emerged as the most widely distributed SNP, appearing in almost all the MDR models irrespective of the group pairing; it was followed by rs479200 (P < 0.05). SNP rs480902 made the strongest presence in all such MDR interactions, especially in the controls (P < 1.0E−15; Supplementary Material, Fig. S1B and C). Forest plot analysis further supported these MDR findings: with each additional risk allele, the risk-odds [odds ratio (OR)] increased (Supplementary Material, Fig. S1D a–c) for HAPE-p versus HAPE-f, HAPE-p versus HLs and HAPE-f versus HLs.
In HIF1AN, SNPs rs11190604A/G, rs10883512A/G and rs147628176A/G differed significantly between HAPE-p and HAPE-f (P < 0.05; Supplementary Material, Table S2). Multivariate logistic regression analysis revealed a significantly increased risk of HAPE in the presence of the minor alleles rs11190604A, rs10883512A and rs147628176A, the adjusted risk-odds being above 1.0 (P < 0.01); as a consequence, these alleles were recognized as risk alleles in HAPE-p (P < 0.05). The alleles rs11190604G and rs10883512G, in contrast, were overrepresented in HAPE-f and HLs (P < 0.05) and hence were recognized as protective alleles.
HAPE-associated SNPs reveal a causal role
Risk alleles contribute to the higher EGLN1 level
The regression coefficient analysis revealed a significant association between the EGLN1 risk alleles rs1538664A, rs479200T and rs480902C and increased EGLN1 protein levels in the study groups HAPE-f, HAPE-p and HLs (P < 0.009, 0.001 and 0.02, respectively; Supplementary Material, Table S3). In contrast, the three HIF1AN SNPs associated poorly with HIF1AN levels (P > 0.05; Supplementary Material, Table S4).
Correlations within clinical parameters and between EGLN1 and HIF1AN levels are apparent
The correlations among the various clinical, physiological and biochemical characteristics were highly relevant. We first evaluated the four parameters BMI, MAP, SaO2 and PR individually against age, because age is a crucial parameter in defining health; this provided interesting insights (Fig. 3A i–iii). The three groups showed an elevation in BMI and MAP with age (P = 0.045 to 1.57E−25 and 0.006 to 5.62E−7, respectively), whereas SaO2 correlated inversely with age in the two sojourner groups, i.e. HAPE-f and HAPE-p (P ≤ 7.82E−6). In HLs alone, SaO2 did not correlate significantly with age (P = 0.17), perhaps a sign of adaptation. PR seemed unaffected by age (P > 0.05), although a decline did appear with age. In another correlation analysis, BMI was compared against MAP and SaO2 (Fig. 3B i–iii). MAP was positively correlated with BMI in both HAPE-p and HAPE-f (P < 0.05; Fig. 3B i, ii), whereas SaO2 was negatively correlated with BMI in HAPE-f (P < 0.05; Fig. 3B i). The HLs again showed adaptability with respect to these parameters (P > 0.05). These analyses suggested that the characteristics, individually and in correlation, play a significant role in the susceptibility to HAPE in sojourners and adaptability in HLs (5,14). The protein levels of HIF1AN and EGLN1 were correlated by Pearson’s correlation (r) in the three populations (Fig. 3C i–iii). A significant positive correlation existed between the two levels in the two healthy groups (P < 0.05; Fig. 3C i, iii).
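For readers less familiar with the statistic, Pearson's r between two protein levels can be sketched as follows. This is a minimal illustration with made-up values; the helper name `pearson_r` and the sample numbers are hypothetical and are not data from the study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sample has no defined correlation")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical plasma levels (pg/ml) for five subjects; illustrative only.
egln1_levels = [10.0, 12.0, 15.0, 18.0, 20.0]
hif1an_levels = [5.0, 6.5, 7.0, 9.5, 10.0]

r = pearson_r(egln1_levels, hif1an_levels)
print(f"r = {r:.3f}")
```

A value of r near +1 indicates that the two levels rise together, as reported for the two healthy groups; the significance test on r is a separate step not shown here.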
Functional validation of the significantly associated SNPs
Here, we highlight the outcome of the several experiments that were performed to evaluate the SNPs for functionality and contribution to physiological behavior (Fig. 4A–D).
In silico annotation exercise corroborates regulatory loci
Functional annotation of the seven EGLN1 SNPs and three HIF1AN SNPs was performed with RegulomeDB (Supplementary Material, Fig. S2A and Supplementary Material, Table S5). It identified four EGLN1 SNPs, rs1538664, rs479200, rs480902 and rs2486736, that affected the binding of co-factors and thus the gene expression, whereas the tool Variant Effect Predictor estimated three SNPs, rs1538664, rs479200 and rs480902, to be EGLN1 regulators. Based on the output of these two tools, the three SNPs rs1538664, rs479200 and rs480902 appeared crucial in EGLN1 regulation, specifically affecting the DNA binding of TFs, and hence were selected for further analyses.
Screening of the alleles in cell lines defines the regulatory role
We evaluated the three EGLN1 SNPs rs1538664, rs479200 and rs480902 for dual-luciferase activity in human umbilical vein endothelial cell (HUVEC), human embryonic kidney (HEK293) and A549 cell lines (Fig. 4B i–iii). Supporting our in silico findings, these three EGLN1 SNPs emerged as significant, with allele-specific differential expression (P between 0.02 and 0.0017). The risk variants showed elevated expression compared with the wild-type (P < 0.05), which could be detrimental to HIF-1α activation.
EMSA highlights the preference of TFs by the variants in regulating EGLN1 expression
EMSA identified the binding and specificity of a TF on the TF-binding site (TFBS) of the three EGLN1 SNPs rs1538664, rs479200 and rs480902; a shift was apparent for the variants in the DNA–protein (TFs) interactions (Fig. 4C i–iii). The competitive assay with unlabeled oligonucleotides established the specificity with loss of intensity of each band (right-hand part Fig. 4C i–iii). Supplementary Material, Table S6 provides the unique proteins as obtained through LC–MS/MS after EMSA of the three EGLN1 SNPs.
Regulatory loci-associated TFs/proteins identified and confirmed by LC–MS/MS
LC–MS/MS identified the EMSA proteins/TFs with a cut-off protein score of >46 and P < 0.05 (Supplementary Material, Table S7A–C) and the Venn diagram circumscribed the proteins to either the risk or protective alleles or both (Fig. 4D). Furthermore, the extensive network analyses of these proteins by STRING v. 11 provided selective interactions and preferences for a given allelic site (Supplementary Material, Fig. S2A–F). Heat shock proteins (HSPs) were abundantly present in these six clusters suggesting the significance of HSP family under hypoxia; of relevance, the family proteins differed among the wild-type (wt) and variant-type (vt) alleles. Here, for SNP rs1538664, the HSP family A (Hsp70) member 8 (HSPA8) emerged as the most interacting protein followed by HSPA90AA. Next abundant family was of heterogeneous nuclear ribonucleoproteins (HNRNP); here, HNRNPD through Apoptosis Inhibitor 5 interacted with other proteins depending upon the alleles (Supplementary Material, Fig. S2A and B). Importantly, it interacted with FUS RNA binding protein (FUS) that extended its network with several HSPs, HNRNPs, Serpins (SERPs) and Glyceraldehyde 3-phosphate dehydrogenase. HSP8 seemed to bridge between the HSPs and HNRNP complexes and numerous other proteins. Of interest, these two families were sided by a few more pertinent proteins. Among the other proteins, Rho GDP dissociation inhibitor alpha (ARHGDIA), Hypoxia upregulated protein 1 (HYOU1), Far upstream element-binding protein 1 (FUBP1), Mitogen-activated protein kinase (MAPK) and several others were noteworthy.
The SNP rs479200 also networked differentially with several proteins (Supplementary Material, Table S7B and Supplementary Material, Fig. S2C and D). The vt allele rs479200T was inclined toward ARHGDIA; simultaneously, it interacted with both the families, i.e. HNRNPs and HSPs with HSP8 bridging in between; interestingly, however, the aldehyde dehydrogenase and chromodomain helicase DNA-binding family members diminished. Concerning rs480902 (Supplementary Material, Table S7C and Supplementary Material, Fig. S2E and F), the wt allele had varied interactions with several proteins, including the abundantly present families. In this network, HYOU1 protein was inclined toward the HSP family in the presence of the variant rs480902C (Supplementary Material, Fig. S2F) and through that to the HNRNP family, thereby widening the interactions. Overall, the intense networking interactions between the variants of the three EGLN1 SNPs and the TFs seemingly play a stronger role in the regulation of the gene under hypoxic condition.
Docking simulations reveal variants that recruit new TFs, thereby effecting gene regulation
The web servers coexpressMAP and Distant Regulatory Elements (DiRE) predicted the TFs based on the coexpression coefficient score. Specific TFs were identified at a cut-off detection efficiency of 80%, significance at <0.001 and false discovery rate (FDR) at <0.05. Molecular docking simulations assessed the geometric arrangements of interatomic contacts and identified the specific TFs with the lowest atomic contact energy (ACE, kcal/mol), varied intermolecular atomic-level hydrogen bond interactions and a coexpression constant of >0.60 for the protein–DNA complexes (Fig. 5A; the figure comprises 21 subfigures covering 6 alleles and 15 TF–allele interactions, hence its multi-level panel numbering). The analysis yielded the three best TFs for rs1538664 and nine TFs each for the remaining two SNPs, although for the latter only three to four TFs emerged as significant (Supplementary Material, Table S7). Overall, for TFBS-rs1538664, the identified TFs are SERP1, SERP2 and FUS; for TFBS-rs479200, the TFs are FUBP1, Growth arrest-specific protein 1 (GAS1) and ARHGDIA; for TFBS-rs480902, the TFs are General transcription factor IIIC subunit 1 (GTF3C1), MAPK7 and HYOU1.
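The selection logic described above (threshold filtering, then preference for the lowest ACE) can be sketched as below. This is an illustration under stated assumptions, not the authors' pipeline: the `Candidate` record, its field names, the treatment of the 80% detection-efficiency cut-off as "at least 0.80", and every numeric value are hypothetical, and the placeholder names TF_A to TF_D do not refer to real proteins.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    name: str
    detection_efficiency: float  # fraction in [0, 1]
    p_value: float
    fdr: float
    coexpression: float          # coexpression constant
    ace: float                   # atomic contact energy, kcal/mol

def passes_cutoffs(c: Candidate) -> bool:
    """Apply the cut-offs quoted in the text (inclusive 0.80 is an assumption)."""
    return (c.detection_efficiency >= 0.80
            and c.p_value < 0.001
            and c.fdr < 0.05
            and c.coexpression > 0.60)

def rank_candidates(candidates):
    """Keep candidates that pass every cut-off, lowest ACE first."""
    return sorted((c for c in candidates if passes_cutoffs(c)),
                  key=lambda c: c.ace)

# Hypothetical candidates; none of these numbers come from the study.
candidates = [
    Candidate("TF_A", 0.92, 0.0002, 0.01, 0.75, -310.5),
    Candidate("TF_B", 0.85, 0.0005, 0.03, 0.66, -420.1),
    Candidate("TF_C", 0.70, 0.0001, 0.01, 0.80, -500.0),  # fails detection efficiency
    Candidate("TF_D", 0.95, 0.0100, 0.04, 0.90, -450.0),  # fails significance
]

ranked = rank_candidates(candidates)
print([c.name for c in ranked])
```

Sorting by ACE only after filtering keeps a strongly binding but poorly supported candidate (TF_C here) from outranking well-supported ones.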

Figure 2. Gene expression and biochemical parameters of the three study groups. The relative expression levels of (A) EGLN1 and (B) HIF1AN were evaluated by real-time PCR and are expressed as fold change in the HAPE-p and HLs compared with the HAPE-f. The expression of EGLN1 and HIF1AN was elevated by ~4.5- and 1.5-fold, respectively, in HAPE-p compared with HAPE-f and HLs (n = 30; P < 0.05). (C) Plasma levels of EGLN1, pg/ml and (D) HIF1AN, pg/ml. The levels were higher in the HAPE-p compared with HAPE-f and HLs (n ≅ 550; P < 0.0001). Bars represent the mean ± SD.
Of note, it is apparent that the choice of a TF is allele-specific (Fig. 5A A–C). In the case of SNP rs1538664, the wt locus, or wtTFBS-rs1538664, carried the TF SERP1 (Fig. 5A A1 a–c); however, the risk locus, or vtTFBS-rs1538664, attracted the TF FUS1 in addition to SERP1, and the ACE values differed clearly (Fig. 5A A2 a–d). FUS1 regulates expression of specific target genes (20). For SNP rs479200, the wtTFBS-rs479200 locus preferred FUBP1, Thioredoxin (TXN) and GAS1 (Fig. 5A B1 a–d), and the vtTFBS-rs479200 locus preferred FUBP1, Homeobox containing 1 (HMBOX1) and ARHGDIA (Fig. 5A B2 a–d). Rho-kinase may downregulate NOS3, resulting in impaired endothelial responses because of decreased production of NO and endothelium-dependent hyperpolarization (21). Likewise, for SNP rs480902, only three TFs showed the highest preference for the TFBS-rs480902 locus. TF GTF3C1 was common to both wt and vt TFBS-rs480902 (Fig. 5A C1 a b, C2 a b), whereas, in addition, only MAPK7 was preferred by wtTFBS-rs480902 (Fig. 5A C1 c) and only HYOU1 was preferred by vtTFBS-rs480902 (Fig. 5A C2 c). HYOU1 is a member of the Hsp70 family. A cis-acting segment in the 5′ UTR of this gene results in the accumulation of this protein in the endoplasmic reticulum under hypoxic conditions (22). Subsequently, the tool coexpression MAP highlighted the best three TFs in each group with the maximum coexpression (Supplementary Material, Table S8). Overall, the variant alleles of the three SNPs interacted with TFs that may have a role in the manifestation of the pathophysiology of HA disorders, such as endothelial dysfunction. This evaluation underlined the relevance and causal role of the identified TFs in the pathological manifestations. It does not, however, diminish the importance of the other TFs, because each contributes to the overall physiological outcome (23).
Supershift assay confirms the allele-specific TFs
The supershift assay confirmed the specificity of the TFs: the respective antibodies bound the DNA–protein complexes FUS-rs1538664A, ARHGDIA-rs479200T and HYOU1-rs480902C in preference to those of the wild-type loci rs1538664G, rs479200C and rs480902T (Fig. 5B i–iii). ARHGDIA-rs479200T is represented with the uncropped autoradiograph image (Supplementary Material, Fig. S3). Figure 5C represents the fold change for each SNP as revealed by the dual-luciferase assay.
Table 1. Genotype and allele distribution of the EGLN1 SNPs rs1538664, rs479200, rs480902 and HIF1AN SNPs rs11190604, rs10883512 and rs147628176 in HAPE-p, HAPE-f and HLs
SNP, genotype/allele | HAPE-f, n (%) | HAPE-p, n (%) | OR (95% CI) | P value | HLs, n (%) | HapMap-CHB, n (%) | OR (95% CI) | P value |
---|---|---|---|---|---|---|---|---|
EGLN1 | | | | | | | | |
rs1538664 | | | | | | | | |
GG | 53 (25) | 32 (12) | 1 | | 245 (56) | 30 (22) | 1 | |
GA | 93 (44) | 146 (57) | 2.60 (1.56–4.33) | 0.00024 | 158 (36) | 68 (49) | 3.51 (2.18–5.64) | 2.0E−07 |
AA | 66 (31) | 81 (31) | 2.03 (1.18–3.51) | 0.011 | 33 (8.0) | 40 (29) | 9.89 (5.45–17.98) | 5.13E−14 |
G | 199 (47) | 210 (41) | 1 | | 648 (74) | 128 (46) | 1 | |
A | 225 (53) | 308 (59) | 1.29 (1.00–1.68) | 0.049 | 224 (26) | 148 (54) | 3.34 (2.52–4.43) | 3.85E−18 |
rs479200 | | | | | | | | |
CC | 33 (16) | 19 (8) | 1 | | 243 (56) | 8 (17) | 1 | |
CT | 96 (46) | 105 (42) | 1.95 (1.04–3.65) | 0.036 | 159 (36) | 23 (50) | 4.39 (1.91–10.06) | 0.00046 |
TT | 81 (38) | 126 (50) | 2.60 (1.38–4.91) | 0.003 | 34 (8.0) | 15 (33) | 13.40 (5.28–33.96) | 4.52E−08 |
C | 162 (39) | 143 (29) | 1 | | 645 (74) | 39 (42) | 1 | |
T | 258 (61) | 357 (71) | 1.49 (1.13–1.95) | 0.004 | 227 (26) | 53 (58) | 3.86 (2.48–5.99) | 1.81E−09 |
rs480902 | | | | | | | | |
TT | 43 (20) | 20 (8) | 1 | | 226 (52) | 24 (17) | 1 | |
TC | 90 (43) | 120 (48) | 3.15 (1.72–5.75) | 0.00018 | 171 (39) | 72 (52) | 3.96 (2.39–6.55) | 7.92E−08 |
CC | 77 (37) | 110 (44) | 3.08 (1.66–5.71) | 0.00034 | 39 (9.0) | 42 (31) | 10.14 (5.53–18.58) | 6.65E−14 |
T | 176 (42) | 160 (32) | 1 | | 623 (71) | 120 (43) | 1 | |
C | 244 (58) | 340 (68) | 1.45 (1.11–1.89) | 0.006 | 249 (29) | 156 (57) | 3.25 (2.45–4.30) | 1.40E−16 |
HIF1AN | | | | | | | | |
rs11190604 | | | | | | | | |
AA | 89 (44) | 94 (51) | 1 | | 141 (90) | 136 (81) | 1 | |
AG | 89 (44) | 80 (44) | 0.85 (0.56–1.29) | 0.450 | 68 (32) | 28 (12) | 0.427 (0.26–0.70) | 0.001 |
GG | 25 (12) | 9 (5) | 0.39 (0.19–0.81) | 0.007 | 7 (3) | 4 (2) | 0.592 (0.17–2.07) | 0.407 |
AG + GG | 114 (56) | 89 (49) | 0.74 (0.49–1.10) | 0.139 | 75 (35) | 32 (19) | 0.442 (0.27–0.71) | 0.001 |
A | 267 (66) | 268 (73) | 1 | | 350 (81) | 300 (89) | 1 | |
G | 139 (34) | 98 (27) | 0.70 (0.51–0.95) | 0.025 | 82 (19) | 36 (11) | 0.512 (0.33–0.78) | 0.001 |
rs10883512 | | | | | | | | |
AA | 106 (50) | 101 (56) | 1 | | 158 (72) | 134 (81) | 1 | |
AG | 79 (37) | 72 (40) | 0.95 (0.62–1.45) | 0.836 | 55 (25) | 28 (17) | 0.6 (0.36–1) | 0.048 |
GG | 26 (12) | 8 (4) | 0.32 (0.14–0.75) | 0.006 | 4 (2) | 4 (2) | 1.179 (0.29–4.8) | 0.818 |
AG + GG | 105 (50) | 80 (44) | 0.80 (0.53–1.19) | 0.271 | 59 (27) | 32 (19) | 0.64 (0.39–1.042) | 0.071 |
A | 291 (69) | 274 (82) | 1 | | 371 (86) | 296 (89) | 1 | |
G | 131 (31) | 88 (24) | 0.71 (0.52–0.97) | 0.036 | 63 (14) | 36 (11) | 0.716 (0.46–1.11) | 0.133 |
rs147628176 | | | | | | | | |
AA | 144 (67) | 145 (79) | 1 | | 169 (79) | — | 1 | |
AG | 69 (32) | 38 (21) | 0.54 (0.34–0.86) | 0.009 | 46 (21) | — | — | — |
GG | 0 | 0 | — | — | 0 | — | — | — |
AG + GG | 69 (32) | 38 (21) | 0.54 (0.34–0.86) | 0.009 | 46 (21) | — | — | — |
A | 357 (84) | 328 (90) | 1 | | 384 (89) | — | — | |
G | 69 (16) | 38 (10) | 0.59 (0.39–0.91) | 0.017 | 46 (11) | — | — | — |
P values were obtained after adjustment with age, gender and BMI by multivariate logistic regression analysis using SPSS 15.0 software. The genotype distribution and allele frequency were compared by chi square test. n, number; (%), percent distribution.
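As a worked check on how the allele rows relate to the genotype rows, the sketch below derives allele counts from the rs1538664 genotype counts listed in the table and computes a crude (unadjusted) odds ratio with a Woolf log-scale 95% confidence interval. The published ORs were adjusted for age, gender and BMI by logistic regression, so this is only an approximation of the reported values and not a reproduction of the authors' analysis. The function names are hypothetical.

```python
import math

def allele_counts(hom_ref, het, hom_alt):
    """Return (ref_allele_count, alt_allele_count) from genotype counts."""
    return 2 * hom_ref + het, het + 2 * hom_alt

def crude_odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref, z=1.96):
    """Unadjusted OR and Woolf (log-scale) CI from a 2x2 allele table."""
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se = math.sqrt(1 / case_alt + 1 / case_ref + 1 / ctrl_alt + 1 / ctrl_ref)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# rs1538664 genotype counts (GG, GA, AA) as listed in the table.
ctrl_g, ctrl_a = allele_counts(53, 93, 66)    # HAPE-f
case_g, case_a = allele_counts(32, 146, 81)   # HAPE-p

odds_ratio, lower, upper = crude_odds_ratio(case_a, case_g, ctrl_a, ctrl_g)
print(f"G/A counts: HAPE-f {ctrl_g}/{ctrl_a}, HAPE-p {case_g}/{case_a}")
print(f"crude OR for allele A = {odds_ratio:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

The derived allele counts match the G and A rows of the table (199/225 in HAPE-f, 210/308 in HAPE-p), and the crude estimate comes out close to the adjusted value reported for allele A, 1.29 (1.00–1.68).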
Genotype and allele distribution of the EGLN1 SNPS rs1538664, rs479200, rs480902 and HIF1AN SNPs rs11190604, rs10883512 and rs147628176 in HAPE-p, HAPE-f and HLs
SNPs . | Genotype distribution . | OR (95% CI) . | P value . | Genotype distribution . | OR (95% CI) . | P value . | ||
---|---|---|---|---|---|---|---|---|
. | . | . | . | . | . | . | . | . |
. | HAPE-f (%) . | HAPE-p (%) . | . | . | HLs (%) . | HapMap- CHB (%) . | . | . |
EGLN1 | ||||||||
rs1538664 | ||||||||
GG | 53 (25) | 32 (12) | 1 | 245 (56) | 30 (22) | 1 | ||
GA | 93 (44) | 146 (57) | 2.60 (1.56–4.33) | 0.00024 | 158 (36) | 68 (49) | 3.51 (2.18–5.64) | 2.0E−07 |
AA | 66 (31) | 81 (31) | 2.03 (1.18–3.51) | 0.011 | 33 (8.0) | 40 (29) | 9.89 (5.45–17.98) | 5.13E−14 |
G | 199 (47) | 210 (41) | 1 | 648 (74) | 128 (46) | 1 | ||
A | 225 (53) | 308 (59) | 1.29 (1.00–1.68) | 0.049 | 224 (26) | 148 (54) | 3.34 (2.52–4.43) | 3.85E−18 |
rs479200 | ||||||||
CC | 33 (16) | 19 (8) | 1 | 243 (56) | 8 (17) | 1 | ||
CT | 96 (46) | 105 (42) | 1.95 (1.04–3.65) | 0.036 | 159 (36) | 23 (50) | 4.39 (1.91–10.06) | 0.00046 |
TT | 81 (38) | 126 (50) | 2.60 (1.38–4.91) | 0.003 | 34 (8.0) | 15 (33) | 13.40 (5.28–33.96) | 4.52E−08 |
C | 162 (39) | 143 (29) | 1 | 645 (74) | 39 (42) | 1 | ||
T | 258 (61) | 357 (71) | 1.49 (1.13–1.95) | 0.004 | 227 (26) | 53 (58) | 3.86 (2.48–5.99) | 1.81E−09 |
rs480902 | ||||||||
TT | 43 (20) | 20 (8) | 1 | 226 (52) | 24 (17) | 1 | ||
TC | 90 (43) | 120 (48) | 3.15 (1.72–5.75) | 0.00018 | 171 (39) | 72 (52) | 3.96 (2.39–6.55) | 7.92E−08 |
CC | 77 (37) | 110 (44) | 3.08 (1.66–5.71) | 0.00034 | 39 (9.0) | 42 (31) | 10.14 (5.53–18.58) | 6.65E−14 |
T | 176 (42) | 160 (32) | 1 | 623 (71) | 120 (43) | 1 | ||
C | 244 (58) | 340 (68) | 1.45 (1.11–1.89) | 0.006 | 249 (29) | 156 (57) | 3.25 (2.45–4.30) | 1.40E−16 |
HIF1AN | ||||||||
rs11190604 | ||||||||
AA | 89 (44) | 94 (51) | 1 | 141 (90) | 136 (81) | 1 | ||
AG | 89 (44) | 80 (44) | 0.85 (0.56–1.29) | 0.450 | 68 (32) | 28 (12) | 0.427 (0.26–0.70) | 0.001 |
GG | 25 (12) | 9 (5) | 0.39 (0.19–0.81) | 0.007 | 7 (3) | 4 (2) | 0.592 (0.17–2.07) | 0.407 |
AG + GG | 114 (56) | 89 (49) | 0.74 (0.49–1.10) | 0.139 | 75 (35) | 32 (19) | 0.442 (0.27–0.71) | 0.001 |
A | 267 (66) | 268 (73) | 1 | 350 (81) | 300 (89) | 1 | ||
G | 139 (34) | 98 (27) | 0.70 (0.51–0.95) | 0.025 | 82 (19) | 36 (11) | 0.512 (0.33–0.78) | 0.001 |
rs10883512 | ||||||||
AA | 106 (50) | 101 (56) | 1 | 158 (72) | 134 (81) | 1 | ||
AG | 79 (37) | 72 (40) | 0.95 (0.62–1.45) | 0.836 | 55 (25) | 28 (17) | 0.6 (0.36–1) | 0.048 |
GG | 26 (12) | 8 (4) | 0.32 (0.14–0.75) | 0.006 | 4 (2) | 4 (2) | 1.179 (0.29–4.8) | 0.818 |
AG + GG | 105 (50) | 80 (44) | 0.80 (0.53–1.19) | 0.271 | 59 (27) | 32 (19) | 0.64 (0.39–1.042) | 0.071 |
A | 291 (69) | 274 (82) | 1 | 371 (86) | 296 (89) | 1 | ||
G | 131 (31) | 88 (24) | 0.71 (0.52–0.97) | 0.036 | 63 (14) | 36 (11) | 0.716 (0.46–1.11) | 0.133 |
rs147628176 | ||||||||
AA | 144 (67) | 145 (79) | 1 | 169 (79) | — | 1 | ||
AG | 69 (32) | 38 (21) | 0.54 (0.34–0.86) | 0.009 | 46 (21) | — | — | — |
GG | 0 | 0 | — | — | 0 | — | — | — |
AG + GG | 69 (32) | 38 (21) | 0.54 (0.34–0.86) | 0.009 | 46 (21) | — | — | — |
A | 357 (84) | 328 (90) | 1 | 384 (89) | — | — | ||
G | 69 (16) | 38 (10) | 0.59 (0.39–0.91) | 0.017 | 46 (11) | — | — | — |
P values were obtained after adjustment with age, gender and BMI by multivariate logistic regression analysis using SPSS 15.0 software. The genotype distribution and allele frequency were compared by chi square test. n, number; (%), percent distribution.

Scatter plots for the various correlation analyses in the three study groups. (1A-C) Each of the clinical parameters, BMI, MAP, SaO2 and PR, plotted versus age. (2A-C) MAP and SaO2 plotted versus BMI. (3A-C) Plasma levels of EGLN1 versus HIF1AN. The three study groups are: (i) HAPE-f, (ii) HAPE-p and (iii) HLs. Significance was maintained at P ≤ 0.05.

Functional validation of the EGLN1 regulatory SNPs rs1538664, rs479200 and rs480902. (A) Functional annotation of the SNPs on the basis of RegulomeDB and Variant Effect Predictor. (B) Normalised dual-luciferase activity of the EGLN1 SNPs rs1538664, rs479200 and rs480902 measured in the three cell lines HEK 293, A549 and HUVEC. The statistical significance of expression between the alleles was calculated. (C) EMSA of the SNPs using allele-specific oligonucleotide probes. Autoradiograph of the double-stranded allele-specific 32P-labeled oligonucleotides mixed with the nuclear extract of A549. The shifted extra bands in remaining lanes correspond to protein–DNA complexes. When an excess of unlabeled oligonucleotide is added to the reaction (competition), the labeled oligo is displaced from the complexes and the extra bands cannot be seen. The Supplement figure shows differential nuclear protein binding to alleles of rs1538664G/A, rs479200C/T and rs480902T/C, respectively. The results of EMSA demonstrated gel-shift bands for both allelic probes. (D) Venn diagram representing the interactions including cross-interactions; each subfigure represents the respective SNP/allele.

Validation of the three EGLN1 regulatory SNPs through Docking analyses and the Supershift assay. (A) Docking analysis. This figure represents 21 subfigures that include 6 alleles and 15 TFs–allele interactions, as a result the numbering becomes multiple and complex. Docking complexes of the wild-type (A1 a, B1 a, C1 a) and variant-type (A2 a, B2 a, C2 a) transcription binding sites of the respective SNPs rs1538664G/A, rs479200C/T and rs480902T/C of EGLN1 with corresponding TFs (DNA–TF docking complexes). The ACE (kcal/mol) of each TF with corresponding TFBS and intermolecular atomic-level hydrogen bond interactions is shown in the corresponding figures A1 a, b; B1 b–d; C1 b, c for wild-type and A2 b–d; B2 b–d; C2 b, c for variant-type alleles of the respective SNPs. TFs are differentiated with colors i.e. TF1, magenta; TF2, green; TF3, blue. (B) The supershift assay with specific monoclonal Ab against rs1538664A-Fus protein, rs479200T-ARHGDIA protein and rs480902C-HYOU1 protein. It depicts the protein complexes with distinct super-shifted bands in a differential manner between DNA fragments containing the vt and wt alleles. N.E., nuclear extract of A549 cells; Ab, Antibody. (C) Dual-luciferase depicting the fold-change expression and the statistical significance output in the three cell lines as observed in the comparisons of the alleles of the three SNPs.
TFs at these intronic SNPs contribute to strong networking, relevant pathways and physiological consequences
Here, STRING v. 11 identified node 1 and node 2 proteins and 78 unique proteins that interacted with the DNA–protein complexes of Fus-rs1538664A, ARHGDIA-rs479200T and HYOU1-rs480902C (Fig. 6A). The network highlights the interplay of FUS protein with all three risk loci. ARHGDIA (RhoGDI) bridges the HNRP–FUS and HSP–HYOU1 complexes. FUS also appears to play a more significant role than the other two proteins because it interacts simultaneously with the three risk loci, rs1538664A, rs479200T and rs480902C, although its most notable interaction remains with rs1538664A. FUS and FUBP1 at rs479200T shared nearly identical interacting nodes. FUBP1 is a single-stranded DNA-binding protein that binds multiple DNA elements, including the far upstream element located upstream of c-myc. This binding drives multiple synthetic functions necessary for rapid cell division while, at the same time, inhibiting the expression of genes with anti-proliferative functions (20). In addition, several other pertinent proteins are among the critical interacting proteins/TFs that may form and stabilize DNA–protein interactions (Fig. 6A).

Network landscape and annotation of proteins associated with EGLN1 risk alleles. (A) Networking among the most interacting proteins and/or TFs strongly supports the involvement of the identified TFs. (B) and (C) Annotation of the 78 identified risk allele-associated proteins by the PANTHER tool to their molecular functions and biological processes, respectively. It highlights a few specific processes and the physiological outcome in the form of functions.

The two HIF-prolyl hydroxylases are relevant to the oxygen signaling. The differential distribution of three EGLN1 SNPs under hypobaric hypoxia condition emerged significant as the risk variants interacted with TFs such as Fus-rs1538664A, ARHGDIA-rs479200T and HYOU1-rs480902C. Such interactions may regulate the increase in EGLN1 expression associated with the pathophysiology of HAPE.
Furthermore, annotation of the 78 unique proteins with the PANTHER ver. 14 tool systematized the molecular functions and biological processes. The molecular function analysis highlighted 72 functions under six categories, with binding and catalytic activity emerging as the two primary molecular functions (Fig. 6B). In the biological process analysis, nine categories highlighted 106 hits, and at least four processes appeared most relevant, namely, cellular, metabolic, stimulus and physiological regulation (Fig. 6C). Overall, these exercises identified a few of the most germane pathways, such as upstream signalling, transcriptional activation, cytoskeletal regulation by Rho GTPase, nicotinic acetylcholine receptor signalling and cadherin signalling, among others (P ≤ 1.0E−5). In addition, PANTHER 14.1 software was used to generate the allele-specific pathways belonging to the EGLN1 SNPs rs1538664G/A, rs479200C/T and rs480902T/C (Supplementary Material, Table S9). The distinct pathways associated with specific variants of the polymorphisms indicated the salient roles of the variants and the TFs in physiological regulation.
Discussion
High heterogeneity among clinical, genetic, circulating biochemical and numerous other biological factors contributes significantly to the overall physiological presentation of a sojourner in the hypobaric hypoxic environment. With respect to these parameters, the present study highlighted that the two regulatory oxygen-sensing genes, EGLN1 and HIF1AN, are critical in their hypoxia-inducible responses under hypobaric hypoxia.
The anomalous clinical parameters such as SaO2, PASP and MAP are first-line physiological markers for HAPE. The elevated levels of MAP and PASP in patients may lead to hypoxic pulmonary vasoconstriction, which is fundamental to the pathogenesis of HAPE (1,5,14,24). Additionally, HAPE susceptible individuals show greater heterogeneity in pulmonary blood flow (1). Such individuals possibly have blunted ventilatory responses that lead to vasoconstriction under hypoxia (4). This effect, besides, may be abetted by depleted vasodilators such as nitric oxide and factors of sympathetic overactivity (14,25–27). Similarly, SaO2 levels are also consistently depleted in HAPE (6,13,27,28); depletion of oxygen in the body perceptibly suppresses several physiological functions (11). It is possible that these abnormalities are aided by the genetic setup of susceptible individuals.
The role of the genome in a human system is cardinal. Even the minutest changes in its organization may produce significant and diverse phenotypic effects because an SNP, individually or in combination, may influence gene regulation effectively (10,17,29,30). Although we investigated several SNPs of the two genes, only seven SNPs of EGLN1 and three SNPs of HIF1AN were differentially distributed; overrepresentation of the variant alleles verified the susceptibility to the disease. Moreover, the heterozygotes of EGLN1 were prevalent in the patients, whereas the homozygotes were prevalent in the control groups. Each SNP seems important in one combination or another. Another observation that could be relevant in dealing with health or disease was that the greater the number of variants, the higher the susceptibility to the disorder upon exposure to the hypobaric hypoxic environment. Together, these findings advocate EGLN1 as a predisposition marker. They also imply that knowledge of such combinations may help prevent the disorder if subjects are screened prior to induction to HA.
We explored these SNPs for their contribution to the regulation of gene function (17,31) and affirmed that the EGLN1 variants rs1538664A, rs479200T and rs480902C increased gene expression. The higher the expression of EGLN1, the greater the blocking of HIF1α through hydroxylation of its two prolines, which leads to its ubiquitination. This prevents translocation of HIF1α to the nucleus, thereby hampering several physiological processes including, importantly, oxygen signalling (11,24). Of concern, these variants may bring in newer factors that alter the basic physiology and add to the severity. Encouraged by these findings and observations, we explored the additional contributions of the associated secondary molecules, especially the TFs. Here, we identified the TFs associated with the three SNPs. Importantly, with the support of docking simulations, three TFs, namely FUS, ARHGDIA and HYOU1, were validated for the three SNP variants rs1538664A, rs479200T and rs480902C, respectively. It was amply evident that the risk variant-containing DNA regions had specific preferences for the TFs. In addition to enhancing or suppressing gene transcription, these cofactors associate with other interactors to contribute to varied physiological functions (17,32).
When we look at the functions of the TFs through detailed network analyses, we realize that at any one locus one or more TFs may play a specific role or multiple roles, accordingly regulating the gene in question and even its surroundings. Further, a shift from the normoxic to the hypoxic state may add to the varied regulation of these molecules. Thus, a physiological consensus, good or bad, is envisaged amongst the allele-specific proteins through these complex interactions. We highlight the function of FUS, which is located in the nucleus. In a normal hypoxic state, HIF1α and its associates translocate to the nucleus from the cytosol to activate several molecules and thus the physiological functions. Disruption of this system may lead to two potential disease-causing mechanisms: suppression of function in the nucleus and gain of deleterious function in the cytoplasm, as is perceived in the case of FUS (33,34). In the present case, it may upregulate EGLN1/prolyl hydroxylase activity promoting hypoxia tolerance through the rs1538664G → A change (20). SERP1 and SERP2, which like FUS bind the same TFBS at rs1538664A, are associated with stress management (35), and stress is a major stimulant at HA. Likewise, ARHGDIA, which binds to rs479200T, is a multifunctional molecule (36). It seemingly associates with hematopoietic and diuretic activities, which are major clinical issues in susceptible subjects in the HA environment. The third major TF, HYOU1, which binds to rs480902C, belongs to the family of HSPs. Suppression of this protein is associated with accelerated apoptosis, and it is suggested to have an important cytoprotective role in hypoxia-induced cellular perturbation (37).
Of further relevance, our network analyses revealed that these TFs have multiple interactions with several other TFs; among these, the HSPs, HNRNPs and the glycolysis cycle molecules are majorly involved in the regulation of physiological functions (38,39). The preferences of specific members within these families strongly suggest deviations from a regular function. Our analyses suggest that these TFs may contribute to several pathways, among which the regulation of HIF1α through EGLN1 appears to be prominent.
Conclusions
In summary, a few attributes are worth highlighting. The two genes are relevant to oxygen signalling, though in the final analysis EGLN1 emerged as significant with three of its SNPs, and both gene and protein expression agreed well in contributing to susceptibility to and protection against HAPE (Fig. 7). The HIF1AN hypoxia-sensing mechanism perhaps needs further exploration, given its promiscuous availability for hydroxylation reactions of substrates other than HIF-1α and its lower sensitivity to oxygen changes compared with EGLN1 (40). The correlation analyses of major genotypes with the biolevels of EGLN1 and even with the clinical characteristics emphasized the functionality of the risk alleles. The genetic setup appears relevant in the regulation of the genes through the differential distribution of their variant alleles and the respective TFs in the healthy and susceptible subjects. Various techniques validated the specificity between a TF and allelic variants, such as Fus-rs1538664A, ARHGDIA-rs479200T and HYOU1-rs480902C. This was further validated by the annotation and docking simulation studies. Our method holds promise in elucidating the biological basis of a complex disease and thereby exploring the therapeutic applications. We do realize that validation of our findings is vital for translational implications.
Methods
Human study subjects and sample collection
Study participants
Blood samples were obtained from subjects who were categorized into three well-defined groups: 1) HAPE-p were sojourners who suffered the disorder upon exposure to HA (3500 m); 2) HAPE-f were healthy subjects who visited HA under similar conditions and carried out routine strenuous physical activities but did not suffer from the disorder; and 3) healthy HL natives. HAPE-p and HAPE-f were permanent residents of low altitude (<200 m) of north India and were of Indo-Aryan ethnicity. They traveled to altitudes (3500–5600 m) for reasons such as professional assignments, recreation and adventure. HLs were permanent residents of altitudes at and above 3500 m for many generations and were of Tibeto-Burman ethnicity. Approximately 1000 subjects for the three groups were recruited through Sonam Norboo Memorial (SNM) hospital, Leh (3500 m), Ladakh, India. General clinical parameters for each subject were recorded. Diagnosis of HAPE was based on published clinical criteria (24). The sample size of each group differs according to the experiment and the results; exact numbers are given where applicable.
Expression analysis of EGLN1 and HIF1AN
Quantitative real-time PCR
Gene expression was determined on 10 samples each of HAPE-p, HAPE-f and HLs. Total RNA was extracted from an aliquot of 2 ml whole blood using TRI reagent RT blood (Molecular Research Centre, Cincinnati, OH). RNA quantity and quality were determined on a NanoDrop ND-1000 spectrophotometer and integrity was checked on a 1.5% agarose gel. Total RNA, 1.0 μg, was used to generate cDNA with the EZ-first strand cDNA synthesis kit for reverse transcriptase-polymerase chain reaction (RT-PCR) (Biological Industries, Beit HaEmek, Israel). Real-time PCR was performed in triplicate with primers (Pearl Primer software; Supplementary Material, Table S10) and SYBR Green PCR Master Mix on an ABI Prism 7300 Sequence Detection System (Applied Biosystems, Foster City, CA). The relative transcript quantity was calculated using the ΔΔCt method against the 18S rRNA endogenous reference.
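The ΔΔCt calculation mentioned above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: it assumes the common 2^(−ΔΔCt) form with roughly 100% amplification efficiency, and the function name, argument names and Ct values are hypothetical.

```python
def relative_quantity(ct_target_sample, ct_ref_sample,
                      ct_target_calibrator, ct_ref_calibrator):
    """Fold change of a target transcript by the 2^(-ddCt) method.

    ct_ref_* are Ct values of the endogenous reference (18S rRNA in the text);
    the calibrator is whichever group is taken as baseline (an assumption here).
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold one cycle earlier in the
# sample than in the calibrator, with identical reference Ct values.
fold = relative_quantity(24.0, 10.0, 25.0, 10.0)
print(fold)  # 2.0
```

In practice the Ct values entering the formula would be the means of the triplicate reactions.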
Estimation of biomarkers
Plasma EGLN1 and HIF1AN levels in ~550 samples of HAPE-p, HAPE-f and HLs were measured by immunoassay kit (USCN Life Science, Wuhan, China) on a high-throughput SpectraMax plus384 Spectrophotometer (Molecular Devices, San Jose, CA).
Selection of EGLN1 and HIF1AN polymorphisms
In the replication study comprising larger sample sizes (n = 1000 total), we shortlisted 30 SNPs of each gene based on functional relevance and use in previous studies. We employed the Sequenom mass spectrometry-based genotyping assay using iPLEX Gold technology for EGLN1 (Supplementary Material, Table S11) and SNPType assays from Fluidigm Nanofluidic 48 (Fluidigm, San Francisco, CA) with consecutive next-generation sequencing for HIF1AN. The SNPType assay generated images of the data in real time that were assessed for variant calling; ambiguous results were omitted. The data were analyzed using the BioMark SNP Genotyping Analysis software version 3.1.2.
Combinatorial associations through gene–gene interactions
The interactions of genotypes, gene–gene and gene–environment were analyzed using the software MDR 1.2.2 (41). Pearson’s correlation (r) evaluated the correlation between two continuous variables. The association of major genotypes and haplotypes was assessed using binary logistic regression analysis. The unpaired Student’s t-test compared the groups. Values are represented as mean ± standard deviation (SD). A P value <0.05, after adjustment for confounders and Bonferroni correction for multiple comparisons, was considered statistically significant.
Correlation analysis
Multiple correlation analyses using SPSS 16.0 were performed that included the various clinical parameters versus age, within the clinical parameters and between the plasma levels of EGLN1 and HIF1AN. Correlations of the alleles and genotypes were also assessed against clinical parameters and biomarkers. These analyses provided correlations between and among several molecules and thus evaluated whether these worked in sync or opposite to each other, eventually assessing the physiological pathways that develop.
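As an illustration of the correlation step, Pearson's r between two continuous variables can be computed as in the sketch below. The study used SPSS; this standard-library Python version only shows the formula, and the variable names and values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired measurements (for example, two plasma markers).
marker_a = [1.0, 2.0, 3.0, 4.0]
marker_b = [8.0, 6.0, 4.0, 2.0]
print(pearson_r(marker_a, marker_b))
```

A value near +1 indicates that two measures move together and a value near −1 that they move in opposite directions, which is the sense in which the text asks whether molecules work "in sync or opposite to each other".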
Prioritization of functional SNPs using annotation approaches
Prioritization of the most functional SNPs at each risk locus was achieved with two well-characterized functional annotation tools, Variant Effect Predictor (VEP) (42) and RegulomeDB (43). VEP determines the effect of the variants on genes, transcripts and protein sequences as well as regulatory regions, whereas RegulomeDB annotates functional variants using a variety of data from ENCODE (44), including ChIP-Seq, FAIRE, DNase I hypersensitive sites and eQTL. The ratings of RegulomeDB range between 1 and 6, wherein a smaller rating suggests a higher probability that the SNP is functional. Thus, the SNP with the smallest rating was defined as the top functional SNP. This helped us identify a few potential SNP candidates.
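The selection rule described above (smallest rating wins) can be sketched as below. This is only an illustration of the rule as stated in the text: the SNP identifiers and ratings are hypothetical placeholders, ratings are assumed to be plain integers from 1 to 6, and no RegulomeDB API is called.

```python
def top_functional_snps(ratings):
    """Return the SNP id(s) with the smallest rating at a locus.

    ratings: dict mapping SNP id -> integer rating (1 = most likely functional).
    Ties are all returned, sorted by id.
    """
    if not ratings:
        raise ValueError("no ratings supplied")
    best = min(ratings.values())
    return sorted(snp for snp, score in ratings.items() if score == best)


# Hypothetical ratings for three placeholder SNPs at one locus.
locus_ratings = {"snp_a": 4, "snp_b": 2, "snp_c": 5}
print(top_functional_snps(locus_ratings))  # ['snp_b']
```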
Validation of the allele-specific expression
Dual-luciferase assay
Cell culture
The HUVEC line was purchased from Thermo Fisher Scientific, Waltham, MA (cat no. C0035C), the human embryonic kidney cell line HEK-293 was procured from the National Centre for Cell Sciences, Pune, India, and the human alveolar basal epithelial A549 cell line was procured from ATCC (cat no. CCL-185). HEK293 and A549 cells were cultured in high-glucose Dulbecco’s Modified Eagle’s medium (C11995500BT, Gibco, NY) containing 10% fetal bovine serum (10091148, Gibco, NY), penicillin and streptomycin (100 U/ml; 10378016, Gibco, New York, NY). HUVECs were maintained in endothelial growth medium-2 and 2% fetal calf serum (Clonetics, San Diego, CA). All cells were maintained at 37°C in an incubator with 5% CO2 and 95% air. Antibiotics were withdrawn before performing assays. No mycoplasma contamination was found for the cell lines used in this study.
Vector construction
A DNA sequence of ~400–600 bp containing the EGLN1 or HIF1AN test SNPs was amplified using primers (Supplementary Material, Tables S12 and S13) linked with homologous arms that were identical to the sequence located at the multiple cloning site of the pGL3-Promoter vector. PCR products were purified with the DNA Purification Kit (cat no. 28104, Qiagen, Germany). The pGL3-Promoter vector (E1761, Promega, Madison, WI) was digested with KpnI (FD0524, FastDigest, Thermo Fisher Scientific, USA) and XhoI (FD0694, FastDigest, Thermo Fisher Scientific, USA), and the digested products were purified with the DNA Purification Kit. The purified DNA fragments bearing the test SNP were inserted into the multiple cloning site of the pGL3-Promoter vector. All fragments were cloned immediately upstream (5′) of the promoter of the luciferase gene. The ligated vectors were used to transform Escherichia coli DH5α competent cells. Luria Broth agar plates with ampicillin were used to select the transformed cells, and the recombinant plasmids were extracted from the transformed cells grown from a single colony. A PCR-mediated point mutation was performed using the Q5® Site-Directed Mutagenesis Kit (NEB, Ipswich, MA) to generate the desired DNA fragments containing the alternative alleles of each SNP. Sanger sequencing verified all the sequences of the inserted DNA fragments.
Cell transfection and dual-luciferase reporter gene assays
Both vectors (wild and mutant) carrying the targeted variants were transfected separately into the cell lines. Transcription was assessed by analyzing luciferase activity normalized to Renilla activity using the Dual-Glo luciferase assay system (cat no. E1910, Promega, Madison, WI) on a luminometer. The constructed vector, at 100 ng for a 96-well plate and 500 ng for a 24-well plate, and the internal control plasmid pRL-TK (E2241, Promega, Madison, WI), at 10 ng for a 96-well plate and 50 ng for a 24-well plate, were co-transfected into the tested cell lines. All the cell lines were transfected using Lipofectamine® LTX reagent (Invitrogen Corp., Carlsbad, CA). For HEK293, the dual-luciferase reporter gene assay was performed in a 24-well white plate containing 500 μl medium at 5 × 10⁵ cells/ml and for HUVEC, in a 96-well white plate containing 150 μl medium at 2 × 10⁵ cells/ml. After transfection, luciferase activity was measured in cellular extracts using the Dual-Glo Luciferase reporter assay on a luminometer. All constructs were transfected in triplicate and the results represent the mean of three independent experiments. A two-tailed Student’s t-test assessed significance, which was set at P < 0.05.
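The normalization described above, in which reporter luciferase signal is divided by the co-transfected Renilla control signal and the two allelic constructs are then compared, can be sketched as follows. The function names and luminescence readings are hypothetical, and the sketch stops at the fold change; the significance test is left to a statistics package.

```python
from statistics import mean


def normalized_activity(luciferase, renilla):
    """Per-well reporter signal divided by the Renilla internal-control signal."""
    if len(luciferase) != len(renilla):
        raise ValueError("each reporter reading needs a matching Renilla reading")
    return [f / r for f, r in zip(luciferase, renilla)]


def fold_change(variant_ratios, wildtype_ratios):
    """Mean normalized activity of the variant construct relative to wild type."""
    return mean(variant_ratios) / mean(wildtype_ratios)


# Hypothetical triplicate readings (arbitrary luminescence units).
wt = normalized_activity([1000.0, 1100.0, 900.0], [500.0, 550.0, 450.0])
vt = normalized_activity([2100.0, 1900.0, 2000.0], [525.0, 475.0, 500.0])
print(fold_change(vt, wt))
```

Dividing by the Renilla signal corrects each well for differences in transfection efficiency and cell number before the alleles are compared.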
Functional validation of the regulatory SNPs
Electrophoretic mobility shift assay
The A549 cell line was cultured and nuclear extracts were prepared using the NE-PER nuclear and cytoplasmic extraction kit (Thermo-Fisher Scientific, Waltham, MA). Nuclear proteins were quantified with a bicinchoninic acid protein assay kit (cat no. 71 285-M, Sigma-Aldrich, St. Louis, MO). To prepare the double-stranded oligonucleotide fragments of 20–40 bp, two single-stranded oligonucleotides for each of the three SNPs of EGLN1 (Supplementary Material, Table S14) were synthesized (Sigma-Aldrich, St. Louis, MO) and annealed. Forward and reverse oligonucleotides at 1 pmol/μl were mixed with 10× annealing buffer (1 M Tris, pH 8.0, 0.5 M EDTA and 5 M NaCl), heated at 95°C for 5 min and cooled to room temperature. The 5′-ends of desalted synthetic oligonucleotides were radiolabeled with [γ-32P]ATP and T4 polynucleotide kinase. Probes were purified using the QIAquick nucleotide removal kit (cat no. 28304, Qiagen, Hilden, Germany). Fifty μl of the binding reaction mixture contained 1.5 pmol of probe (50 000–100 000 cpm), 10 μg nuclear extract, 10× binding buffer (100 mM Tris, 500 mM KCl, 10 mM DTT; pH 7.5), 0.5 μg of double-stranded poly (dI-dC) (Sigma-Aldrich, St. Louis, MO), 50% glycerol and 100 mM MgCl2. The control contained labeled probe alone, and competition experiments were performed with an additional 10- and 50-fold molar excess of unlabeled probe. Reaction mixtures were incubated at 4°C for 30 min, followed by non-denaturing 6% polyacrylamide gel (39:1) electrophoresis at 4°C with 1× TBE running buffer (45 mM Tris, 45 mM boric acid, 1 mM EDTA; pH 8.3) at 20 mA for ~3 h to resolve protein–DNA complexes.
Extraction of proteins by in-gel trypsin digestion
The protein band was excised, cut into ~1 mm³ pieces, washed with water and destained using a 1:1 mixture of potassium ferricyanide and sodium thiosulfate. The destained gel pieces were washed thrice with liquid chromatography–mass spectrometry grade water, partially dehydrated with a 1:1 mixture of acetonitrile and 50 mM ammonium bicarbonate and dried in a SpeedVac concentrator for 20 min. Samples were reduced with 25 mM DTT in 50 mM ammonium bicarbonate at 55°C for 20 min, followed by 30 min alkylation with 55 mM iodoacetamide in 50 mM ammonium bicarbonate at room temperature in the dark, washed with acetonitrile and dried in a SpeedVac concentrator. Sequencing-grade modified trypsin (cat no. V511A, Promega, Madison, WI) at 12 ng/μl in 25 mM ammonium bicarbonate was added to each gel sample, which was then layered with 10 mM ammonium bicarbonate. The digestion was continued overnight on a shaker at 37°C. Next, 40% acetonitrile containing 0.1% formic acid was added, and the samples were vortexed for 20 min and sonicated for 15 min to extract the peptides from the gel pieces; this step was repeated. All the fractions were pooled, and the volume was reduced to 10–20 μl on a SpeedVac.
LC–MS/MS to identify the TFs
LC–MS/MS was performed using a capillary HPLC system (LC Packings, Amsterdam, the Netherlands) coupled with a QSTAR XL quadrupole time-of-flight mass spectrometer (TOF MS) (ABI/MDS Sciex, San Jose, CA) through a nanoelectrospray ionization source (Protana, Odense, Denmark). Analyst QS software was used for system control and data collection. The desired volume of protein solution was injected by the autosampler and desalted on a C18 trap column (300 μm × 1 mm, LC Packings) for 6 min at a flow rate of 10 μl/min. The sample was subsequently separated on a C18 reverse-phase column (75 μm × 15 cm, Vydac, Columbia, USA) at a flow rate of 220 nl/min. The mobile phases consisted of water with 0.1% formic acid (A) and 90% acetonitrile with 0.1% formic acid (B). A 90 min linear gradient from 5 to 50% B was typically used. After liquid chromatography separation, the sample was introduced into the mass spectrometer through a 10-μm silica tip (New Objective) adapted to a nanoelectrospray source (Protana, Odense, Denmark). Data were acquired in information-dependent acquisition mode. Each cycle typically consisted of a 1 s TOF MS survey from 400 to 1600 (m/z) and two 2 s tandem mass spectrometry (MS/MS) scans with a mass range of 65–1600 (m/z).
The LC–MS/MS data of the membrane fraction of A549 cells were submitted to a local MASCOT server for MS/MS ion search. The identified proteins were assessed for their coexpression using a coexpressMAP database, which includes 101 705 904 gene pairs (45). The protein pairs with high coexpression constants (<0.60) were considered as TFs associated with the corresponding sites.
Conformational analysis of DNA-protein interactions
Molecular docking
DNA–protein interactions were simulated using molecular docking studies. The TFBS was considered as 10 nucleotides up- and downstream from an SNP site. The sequence information of TFBS and TFs was extracted from the freely accessible browsers human genome assembly hg19 using UCSC genome browser (46) and UniProt Database [The Universal Protein Resource (UniProt), 2007], respectively. The TFBS signatures were predicted using the DiRE of co-regulated genes (47). The server determines the chromosomal location and functional characteristics of distant regulatory elements in higher eukaryotic genomes. It uses the gene coexpression data, comparative genomics and combinations of TFBS to find TFBS association signatures that can be used for discriminating specific regulatory functions.
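The TFBS definition used here, 10 nucleotides on each side of the SNP, amounts to taking a 21-nucleotide window and swapping the central base to obtain the two allelic forms. The sketch below illustrates this on a toy sequence; the function names, the 0-based indexing and the sequence itself are assumptions for illustration, not genomic data from the study.

```python
def tfbs_window(sequence, snp_index, flank=10):
    """Return the window of `flank` bases on each side of a 0-based SNP index."""
    if snp_index - flank < 0 or snp_index + flank >= len(sequence):
        raise ValueError("sequence too short for the requested flanks")
    return sequence[snp_index - flank: snp_index + flank + 1]


def allele_windows(sequence, snp_index, wt_allele, vt_allele, flank=10):
    """Build the wild-type and variant-type TFBS windows around one SNP."""
    window = tfbs_window(sequence, snp_index, flank)
    if window[flank] != wt_allele:
        raise ValueError("sequence does not carry the wild-type allele at the SNP")
    variant = window[:flank] + vt_allele + window[flank + 1:]
    return window, variant


# Toy sequence with a G at index 12 standing in for the wild-type allele.
toy = "TT" + "ACGTACGTAC" + "G" + "TTGCATTGCA" + "CC"
wt_site, vt_site = allele_windows(toy, 12, "G", "A")
print(wt_site)  # ACGTACGTACGTTGCATTGCA
print(vt_site)  # ACGTACGTACATTGCATTGCA
```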
Modelling the 3D structure of TFBSs and TFs
The 3D structure of TFs was extracted from the PDB (48). TFs with the unknown 3D structure were modelled using SWISS-Model (49). Further, the 3D structure of TFBS was modelled using the web tool, make-na server (http://structure.usc.edu/make-na/server.html), and the same was cross-validated with SWISS-Model.
Interaction of TFs with the corresponding TFBS
The binding affinity of the TFBS carrying the wt or vt allele with its corresponding TFs was assessed using PatchDock version 1.0 standalone software (50). The PatchDock algorithm utilizes object recognition and image segmentation techniques for the docking simulations. It involves three steps. First, Molecular Shape Representation computes the molecular surface and detects geometric patches such as concave, convex and flat surfaces. Second, Surface-Patch Matching matches the patches from the previous step. Third, Filtering and Scoring examines and ranks the candidate complexes from the previous step according to a geometric shape complementarity score. To identify the order of binding of TFs to the TFBS, ACE was determined. The TF with the lowest ACE was considered to be the one binding to the TFBS.
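The final selection rule, in which the TF with the lowest ACE is taken as the binder, can be sketched as a simple ranking. The TF labels and ACE values (kcal/mol) below are hypothetical, and the sketch assumes the ACE values have already been read from the docking output; it does not run or parse PatchDock.

```python
def rank_by_ace(ace_by_tf):
    """Order TFs from lowest (most favourable) to highest ACE."""
    return sorted(ace_by_tf.items(), key=lambda item: item[1])


def preferred_tf(ace_by_tf):
    """Return the TF with the lowest ACE for one TFBS allele."""
    if not ace_by_tf:
        raise ValueError("no ACE values supplied")
    return min(ace_by_tf, key=ace_by_tf.get)


# Hypothetical ACE values for three candidate TFs docked to one allele.
ace_variant_allele = {"TF1": -120.5, "TF2": -310.2, "TF3": -45.0}
print(rank_by_ace(ace_variant_allele))
print(preferred_tf(ace_variant_allele))  # TF2
```

Running the same ranking separately for the wt and vt windows shows whether the preferred TF changes with the allele.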
Intermolecular interaction analysis
The hydrogen and non-hydrogen bonds formed in the docked wt and vt complexes were identified using NUCPLOT v.1.1.4 (51), which generates schematic diagrams of protein–nucleic acid interactions from a Protein Data Bank (PDB) file. The structures of the complexes and their hydrogen bonds were visualized using Chimera version 1.12 (52).
Supershift assay
The supershift assay was performed by adding 1 μg each of antibodies against FUS, ARHGDIA and HYOU1 (Cell Signalling Tech., Danvers, MA) to the sample reaction, incubating for 15 min at 4°C and then adding the labeled probe, to reveal specific binding of each antibody to the FUS-rs1538664A, ARHGDIA-rs479200T and HYOU1-rs480902T DNA–protein complexes.
Statistical analyses
Genomic locations are according to the February 2009 Human Genome Browser data. The individual role of each SNP in HAPE and in health was assessed by multivariate logistic regression analysis using SPSS 16.0 software. Genotype and allele distributions, OR and 95% confidence intervals (CI) were calculated, and P values were adjusted for age and gender. Significance was set at P ≤ 0.05 after FDR correction. Hardy–Weinberg equilibrium was checked using a χ2 goodness-of-fit test. To ascertain adaptive alleles, the genotype distribution of HLs was compared with that of the Han Chinese populations (CHB, Han Chinese in Beijing, China and CHS, Southern Han Chinese, China), in addition to HAPE-f and HAPE-p. Under the co-dominant models, including additive and multiplicative models, the expected statistical power of the study was 99.9%, with the minor allele frequency fixed at 5%, the P value fixed at < 0.0005 and ~1000 subjects recruited in each group.
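For readers who wish to reproduce the basic checks, the following is a minimal sketch of three of the calculations named above: the Hardy–Weinberg χ2 goodness-of-fit test, an odds ratio with a 95% CI, and an FDR adjustment. It rests on several assumptions: the OR here is an unadjusted 2 × 2 estimate with a Wald interval, not the age- and gender-adjusted logistic regression run in SPSS; the FDR procedure is assumed to be Benjamini–Hochberg, which the text does not specify; and all counts and P values are made up for illustration.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium (1 df)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("locus is monomorphic; the test is undefined")
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted OR and Wald 95% CI for a 2 x 2 table.

    a, b -- risk / non-risk allele counts in cases
    c, d -- risk / non-risk allele counts in controls
    """
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values (assumed FDR procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up numbers, for illustration only.
chi2, hwe_p = hwe_chi_square(25, 50, 25)
odds_ratio, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(chi2, hwe_p)
print(odds_ratio, ci_low, ci_high)
print(adjusted)
```

A SNP would then be retained only if its adjusted P value is ≤ 0.05, matching the threshold stated above.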
Acknowledgements
The authors thank the volunteers for their participation in this study, the staff and faculty at SNM hospital, Leh and CSIR-IGIB for their cooperation and support.
Conflict of Interest statement. None declared.
Funding
Cardiovascular Medical Research and Education Fund, Philadelphia, USA (grant CLP0020 to Q.P., A.M., K.S.); in part, Council of Scientific & Industrial Research (CSIR) India (grants MLP1401, BSC0123 to Q.P., A.M., K.S.); Indian Council of Medical Research, India (GAP0119/[ICMR No. 74/6/2015-Pers. EMS] to Q.P., D.P., H.N.S.).
Author Contributions
K.S. designed the study, performed the experiments, acquired, analyzed and interpreted the data from the wet experiments, developed the genetic tables and figures and wrote the manuscript. A.M. designed the study, performed the experiments, acquired, analyzed and interpreted the data and wrote the manuscript. H.N.S. and D.P. performed the validation experiments, analyzed the data and wrote the manuscript. P.A. evaluated the data, reworked all figures and wrote the related part. T.T. and G.M. handled all the subjects, diagnosed the patients, collected the clinical information and blood samples, interpreted the clinical findings and contributed to writing. R.K. and M.A.S. facilitated the research activities and wrote and read the manuscript. Q.P. conceived and designed the project, supervised all research activities, acquired all data, interpreted the data and results and wrote the manuscript.