Functional validation of EIF2AK4 (GCN2) missense variants associated with pulmonary arterial hypertension

Abstract
Pulmonary arterial hypertension (PAH) is a disorder with a large genetic component. Biallelic mutations of EIF2AK4, which encodes the kinase GCN2, are causal in two ultra-rare subtypes of PAH, pulmonary veno-occlusive disease and pulmonary capillary haemangiomatosis. EIF2AK4 variants of unknown significance have also been identified in patients with classical PAH, though their relationship to disease remains unclear. To provide patients with diagnostic information and enable family testing, the functional consequences of such rare variants must be determined, but existing computational methods are imperfect. We applied a suite of bioinformatic and experimental approaches to sixteen EIF2AK4 variants that had been identified in patients. By experimentally testing the functional integrity of the integrated stress response (ISR) downstream of GCN2, we determined that existing computational tools have insufficient sensitivity to reliably predict impaired kinase function. We determined experimentally that several EIF2AK4 variants identified in patients with classical PAH had preserved function and are therefore likely to be non-pathogenic. The dysfunctional variants of GCN2 that we identified could be subclassified into three groups: misfolded, kinase-dead, and hypomorphic. Intriguingly, members of the hypomorphic group were amenable to paradoxical activation by a type-1½ GCN2 kinase inhibitor. This experimental approach may aid in the clinical stratification of EIF2AK4 variants and potentially identify hypomorphic alleles receptive to pharmacological activation.


Introduction
Aberrant vascular remodelling in pulmonary arterial hypertension (PAH) raises pressures in the pulmonary vasculature to cause right heart failure [1]. Affected young adults often suffer progressive disease leading to premature death. Although classical PAH is most frequently caused by mutations in the TGFβ/BMP signalling axis [2][3][4][5], rare subtypes such as pulmonary veno-occlusive disease (PVOD) and pulmonary capillary haemangiomatosis (PCH) have distinct genetic associations and are refractory to current clinical management [6]. With no effective treatments apart from lung transplantation, death occurs within a year in 72% of patients diagnosed with these aggressive PAH subtypes [7].
Since the first report in 2014 linking biallelic mutations of EIF2AK4 to PVOD [8], approximately one hundred EIF2AK4 alleles have been reported to be associated with PAH and its subtypes [4][8][9][10][11][12][13][14]. Although frameshift mutations constitute a large proportion, approximately a third [34] of these alleles are missense variants, the functional consequences of which are unknown (Fig. 1) [15]. Validating the pathogenicity of such variants of uncertain significance (VUSs) would aid diagnosis, enable cascade genetic testing of relatives, and support recruitment of patients to chemoprotective clinical trials [6].
EIF2AK4 encodes GCN2, a large serine/threonine kinase homodimer that responds to amino acid depletion by monitoring the efficiency of protein synthesis through its interaction with stalled ribosomes [16,17]. GCN2 comprises an N-terminal RWD domain involved in protein-protein interactions, pseudokinase (276-539) and kinase (585-1016) domains, the latter including GCN2's dimerisation interface, a HisRS-like domain (1058-1490) and a C-terminal domain (1533-1649) where binding to ribosomes and recognition of uncharged tRNAs occur [18]. When activated, GCN2 triggers a cellular signalling pathway termed the integrated stress response (ISR) by phosphorylating the α subunit of eukaryotic translation initiation factor 2 (eIF2α) [19,20]. This attenuates most protein synthesis, while enhancing the translation of ISR-specific mRNAs owing to the presence of upstream open reading frames (uORFs) in their 5'UTRs [21]; an example is the transcription factor ATF4. PPP1R15A, a selective eIF2α phosphatase subunit, is similarly regulated and its expression eventually terminates the ISR [20][21][22]. Owing to its fundamental biological roles, disruption of the ISR is implicated in many diseases [23][24][25].
Technological advances have made genomic sequencing readily available in the clinic, leading to a proliferation in the number of VUSs encountered. Currently, strategies to predict the impact of novel genetic variants are largely restricted to computational methods that rely on evolutionary conservation (e.g. SIFT, PolyPhen) [26] or integrate a range of scores from existing predictive tools (e.g. CADD, REVEL) [27,28]. Other methodologies include the computation of folding free energy differences (FoldX) [29], which predicts protein stability. More recently, deep learning approaches have been developed (EVE, AlphaMissense) [30,31] that account for both evolutionary constraints and structural information. Although these have improved predictive accuracy, computational approaches remain imperfect. We set out to examine patient-specific EIF2AK4 missense mutations both in silico and experimentally using existing bioinformatic tools and cell biological assays. In so doing, we subclassified PAH-associated GCN2 variants into functional (likely benign), destabilised/misfolded, or kinase-impaired. Interestingly, a subset of the kinase-impaired variants showed preserved target engagement. These hypomorphic variants were amenable to pharmacological rescue using an ATP-competitive GCN2 inhibitor [32]. Based on these results, we propose a simple methodology for the experimental validation of the functionality of EIF2AK4 VUSs, which outperforms existing computational approaches.
Figure 1. Schematic of known patient-specific missense variants. GCN2 schematic and its domains. Variants above the domain schematic were from patients with classical PAH. Variants below were from patients with PCH and PVOD. Variants reported to be benign/likely benign in the ClinVar database [33] are in blue, those reported as pathogenic/likely pathogenic are in red, and black represents variants of uncertain significance. Occurrences represent cumulative allele counts in the gnomAD database [34] or published reports [8-14]. Variants highlighted in yellow are analysed experimentally in this study.
We applied existing computational methods including SIFT, PolyPhen2, CADD v1.6, REVEL and AlphaMissense in an effort to predict the functional significance of these GCN2 variants [26][27][28][30] (Table 1). Higher scores represent higher predicted severity. We also used FoldX 5.0 to estimate differences in Gibbs free energy (ΔΔG) between wildtype GCN2 and each variant (Table 1) [29]. Although broadly concordant, the algorithms yielded several discordant results. For example, although H1202Y was predicted to maintain stable folding by FoldX, and to be tolerated by SIFT and CADD, it was categorised as severe/likely pathogenic by the other methods. Conversely, Y34C was predicted to be destabilised by FoldX (ΔΔG > 1.5 kcal/mol) and likely pathogenic by PolyPhen, CADD and REVEL, but tolerated by SIFT and categorised as ambiguous by AlphaMissense. Nevertheless, amid a few uncertain results, by integrating these computational methods we predicted missense variants of GCN2 to be either (i) benign, (ii) misfolded, or (iii) kinase deficient (summarised in Table 1).
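The stability criterion mentioned above can be illustrated with a minimal sketch. It assumes only the cut-off quoted in the text (a FoldX ΔΔG above 1.5 kcal/mol is treated as destabilising); the function name and the example values are hypothetical and are not taken from Table 1.

```python
# Cut-off quoted in the text: ddG > 1.5 kcal/mol is treated as destabilising.
DESTABILISING_DDG_KCAL_PER_MOL = 1.5


def is_predicted_destabilised(ddg_kcal_per_mol: float) -> bool:
    """Return True when the folding free energy change exceeds the cut-off."""
    return ddg_kcal_per_mol > DESTABILISING_DDG_KCAL_PER_MOL


# Hypothetical ddG values for illustration only (not data from this study).
example_ddg = {"variant_A": 0.3, "variant_B": 2.4}

predicted = {
    name: is_predicted_destabilised(ddg) for name, ddg in example_ddg.items()
}
print(predicted)
```

A stability flag of this kind cannot distinguish a stable but kinase-dead variant from a functional one, so it needs to be read alongside the other predictors and the experimental data.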

Expression of GCN2 variants and ISR reporter activation.
To test the in silico predictions, we next performed experimental validation in cultured cells. A bioluminescent ISR reporter was generated by fusing the 5'UTR of ATF4 with NanoLuc® luciferase (ATF4::NanoLuc, Fig. 2A). We obtained optimal results when using human rather than murine ATF4, driven by a CMV promoter/SV40 enhancer (data not shown). uORFs in the 5'UTR of ATF4 mRNA impose translational regulation on the downstream coding sequence [21]. When endogenous GCN2 was deleted in reporter HeLa cells (GCN2 KO, Fig. S1), as expected, activation of the ISR reporter by the histidyl-tRNA synthetase inhibitor histidinol was ablated (Fig. 2B). Re-expression of wildtype human GCN2, but not a kinase-dead control mutant (K619R), rescued ATF4::NanoLuc responsiveness to histidinol, validating the system (Fig. 2B).
Sixteen GCN2 exemplar variants were selected as representative of each class of functional prediction, across a range of disease severities and distributed throughout the GCN2 protein (highlighted in Fig. 1). Each variant was expressed in the GCN2-deleted reporter cells and bioluminescence was measured without or with histidinol treatment (Fig. 2B). We noted a striking correlation between ATF4::NanoLuc reporter activation and diagnosis (Table 2). In all but one case, when reporter activation was preserved (P15L, I839T, T943A, L1148S, H1202Y), the clinical diagnosis had been of classical PAH, rather than either PVOD or PCH. The exception was Y34C, which had preserved reporter activation despite having been identified in an individual with PVOD (that patient also had a high-impact variant in their other EIF2AK4 allele: Lys190GlufsTer8, c.567dup). All other variants identified in PVOD or PCH showed impaired ATF4::NanoLuc responsiveness to histidinol (Fig. 2B).
Expression levels of GCN2 variants therefore do not fully reflect activity.
Table 1. Known pulmonary hypertension-associated missense variants of GCN2.

GCN2 autophosphorylation is necessary but not sufficient for ISR induction
Autophosphorylation of GCN2 at T899 correlates with kinase activation in most circumstances [32,35]. Treatment with histidinol or starvation of amino acids, a physiological stimulus of the kinase, increased T899 phosphorylation of wildtype but not kinase-dead K619R GCN2 expressed in knockout cells (Fig. 3A-B & S1B-C). When the naturally occurring variants were tested, eleven were capable of T899 autophosphorylation, while five were not (L643R, A870V, S909R, R989W, L1295R) (Fig. 3C-F, summarised in Table 2). Stressful stimuli thus triggered T899 autophosphorylation in all GCN2 variants capable of activating the ATF4::NanoLuc reporter, but also in five ISR-deficient variants (R585Q, V607G, G1109R, P1115L, H1202L), albeit only weakly (Fig. 3C-F, summarised in Table 2). These results suggested that GCN2 autophosphorylation at T899 is necessary but not sufficient for ISR induction. From a structural perspective, A870 (Fig. 4, in purple) is located on the kinase activation loop, in close proximity to residue K619 (in orange) of the N-lobe, which is essential for kinase activity (K619R yields a dead kinase). Mutations in these key residues led to total loss of autophosphorylation and kinase activity. L643, S909 and R989 (Fig. 4, in magenta) are essential residues for the tight packing of the kinase domain C-lobe. L643R, S909R and R989W introduce larger and differently charged sidechains into the hinge region or the C-lobe of the kinase, disrupting the local protein structure required for kinase activity. On the other hand, I839 and T943 (Fig. 4, in blue) are located at the edge of the kinase helix. The I839T and T943A variants neither introduce larger sidechains nor disrupt local packing, hence retaining activity.
Inactive GCN2 exists as an antiparallel homodimer and transitions to a parallel conformation on activation [36,37]. In yeast, stabilisation of the active state depends on the establishment of an intramolecular salt-bridge in the active conformation between residues R594 and D598, corresponding to R585 and E589 in the human protein [38] (Fig. S2). The patient-derived R585Q and V607G variants, localised in the kinase domain N-lobe (Fig. 4, in green), were unable to activate the ATF4::NanoLuc reporter despite preserved expression (Fig. 2) and autophosphorylation (Fig. 3). Since the R585Q substitution is predicted to reposition the dimerisation interface salt-bridge required for activation, we sought to test if dimerisation was impaired. GCN2 constructs were generated tagged at the C-terminus with either 3xFlag or V5. Tagging did not impair GCN2 activity (Fig. 5A). When co-expressed in GCN2-deleted cells, wildtype GCN2-3xFlag and GCN2-V5 formed mixed dimers detectable by anti-FLAG co-immunoprecipitation (Fig. 5B). The R585Q variant similarly formed mixed dimers that could be co-immunoprecipitated (Fig. 5C). The ISR-deficient kinase domain variants V607G, L643R, and S909R, predicted in silico to affect the folding of the kinase domain (Table 1), were similarly able to form mixed dimers, suggesting that, at least for these variants, loss of dimerisation does not contribute to their impaired function (Fig. 5C-F).

Hypomorphic GCN2 variants can be activated by an ATP-competitive inhibitor.
We then sought to test whether autophosphorylation-competent but ISR-deficient variants might have lost the ability to engage the substrate eIF2α. Tagged GCN2 was recovered by immunoprecipitation of 3xFlag from lysates of cells starved of amino acids (Fig. 6A). Immunopurified kinases were then tested for their ability to phosphorylate the N-terminal domain of eIF2α (eIF2α-NTD) in vitro (Fig. 6B-D). Wildtype but not kinase-dead K619R GCN2 showed enhanced T899 autophosphorylation when incubated with Mg-ATP, leading to increased phospho-GCN2 immunoreactivity and slower migration on SDS-PAGE (Fig. 6B-D). When incubated with Mg-ATP and eIF2α-NTD, wildtype but not kinase-dead K619R GCN2 phosphorylated eIF2α-NTD on serine 51 (Fig. 6C-D). The autophosphorylation-competent variants R585Q, V607G, G1109R and P1115L, but not the kinase-dead S909R (Figs. 3C,E and 6B), also autophosphorylated their activation loop at T899 when incubated with Mg-ATP in vitro, and phosphorylated eIF2α-NTD, albeit only weakly (Fig. 6C-E). The variants L643R, S909R, H1202L and L1295R, which showed either weak or no autophosphorylation combined with low expression (Fig. 6A,C), failed to phosphorylate eIF2α-NTD (Fig. 6C-D). These data suggest that while some PVOD/PCH-associated variants preserve target engagement in vitro, their weak kinase activity appears insufficient for downstream signalling in cells. We classified these variants as hypomorphic.
It was recently shown that GCN2 can be activated paradoxically by sub-inhibitory concentrations of ATP-competitive kinase inhibitors [39,40]. Carlson et al. showed activation of GCN2-R585Q, one of the hypomorphic variants identified here, though not of the L643R variant, by treatment with the type-1½ kinase inhibitor Gcn2iB [32]. We therefore tested the ability of this small molecule to activate the ATF4::NanoLuc reporter in cells expressing PAH-associated GCN2 variants. We found that reporter activation was rescued by Gcn2iB in all the identified hypomorphs (R585Q, V607G, G1109R and P1115L), but not in misfolded or kinase-dead variants (K619R, L643R and S909R; Fig. 7).
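The way these assays combine into the variant classes used in this study can be sketched as a small decision procedure. This is an illustrative reading of the text and not the authors' formal algorithm: the field names, the order of the checks, and the use of reduced expression to separate misfolded from kinase-dead variants are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssayResult:
    """Hypothetical summary of the readouts for one GCN2 variant."""
    reporter_active: bool      # ATF4::NanoLuc responds to histidinol
    well_expressed: bool       # protein level comparable to wildtype
    autophosphorylates: bool   # T899 phosphorylation detected under stress
    rescued_by_gcn2ib: bool    # reporter activated by Gcn2iB treatment


def classify(result: AssayResult) -> str:
    """Assign a functional class; the check order is an assumption."""
    if result.reporter_active:
        return "functional (likely benign)"
    if result.autophosphorylates and result.rescued_by_gcn2ib:
        return "hypomorphic"
    if not result.well_expressed:
        return "destabilised/misfolded"
    return "kinase-dead"


# Illustrative inputs only; these are not measurements from the study.
print(classify(AssayResult(False, True, True, True)))
print(classify(AssayResult(False, True, False, False)))
```

In this sketch the reporter assay is checked first because it is the readout closest to ISR function, with the remaining assays used only to subdivide the reporter-deficient variants.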

Discussion
Detection of biallelic pathogenic EIF2AK4 mutations establishes the diagnosis of PVOD or PCH without the need for histological confirmation [6]. Validating the pathogenicity of EIF2AK4 variants is therefore of significant diagnostic value, which is important since therapies developed for classical PAH can be detrimental in these rare subtypes [7]. Segregation studies, although the gold standard, are not always feasible. By contrast, genetic testing is routinely used in clinical practice. We found that integrative computational analysis failed to identify some ISR-defective variants of GCN2. However, our in cellulo assay using a sensitive ATF4::NanoLuc reporter cell line could reliably identify pathogenic variants of GCN2. This approach is simple and inexpensive, making it a feasible addition to characterisation workflows in specialist clinical practice.
Figure 5. Dimerisation of GCN2 variants. (A) Representative immunoblot of GCN2 KO HeLa cells transiently transfected with construct encoding 3xFlag-tagged human GCN2 (hGCN2_3xFLAG) treated with either 7 mM histidinol or starved of methionine and leucine (−Met −Leu) for 7 h. n = 3 biological replicates. (B-F) Anti-FLAG immunoprecipitation-immunoblots from GCN2 KO cells transiently transfected with constructs encoding GCN2 variants tagged at the C-terminus with either 3xFLAG (GCN2_3xFLAG) or V5 (GCN2_V5). Lysates (lanes 1-3) and immunoprecipitates (lanes 4-6). Note co-immunoprecipitation of each pair of constructs, consistent with dimer formation.
We showcased this approach by evaluating sixteen of the thirty-four known missense variants of GCN2 associated with PAH. Our observation that variants identified by genomic sequencing of individuals with classical PAH had preserved ISR reporter activity is consistent with GCN2 playing only a minor, or even no, role in that condition. Conversely, variants identified in individuals with either PVOD or PCH showed loss of ISR functionality, underlining the key role played by GCN2 in these disorders. The Y34C variant was a notable exception, maintaining some ISR reporter activity despite having been identified in an individual with PVOD. Of note, that patient also harboured a high-impact mutation of their second EIF2AK4 allele. In our study, GCN2 Y34C was expressed at a significantly reduced level compared to the wildtype protein, raising the possibility that when combined with a null allele, the level of GCN2 generated might be insufficient to prevent development of the disease.
Importantly, we identified a subset of GCN2 variants with preserved target engagement but reduced kinase activity. R585Q and V607G locate to the kinase domain N-lobe, which is dominated by an anti-parallel beta-sheet and contains most of the residues involved in ATP binding. This portion of the kinase also participates in dimerisation and stabilisation of the active parallel conformation [41]. Though our data exclude lack of dimerisation, it remains possible that mutations in the N-lobe could affect dimer activation. Conversely, G1109 and P1115, corresponding to G1085 and Q1091 in yeast, have been shown to be involved in tRNA binding [37]. The G1109R and P1115L mutations change mainchain flexibility and introduce bulky sidechains, which are likely to alter this interaction. Strikingly, these hypomorphs could be activated by the ATP-competitive inhibitor Gcn2iB, a type-1½ inhibitor that stabilises the enzyme in a non-productive, yet active-like conformation (DFG-in, αC helix-out). It is believed that such binding to one protomer of a GCN2 dimer causes activation of the second, drug-free protomer [39]. This suggests a potential therapeutic strategy for individuals with such hypomorphic variants. Hypomorphs can be identified with our ATF4::NanoLuc reporter assay by treatment with Gcn2iB.
The experimental validation of GCN2 variants enabled us to evaluate computational predictive methods. It is recognised that evolutionary-based methods relying on homologous sequence alignment (e.g. SIFT, MAPP, PANTHER) are outperformed as standalone tools by approaches that integrate additional information such as structural features [42]. We compared the integrative tools PolyPhen2 and CADD and found that out of sixteen variants they incorrectly predicted enzyme activity in 5 and 6 instances respectively, giving a positive predictive value of only approximately 60%. Indeed, functional predictions using these methods are not recommended for diagnostic purposes for this reason [43]. Accuracy was improved with machine learning-based approaches, but even these rarely exceed 80% accuracy [44]. It has been estimated that 75% of disease-causing variants are linked to protein destabilisation [45]. The computation of Gibbs free energy variations with FoldX was recently reported as the best performing method for the identification of disease-causing mutations [46]. FoldX predicted protein expression levels of GCN2 missense variants in our cell system, but the existence of stable but kinase-dead or hypomorphic variants limits its value. Nevertheless, integrating conservation-based approaches with structural features improves predictive performance. Meta-predictors, such as the ensemble method REVEL [28], are better correlated with benchmark clinical datasets like ClinVar (reviewed in [47]). Although better than other tools, REVEL only returned 75% accuracy in our study. Recently AlphaMissense was developed, a deep-learning AlphaFold-derived system that combines residue structural context with unsupervised modelling of evolutionary constraints by comparing related sequences, claiming 90% accuracy in predicting the pathogenicity of missense variants when tested against the ClinVar dataset [31]. When using ISR-reporter activation as our gold standard, AlphaMissense incorrectly assigned 3 of the 16 GCN2 variants examined. The complexity of modelling protein-protein and protein-ribosome interactions adds to the challenge of using computational approaches to classify GCN2 variants [16,35].
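As a worked illustration of the counts quoted above, the overall fraction of correctly assigned variants can be computed directly from the number of misassignments among the sixteen variants tested. This is a minimal sketch: the variable and function names are hypothetical, and the value computed is simple accuracy, which is not the same quantity as the positive predictive value mentioned in the text.

```python
TOTAL_VARIANTS = 16

# Misassignment counts as stated in the text, with ISR-reporter
# activation taken as the reference standard.
misassigned = {"PolyPhen2": 5, "CADD": 6, "AlphaMissense": 3}


def accuracy(errors: int, total: int = TOTAL_VARIANTS) -> float:
    """Fraction of variants assigned correctly."""
    return (total - errors) / total


accuracies = {tool: accuracy(n) for tool, n in misassigned.items()}

for tool, value in accuracies.items():
    print(f"{tool}: {value:.1%}")
```

By the same arithmetic, the 75% accuracy reported for REVEL corresponds to 12 of 16 variants assigned correctly.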
In summary, in cellulo evaluation of ISR signalling using an ATF4::NanoLuc reporter outperformed existing computational approaches. This approach can not only identify pathogenic variants but can also recognise hypomorphs that can be revitalised by an ATP-pocket-binding small molecule drug.

Cloning and plasmids
All cloning and mutagenesis were performed via Gibson Assembly. A human codon-optimised GCN2 ORF was cloned into pcDNA3.1-Hygro(+) vectors and tagged with either 3xFLAG or V5 at the C-terminus. Site-directed mutagenesis was performed with specific primers for 16 naturally occurring variants and a kinase-dead control. A human ATF4 5'UTR ORF was inserted into pGL4.2 vectors and cloned in frame with Nluc-PEST® luciferase. A stop codon was inserted before the C-terminal degron to allow accumulation of the reporter.

CRISPR/Cas9 knockout of EIF2AK4 in HeLa cells
Human EIF2AK4-specific guide RNAs were selected from the Brunello sgRNA library [48]. After primer duplex formation, guides were inserted in pSpCas9(BB)-2A-mCherry plasmids. Parental HeLa cells, cultured in DMEM supplemented with 10% FBS, were transfected in 6-well dishes using 1 μg DNA and Lipofectamine 2000 (1:3 ratio) in OptiMEM for 24 h. At day 3 post-transfection, mCherry-positive cells were sorted on a BD Melody cell sorter as single cells into 96-well plates. Clones were screened by GCN2-targeting western blotting. Genetic mutations in KO clones were confirmed by genomic DNA extraction (100 mM Tris, 5 mM EDTA, 200 mM NaCl, 0.25% SDS, 0.2 mg/ml Proteinase K: incubation at 50 °C overnight, then for 20 minutes at 98 °C and clarification in a benchtop centrifuge at 10 000 g for 5 minutes), PCR to amplify the locus targeted by the guide and subsequent NGS. Data were analysed using MacVector.

Transfection and cell treatments
2 × 10⁵ HeLa cells were plated in 6-well dishes and left to attach overnight before transfection. 1 μg of plasmid DNA was mixed with Lipofectamine 2000 (1:3 ratio) in OptiMEM and incubated for 20 minutes at room temperature. 500 μL of transfection medium was added onto the washed cells and topped up with an additional 500 μL of 10% FBS-DMEM. Transfection medium was removed after 24 h. On day 2, cells were split and transfected cells were selected by 300 μg/ml hygromycin treatment for 3 days. Cells were then maintained in 150 μg/ml hygromycin 10% FBS-DMEM. Cell treatments were carried out as follows: 7 mM histidinol for 7 h (for western blotting); amino acid starvation in SILAC medium supplemented with 10% dialysed FBS and 25 mM D-glucose (after 1× wash in PBS) lacking leucine only, leucine and methionine, or lysine and arginine, for 7 h.

Figure 4. Structure of GCN2 kinase domain with mapped residues mutated in patients. Human GCN2 kinase domain structure (from PDB accession: 7QWK) [41] highlighting the location of residues mutated in PAH, as well as K619 (in orange) mutated in the kinase-dead control. Mutations that led to the loss of kinase activity include A870V, located in the activation loop (in purple), L643R in the hinge region (in magenta), and S909R and R989W, which are in the core of the C-lobe (in magenta). Variants I839T and T943A (in cyan), localised at the edge of the kinase helix without introducing bigger sidechains, did not affect folding and retained kinase activity. Variants R585Q and V607G, localised in the N-lobe (in green), were hypomorphic. Bound ATP is shown in yellow.

Figure 6. In vitro GCN2 kinase assay. (A) Representative immunoprecipitation-immunoblots from cells expressing 3xFlag-tagged GCN2. Cells were starved of arginine and lysine for 6 h before harvesting. Intact GCN2 was eluted by competition via addition of excess FLAG peptide. Phosphorylation on threonine 899 reports autophosphorylation; note its absence for the K619R kinase-dead control. Note: for clarity, two blots are joined as indicated by vertical black lines between lanes 6 and 7. (B) Schematic of in vitro kinase assay drawn using Biorender.com. Tagged protein was immunopurified using anti-FLAG beads then eluted with FLAG peptide. Bacterially expressed recombinant eIF2α N-terminal domain (NTD) served as a specific substrate. (C) Representative immunoblots of reaction products. Note: for clarity, two blots are joined as indicated by vertical black lines between lanes 12 and 13. (D) Quantification of C; data presented with median value. n = 4; 1-way ANOVA, compared to hGCN2-transfected control. ****P < 0.0001.