A unique mRNA decapping complex in trypanosomes

Abstract Removal of the mRNA 5′ cap primes transcripts for degradation and is central for regulating gene expression in eukaryotes. The canonical decapping enzyme Dcp2 is stringently controlled by assembly into a dynamic multi-protein complex together with the 5′-3′ exoribonuclease Xrn1. Kinetoplastida lack Dcp2 orthologues but instead rely on the ApaH-like phosphatase ALPH1 for decapping. ALPH1 is composed of a catalytic domain flanked by C- and N-terminal extensions. We show that T. brucei ALPH1 is dimeric in vitro and functions within a complex composed of the trypanosome Xrn1 ortholog XRNA and four proteins unique to Kinetoplastida, including two RNA-binding proteins and a CMGC-family protein kinase. All ALPH1-associated proteins share a unique and dynamic localization to a structure at the posterior pole of the cell, anterior to the microtubule plus ends. XRNA affinity capture in T. cruzi recapitulates this interaction network. The ALPH1 N-terminus is not required for viability in culture, but is essential for posterior pole localization. The C-terminus, in contrast, is required for localization to all RNA granule types, as well as for dimerization and interactions with XRNA and the CMGC kinase, suggesting possible regulatory mechanisms. Most significantly, the trypanosome decapping complex has a unique composition, differentiating the process from that of opisthokonts.


INTRODUCTION
Eukaryotic 5′-3′ mRNA decay is initiated by removal of the poly(A) tail by a deadenylation complex, followed by 5′-end decapping and 5′-3′ exoribonucleolytic degradation. It is the dominant mRNA decay pathway in many eukaryotes, including yeast and Kinetoplastida, and is highly conserved.
The Kinetoplastida separated early from the main eukaryotic lineage but retain a conserved deadenylation complex (CAF1/NOT (1, 2)) and a conserved 5′-3′ exoribonuclease (XRNA (3, 4)). However, the decapping reaction is mechanistically distinct, as Kinetoplastida lack orthologues of the canonical decapping enzyme, the nudix domain protein Dcp2, and of all associated factors (Dcp1, Edc1-3 and Pat1) that are present in other eukaryotes. Instead, the ApaH-like phosphatase ALPH1 (Tb927.6.640; UniProt ID: Q583T9) is the decapping enzyme in T. brucei (5) and likely in all Kinetoplastida (6). A fraction of ALPH1 colocalizes with XRNA to a granular structure at the posterior pole of the cell (the PP-granule) that is devoid of most other RNA metabolizing proteins (5) and has no known function. ApaH-like phosphatases are unrelated to nudix domain proteins but originate from the bacterial ApaH protein, a subgroup of the family of phosphoprotein phosphatases (PPP) (7, 8).
ApaH-like phosphatases are present in all major branches of the eukaryotic lineage, albeit absent from some taxa, for example mammals and land plants (6-8). Removal of an mRNA cap by ApaH-like phosphatases is novel in eukaryotes, and recent bioinformatics analysis indicates that it is probably restricted to Kinetoplastida (6). There is evidence for a bacterial origin of this mechanism, as some bacterial mRNAs are capped with a nucleoside-tetraphosphate cap under certain conditions and these caps are removed by ApaH, suggesting that Kinetoplastida continue to use a prokaryote-derived mechanism (9-11).
Eukaryotic ApaH-like phosphatases remain poorly studied. To the best of our knowledge, apart from T. brucei ALPH1, only the Ppn2 ApaH-like phosphatase of S. cerevisiae has been experimentally characterized; it is an endopolyphosphatase located within the vacuolar lumen (12). The substrate range of ApaH/ApaH-like phosphatases appears rather broad. Specifically, the bacterial ancestor enzyme ApaH cleaves pyrophosphate bonds of NpnN nucleotides where n ≥ 3 (13-15) and of mRNAs capped with NpnN (9, 11), and also possesses phosphatase and ATPase activity (16). Further, we recently demonstrated that three randomly chosen eukaryotic ALPH enzymes from three different eukaryotic lineages all possess mRNA decapping activity in vitro, even though all have predicted or demonstrated mitochondrial localizations, rendering it unlikely that cytoplasmic mRNAs are their physiological substrates (6). Despite this, Kinetoplastida ALPH1 has a highly specific role in mRNA decapping (5).
The Kinetoplastida decapping ALPH architectures are distinct from other eukaryotic ALPH proteins in possessing unique C-terminal extensions of ∼250 amino acids, and most also possess a unique N-terminal extension of between ∼200 and 500 amino acids. These extensions contain no identifiable motifs or sequence similarities to other proteins, but are implicated in mediating regulation and substrate specificity (5, 6). In contrast, the vast majority of eukaryotic ApaH-like phosphatases, like their bacterial ancestor, consist of just the catalytic domain (6).
Here, we define the composition of the T. brucei ALPH1 multi-protein decapping complex, investigate the functions of the unique C- and N-terminal extensions of ALPH1 and provide a model for function and regulation of the complex. The C-terminus of ALPH1 is more important for ALPH1 function, protein interaction and localization than the N-terminus, which is dispensable in in vitro culture. Proximity labelling identified a cohort of ALPH1-interacting proteins: with the exception of XRNA, all are unique to Kinetoplastida, and some interactions require either the N- or C-terminal domain of ALPH1. Confirmation by reverse isolations and localization studies, together with demonstration of conservation of complex composition in Trypanosoma cruzi, suggests a trypanosome decapping complex with intriguingly unique composition and hence mechanism.

Structural analysis
Secondary structure and disorder predictions were done using PredictProtein (17). A predicted structural model created with the AlphaFold Monomer v2.0 pipeline (18) was downloaded from the AlphaFold DB, version 27 January 2022 (19). Molecular graphics and analyses were performed with UCSF ChimeraX (20).
Proteins were loaded onto HiTrap Q HP columns (WT, ΔN) or HiTrap Heparin HP columns (cat, ΔC*) equilibrated with buffer A (50 mM Tris pH as dilution buffer, 100 mM NaCl, 10% glycerol, 0.5 mM TCEP). The protein was eluted with a gradient of 100 mM to 500 mM NaCl in the same buffer. Protein-containing fractions were pooled and diluted 1:1 with dilution buffer at pH 7.5, and the His-SUMO tag was removed by overnight incubation with 200 µg self-made SUMO protease at 4 °C. The last stage of protein purification was size exclusion chromatography performed on a Superdex 200 Increase 10/300 GL (GE Healthcare) (WT, ΔN) or Superdex 75 10/300 GL (GE Healthcare) (ΔC*, cat) pre-equilibrated in 50 mM HEPES pH 7.4, 150 mM NaCl, 5% glycerol, 0.5 mM TCEP. After purification, all proteins were flash frozen in liquid nitrogen and kept at -80 °C.

Size-exclusion chromatography-multiangle light scattering (SEC-MALS)
Size exclusion chromatography coupled with multiangle light scattering (MALS) was used for protein molecular mass determination. The experiment was performed using an NGC Scout 10 Medium-Pressure Liquid Chromatography System (BioRad) connected to a miniDAWN TREOS light scattering detector (Wyatt Technologies) and a RefractoMax 520 Refractive Index Detector (ERC Inc). Full-length ALPH1, ΔN, ΔC* and cat (100 µl each) were injected at 5 mg/ml onto a Superdex 200 Increase 10/300 GL column (GE Healthcare) and run at 15 °C with a flow rate of 0.4 ml/min in PBS buffer. The data were analysed using Astra 7 software.

Trypanosomes and cell culture
Trypanosoma brucei Lister 427 procyclic cells were used for most experiments. All overexpression and RNAi experiments were done in Lister 427 pSPR2.1 cells that express a TET repressor (21); expression is induced with tetracycline (1 µg/ml for 24 h). Cells were cultured in SDM-79 (22) at 27 °C and 5% CO2. Transgenic trypanosomes were generated by standard procedures (23), using the Amaxa Nucleofector (Lonza Cologne AG, Germany) with home-made transfection buffer (24). All experiments used logarithmically growing trypanosomes. For starvation, one volume of cells was washed once in one volume PBS and cultured for 2 h in one volume PBS. Heat shock was done for 2 h at 41 °C in either a thermoblock or a water bath. Actinomycin D was used at 10 µg/ml from a 1:1000 stock in DMSO, sinefungin at 2 µg/ml from a 1:1000 stock in water and cycloheximide at 50 µg/ml from a 1:100 stock in water. Monomorphic bloodstream forms of T. brucei strain Lister 427 expressing VSG221 were cultivated at 37 °C in 5% CO2 in HMI-9 containing 10% (v/v) fetal bovine serum; transfections were done as in procyclic cells.
Trypanosoma cruzi Dm28c (25) epimastigote cells were used for all experiments. Cells were cultured at 28 °C in liver infusion tryptose (LIT) medium supplemented with 10% heat-inactivated fetal bovine serum. All experiments used logarithmically growing trypanosomes after 3 days of culture. Transgenic trypanosomes were generated by transfection using the Amaxa Nucleofector (Lonza Cologne AG, Germany) as previously described (26).

Plasmids and cloning
For endogenous tagging of proteins with different tags we used either the tagging system from (27) or the PCR-based approach from (28). For inducible overexpression we used a T7-polymerase independent system based on the PARP promoter regulated by a TET operator (21). A full list with details on plasmids/PCR products used for the generation of all transgenic cell lines with tagged proteins described in this work is available as Supplementary Table S4. To delete one ALPH1 allele, a blasticidin resistance cassette was flanked by the 560 nts upstream of the ALPH1 ORF and 607 nts downstream of the ALPH1 ORF to allow deletion of ALPH1 by homologous recombination (= SK390). For replacing one endogenous allele with ALPH1ΔN, we cloned the 550 nts upstream of the ALPH1 ORF, a neomycin resistance cassette, the β-α tubulin intergenic region and the nts corresponding to amino acids 222-478 of ALPH1 (= SK464) into a pJET cloning vector; the entire cassette was released with restriction enzymes at both sites and used for transfection. The control for the TurboID experiment was an eYFP-TurboID-HA fusion protein that was inducibly expressed (21) (= SK547). The ALPH1 RNAi plasmid was previously described (5).
For episomal expression of GFP-tagged TcXRNA in T. cruzi, the PCR product was cloned into the pDONR221™ vector from Gateway Technology (Invitrogen) and was then recombined into a pTcGWGFP N NH vector (29). Transfections were performed as described above and cells were selected with geneticin (50 µg/ml).

Preparation of trypanosomes for imaging / HaloTag®
Trypanosomes expressing fluorescent proteins were fixed overnight in 2.8% paraformaldehyde (PFA) in PBS and washed in PBS prior to imaging. To visualize the CMGC kinase fused to HaloTag, live cells were washed once in SDM-79, incubated with 1 µM HaloTag® TMR Ligand (Promega) in SDM-79 for 15 min, washed twice in PBS, fixed in 4% PFA for 20 min at RT, washed once with PBS / 500 mM glycine, washed once with PBS and imaged. Cells were mixed 1:1 with DAPI (4′,6-diamidino-2-phenylindole, 5 µg/ml) on the slide prior to imaging.

FISH experiments combined with immunofluorescence
Fluorescence in situ hybridization and immunofluorescence were done as previously described (30), except that we used oligos with a direct fluorophore (Cy3 on both ends). The sequence used to probe for the poly(A) tail was T30; the sequence used to probe for the miniexon was CAATATAGTACAGAAACTGTTCTAATAATAGCGTT. The respective sense oligos served as controls.

Western blots and antibodies
Western blots were performed using standard procedures. Biotinylated proteins were detected with IRDye800 CW streptavidin and the HA tag was detected with rat anti-HA (3F10, Sigma). ALPH1 antiserum was raised against recombinant ALPH1ΔN protein fused to an N-terminal 6xHis tag, which was expressed in Rosetta™ (DE3) competent E. coli (Novagen). ALPH1ΔN was extracted from the soluble fraction of the bacterial cell lysate via nickel affinity using standard procedures and used to immunize rabbits. Serum was affinity-purified against recombinant ALPH1ΔN protein and specificity was tested on both Western blot and immunofluorescence, using an ALPH1 RNAi cell line as well as transgenic ALPH1 cell lines as controls (Supplementary Figure S2). Unfortunately, the antibody lost activity after 2 years and is no longer available.

Cryomill affinity capture of protein complexes from T. brucei
XRNA was expressed as a C-terminal mNeonGreen fusion and ALPH1 as an N- or C-terminal eYFP fusion, all from the endogenous locus in procyclic T. brucei cells. Cells were harvested and subjected to cryomilling and affinity capture, essentially as described previously, except that for XRNA-mNeonGreen, mNeonGreen-Trap magnetic agarose (Chromotek) was used (32, 33). In brief, 2 litre cultures of PCF trypanosomes at a density of approximately 8 × 10^6 cells/ml were harvested at 1500 × g and washed once with serum-free SDM-79. Resuspended cells were then sedimented by centrifugation (1500 × g) into a capped 20 ml syringe placed in a 50 ml Falcon tube. After discarding the supernatant, inserting the plunger and removing the cap, the cells were passed slowly into liquid nitrogen in order to form small pellets suitable for subsequent cryomilling. These cell pellets were processed by cryomilling at 77 K into a fine powder in a modified planetary ball mill (Retsch) (34). Six smidgen spoons of cell powder were suspended in 6 ml ice-cold buffer A (20 mM HEPES pH 7.4, 250 mM NaCl, 0.5% CHAPS, complete EDTA-free protease inhibitor cocktail (Roche)), sonicated with a microtip sonicator (Misonix Ultrasonic Processor XL) at setting 4 (∼20 W output) for 5 × 1 second and transferred into six Eppendorf LoBind tubes, and insoluble material was removed by centrifugation (20 000 × g, 10 min, 4 °C). The clear lysate was incubated with 3 µl of either mNeonGreen-Trap magnetic agarose (Chromotek) or GFP-Trap magnetic agarose (Chromotek) for 30 min on a rotator, then washed three times with buffer A. After pooling the 6 samples, captured protein was eluted by incubation in 30 µl 4× NuPAGE LDS sample buffer (ThermoFisher), supplemented with 2 mM dithiothreitol, at 72 °C for 15 minutes and then run 1.5 cm into a NuPAGE Bis-Tris 4-12% gradient polyacrylamide gel (ThermoFisher).
The respective gel region was sliced out and subjected to tryptic digest and reductive alkylation using standard procedures. Eluted proteins were analysed by LC-MSMS on an Ultimate3000 nano rapid separation LC system (Dionex) coupled to an Orbitrap Fusion or Q Exactive mass spectrometer (Thermo Fisher Scientific). At least 3 replicate experiments were performed. Wild type cells (at least 3 replicates) served as control.

Affinity capture of protein complexes from T. cruzi
Cytoplasmic lysates from T. cruzi Dm28c expressing TcXRNA-GFP or TcALPH1-GFP were generated after cell disruption through cavitation as described (35), with modifications. Logarithmically growing cells were harvested by centrifugation at 3,000 × g at RT, washed once with ice-cold PBS and resuspended in lysis buffer (20 mM HEPES-KOH, pH 7.4, 75 mM potassium acetate, 4 mM magnesium acetate, 2 mM DTT, supplemented with cOmplete™ Protease Inhibitor Cocktail from Merck) to a concentration of 1 × 10^9 cells/ml. The resuspended cells were transferred into the Cell Disruption Vessel 4639 (Parr) and incubated at 4 °C under 70 bar pressure for 40 min, followed by rapid decompression. The lysates were checked microscopically for completion of cell lysis and then centrifuged at 17,000 × g for 10 min to remove cellular debris.
The immunoprecipitation assays were performed using anti-GFP nanobodies (36) bound to Dynabeads® M-270 Epoxy magnetic beads, as previously described (34). 1 ml lysate was incubated with 3 µl anti-GFP magnetic beads at 4 °C under agitation for 1-2 h. The beads were washed three times with lysis buffer and proteins were eluted by boiling in sample buffer for 5 minutes. For mass spectrometry, the eluted proteins were loaded onto 13% SDS-PAGE gels and allowed to migrate into the resolving gel. Gel slices containing the whole IP products were then excised and submitted to in-gel tryptic digestion and mass spectrometry analysis as previously described (37, 38).

Affinity enrichment of biotinylated proteins and on-beads tryptic digests
The expression of TurboID fusion proteins was induced for 24 h with tetracycline, unless expression was from the endogenous locus. No extra biotin was added, as we found the biotin concentration in the SDM-79 medium (827 nM) to be sufficient for high levels of biotinylation. 5 × 10^8 cells were harvested at a cell density of 1 × 10^6 to 10^7 cells per ml at 1400 × g and washed once with serum-free medium, and pellets were rapidly frozen in liquid nitrogen and stored at -80 °C. For isolation of biotinylated proteins, each cell pellet was resuspended in 1 ml lysis buffer (0.5% octylphenoxypolyethoxyethanol (IGEPAL), 0.1 M piperazine-N,N′-bis(2-ethanesulfonic acid) (PIPES)-NaOH pH 6.9, 2 mM ethylene glycol-bis(β-aminoethyl ether)-N,N,N′,N′-tetraacetic acid, 1 mM MgSO4, 0.1 mM ethylenediaminetetraacetic acid (EDTA), complete protease inhibitor cocktail (Roche)) and incubated for 15 min at room temperature in an orbital mixer. Soluble and non-soluble fractions were separated by centrifugation (14 000 × g, 5 min, 4 °C) and the soluble fraction was incubated with 100 µl streptavidin-linked Dynabeads (MyOne Streptavidin C1, Thermofisher) for 1 h at 4 °C under gentle mixing. Beads were washed twice in 1 ml buffer 1 (2% (w/v) SDS in water), once in 1 ml buffer 2 (0.1% (w/v) deoxycholate, 1% Triton X-100, 1 mM EDTA, 50 mM HEPES pH 7.5, 500 mM NaCl), once in 1 ml buffer 3 (250 mM LiCl, 0.5% IGEPAL, 0.5% (w/v) deoxycholate, 1 mM EDTA, 10 mM Tris-HCl pH 8.1) and once in 1 ml buffer 4 (50 mM Tris-HCl pH 7.4, 50 mM NaCl); each washing step was eight minutes at room temperature (RT) under orbital shaking. Beads were then prepared for tryptic digestion by washing three times in 500 µl ice-cold 50 mM NH4HCO3, resuspension in 40 µl of the same buffer supplemented with 10 mM dithiothreitol and incubation in a thermomixer at RT for 1 h. Iodoacetamide was added to a concentration of 20 mM, followed by incubation in the dark at RT for 30 min.
Finally, 5 µg/ml proteomics-grade trypsin (SOLu-Trypsin, SigmaAldrich) was added to the beads. The digest was done overnight at 30 °C in a thermomixer (1000 rpm). After removal of the first eluate, beads were resuspended in 50 µl 50 mM NH4HCO3 supplemented with 10 mM dithiothreitol and 5 µg/ml mass spectrometry (MS) grade trypsin and incubated in a thermomixer at 37 °C for 1 h. The eluate was combined with the first eluate, and both were lyophilized in a Speed-vac (Christ alpha 2-4). Peptides were resuspended in 50 mM NH4HCO3 and passed over C18 stage tip columns as described (39). After removal of polymers by the HiPPR procedure (Thermo Fisher Scientific), peptides were analysed by liquid chromatography-tandem mass spectrometry (LC-MSMS) on an Ultimate3000 nano rapid separation LC system (Dionex) coupled to an LTQ Q-exactive mass spectrometer (Thermo Fisher Scientific). LoBind tubes (Eppendorf) were used throughout. Triplicate experiments were performed for each cell line.

Analysis of proteomics data
Spectra were processed using the intensity-based label-free quantification (LFQ) in MaxQuant version 1.6.16 (40, 41). LFQ data were analysed using Perseus (42). For statistical analysis, LFQ values were log2 transformed and missing values were imputed from a normal distribution of intensities around the detection limit of the mass spectrometer. These values were subjected to a Student's t-test comparing an untagged control (wt parental cells) triplicate sample group to the bait triplicate sample groups.
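The preprocessing steps described above can be sketched in a few lines of Python. This is a minimal illustration and not the Perseus implementation: the function names and example intensities are hypothetical, and the imputation parameters (a normal distribution shifted down by 1.8 standard deviations with a width of 0.3 standard deviations) are assumed values, not settings reported in this study.

```python
import math
import random
import statistics


def log2_transform(intensities):
    """Return log2 intensities; zero or missing values become None."""
    return [math.log2(v) if v else None for v in intensities]


def impute_missing(values, downshift=1.8, width=0.3, seed=0):
    """Replace None by draws from a narrow normal distribution placed
    below the observed values (assumed to approximate the detection limit)."""
    observed = [v for v in values if v is not None]
    mean = statistics.mean(observed)
    sd = statistics.stdev(observed)
    rng = random.Random(seed)
    return [
        v if v is not None else rng.gauss(mean - downshift * sd, width * sd)
        for v in values
    ]


def student_t(a, b):
    """Two-sample Student's t statistic (equal variances assumed).
    Returns (difference of means, t statistic, degrees of freedom)."""
    na, nb = len(a), len(b)
    diff = statistics.mean(a) - statistics.mean(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return diff, diff / se, na + nb - 2


# Hypothetical LFQ intensities: rows are proteins, columns are samples
# (three untagged controls followed by three bait replicates).
lfq = [
    [1.0e6, 1.2e6, 0.9e6, 3.9e7, 4.2e7, 4.0e7],  # enriched with the bait
    [0, 0, 0, 8.0e6, 7.5e6, 9.1e6],              # not detected in controls
    [5.0e7, 5.2e7, 4.9e7, 5.1e7, 5.0e7, 5.3e7],  # unchanged background
]

# Imputation is done per sample (column), the t-test per protein (row).
columns = [impute_missing(log2_transform(col), seed=i)
           for i, col in enumerate(zip(*lfq))]
for row in zip(*columns):
    diff, t, df = student_t(row[3:], row[:3])
    print(f"log2 difference {diff:.2f}, t = {t:.2f}, df = {df}")
```

Imputing per sample rather than per protein keeps the replacement values tied to the intensity distribution of each individual measurement, which is the design choice assumed here.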
For BioID, a second control, cells expressing an eYFP-TurboID fusion, served to identify proteins biotinylated in a non-specific manner. The -log10 t-test p-values were plotted versus the t-test difference to generate multiple volcano plots (Hawaii plots). Potential interactors were classified according to their position in the Hawaii plot, applying cut-off curves for significant class A (SigA; FDR = 0.01, s0 = 0.1), significant class B (SigB; FDR = 0.05, s0 = 0.1) and significant class C (SigC; FDR = 0.05, s0 = 2.0). The cut-off is based on the false discovery rate (FDR) and the artificial factor s0, which controls the relative importance of the t-test p-value and the difference between means (at s0 = 0 only the p-value matters, while at non-zero s0 the difference of means also contributes).
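The effect of s0 can be illustrated with the s0-modified t statistic known from significance analysis of microarrays, d = (mean_a - mean_b) / (se + s0). The sketch below assumes that this is the form of the statistic underlying the cut-off curves; the function name and the example values are hypothetical and are not data from this study.

```python
import math
import statistics


def modified_t(a, b, s0=0.0):
    """s0-modified two-sample t statistic: difference of means divided by
    (pooled standard error + s0). With s0 = 0 this is Student's t."""
    na, nb = len(a), len(b)
    diff = statistics.mean(a) - statistics.mean(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return diff / (se + s0)


# Hypothetical log2 LFQ values (bait, control), three replicates each.
small_but_tight = ([20.30, 20.31, 20.29], [20.00, 20.01, 19.99])
large_but_noisy = ([26.0, 24.0, 25.0], [20.0, 21.0, 19.0])

for s0 in (0.0, 0.1, 2.0):
    d_small = modified_t(*small_but_tight, s0=s0)
    d_large = modified_t(*large_but_noisy, s0=s0)
    print(f"s0 = {s0}: small difference d = {d_small:.2f}, "
          f"large difference d = {d_large:.2f}")
```

At s0 = 0 the highly reproducible 0.3 log2 difference scores above the noisier 5 log2 difference, whereas at s0 = 2.0 the ranking reverses, because the added constant penalizes small differences of means.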
For XRNA affinity capture and for the CMGC-kinase BioID, enrichment ratios were calculated in addition to the statistical analysis.

Modelling predicts the ALPH1 N-terminal extension as unstructured and the C-terminal extension as α-helical
T. brucei ALPH1 comprises 734 amino acids and consists of a catalytic domain flanked by N-terminal and C-terminal extensions of similar size (Figure 1A). We analysed the sequence and predicted structure of ALPH1, which suggest that the N-terminal region is largely unstructured, whereas the C-terminal region contains a novel, structured domain connected to the central catalytic domain by a disordered linker (Figure 1B). An AlphaFold model of ALPH1 is in agreement with these observations (Figure 1C). The N-terminal extension, up to residue 250, is predicted as a continuous disordered region with a per-residue confidence score (pLDDT) < 50 (omitted in Figure 1C), suggesting that this region is unstructured (18). The catalytic domain is predicted to closely resemble the crystal structure of T. brucei ALPH2 (PDB entry 2QJC (47)) and is followed by a linker containing an additional α-helix (residues 528-542) that docks in the vicinity of the active site and possibly regulates accessibility. The active site residues of the trypanosome ALPH1 model superimpose with the T. brucei ALPH2 active site (Figure 1D), including those residues coordinating divalent cations (6). The C-terminal region (598-734) was modelled as a small, structured domain comprised of seven α-helices. A PDBeFold (48) search with the latter domain model revealed structural similarity to the Ge-1 domain of Drosophila melanogaster Edc4 (PDB entry 2VXG; 105-residue structural alignment; Q-score = 0.28) (49). Edc4 (enhancer of decapping 4) is a metazoan-specific decapping factor that bridges the Dcp1-Dcp2 interaction while interacting with XRN1 (50, 51). Whereas the position of the α-helix (528-542) at the C-terminus of the catalytic domain is predicted confidently, the relative orientation of the catalytic and C-terminal domains is uncertain, as indicated by predicted aligned error values in excess of 15 Å (Figure 1E). The linker connecting these two domains was predicted with low pLDDT score and is likely mostly unstructured. Although it cannot be excluded that the C-terminal domain interacts with and partially occludes the catalytic domain, thereby regulating accessibility, we conclude that the mutual positioning of these domains in the AlphaFold model is likely largely artificial, and we have no evidence suggesting their interaction.

Figure 1 legend: Secondary structure predictions for ALPH1. The N-terminal region of ALPH1 is predicted to be mostly unstructured, exposed and disordered, with the longest almost continuous disordered region at residues 57-166. The central region responsible for the catalytic activity contains a predicted α/β-domain which overlaps with the annotated Metallo-dependent phosphatase-like region (InterPro superfamily IPR029052, residues 253-519). The catalytic domain is followed by a disordered linker and a small C-terminal domain comprised of α-helices. (C-E) Structural model of ALPH1 predicted by AlphaFold. Shown in cartoon depiction are the catalytic domain (residues 253-543, in pink) and the C-terminal domain (residues 598-734, in rainbow). Regions with low model confidence (the N-terminal region and two disordered linkers at residues 421-437 and 544-597) were omitted. The central part of the catalytic domain overlaps well with the ALPH2 crystal structure (PDB entry 2QJC (47)), with 44% sequence identity in this region (residues 263-511, in hot pink). Highlighted are key residues forming the active site, coming from the 4 phosphatase motifs (in khaki) and 2 ALPH motifs (in teal) (6). The inset (D) shows superposition of these residues with those from ALPH2 (PDB entry 2QJC, in light blue), including two Mn2+ ions (purple), 3 water molecules (their oxygens in red) and a phosphate ion (orange). (E) The predicted aligned error for the AlphaFold model (available at the AlphaFold Protein Structure Database entry for UniProt ID Q583T9) indicates two domains with little confidence regarding their mutual orientation (values above 15 Å in the region corresponding to the inter-domain accuracy for the catalytic and C-terminal domains, indicated by the blue rectangle).

Figure 2 legend: Size exclusion chromatography coupled with multiangle light scattering was used for protein molecular mass determination of wild type ALPH1 (WT), ALPH1ΔN (amino acids 222-734), ALPH1ΔC* (amino acids 120-552) and ALPH1cat (amino acids 222-552). All proteins were expressed in E. coli fused to a His-SUMO tag and purified using nickel-affinity chromatography followed by ion-exchange chromatography (Q or Heparin column), and the His-SUMO tag was cleaved off with SUMO protease. All proteins were loaded at a concentration of 5 mg/ml. Full-length ALPH1 and ALPH1ΔN elute as dimers, while ALPH1ΔC* and ALPH1cat elute as monomers.

ALPH1 is a dimer and dimerization requires the C-terminus
We produced recombinant ALPH1 to characterize its oligomeric state. ALPH1 was expressed in E. coli fused to an N-terminal His-SUMO tag and purified using nickel-affinity chromatography. The tag was removed with a SUMO protease and the molecular mass of the recombinant protein was analysed by size exclusion chromatography coupled with multiangle light scattering (SEC-MALS) (Figure 2). ALPH1 has a predicted molecular weight of 79.3 kDa but elutes with a molecular weight of 166.7 ± 1.3 kDa, indicating it is dimeric. The experiment was repeated with ALPH1 lacking the N-terminal domain (ALPH1ΔN; amino acids 222-734, theoretical molecular weight 56.7 kDa) with similar results: ALPH1ΔN elutes at 118.5 ± 1.8 kDa and is thus also dimeric. ALPH1 truncations lacking the C-terminal domain, either with most of the N-terminus still intact (ALPH1ΔC*; amino acids 120-552, theoretical molecular weight 47.9 kDa) or without the N-terminus (ALPH1cat; amino acids 222-552, theoretical molecular weight 36.7 kDa), elute at 49.9 ± 0.5 and 36.8 ± 0.8 kDa, respectively, indicating that both proteins are monomeric. Hence ALPH1 is dimeric in solution and dimerization requires the C-terminus.
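The assignment of oligomeric states follows from the ratio of the measured SEC-MALS mass to the theoretical monomer mass. The short Python sketch below reproduces that arithmetic with the masses reported in the text; the function name and the construct labels are hypothetical, and rounding the ratio to the nearest whole number is an assumed simplification.

```python
def oligomeric_state(measured_kda, theoretical_monomer_kda):
    """Return (mass ratio, nearest whole number of subunits)."""
    ratio = measured_kda / theoretical_monomer_kda
    return ratio, max(1, round(ratio))


# (SEC-MALS mass, theoretical monomer mass) in kDa, as reported in the text.
constructs = {
    "full length (1-734)": (166.7, 79.3),
    "N-terminal truncation (222-734)": (118.5, 56.7),
    "C-terminal truncation (120-552)": (49.9, 47.9),
    "catalytic domain (222-552)": (36.8, 36.7),
}

for name, (measured, theoretical) in constructs.items():
    ratio, subunits = oligomeric_state(measured, theoretical)
    print(f"{name}: ratio {ratio:.2f} -> {subunits} subunit(s)")
```

Both constructs that retain the C-terminus give ratios close to 2, and both constructs lacking it give ratios close to 1.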

The N-terminus of ALPH1 mediates localization to the posterior pole but is dispensable in cultured trypanosomes or for targeting to stress granules
The N-terminus of ALPH1 is poorly conserved between Kinetoplastida (5) and entirely absent in the ALPH1 homologues of Trypanosoma grayi and Leptomonas pyrrhocoris (6). It is predicted as unstructured (Figure 1) and it is dispensable for dimerization (Figure 2). All the data point towards a non-essential function of this domain. To test this possibility, one allele of ALPH1 was replaced by an N-terminally truncated ALPH1 (ALPH1ΔN; 222-734) to create ALPH1ΔN/+, and the second allele was replaced by a blasticidin resistance gene to create ALPH1ΔN/-. Correct insertions were confirmed by western blotting (Supplementary Figure S1). Generation of clones of ALPH1ΔN/- cells was facile in two different trypanosome life cycle stages (bloodstream and procyclic forms), with only a minor proliferative defect (Figure 3A and B). We conclude that the N-terminus of ALPH1 is non-essential in culture, although the effect on proliferation indicates decreased fitness.
A hallmark of ALPH1 and XRNA is localization to the posterior pole (5, 31). Interestingly, immunofluorescence of ALPH1ΔN/- cells with ALPH1 antiserum (Supplementary Figure S2) indicated a complete loss of ALPH1 localization from the posterior pole (Figure 3C). To investigate further, we expressed ALPH1-eYFP and N-terminally truncated versions via inducible overexpression (21) in cells also expressing, from the endogenous locus, mChFP-DHH1, which does not localize to the PP-granule (Figures 4A and B). This resulted in an approximately 4-fold increase in the level of ALPH1-eYFP within 24 h of induction, but with no significant effects on ALPH1 localization, proliferation or global mRNA levels over 96 h (Supplementary Figure S3).
We quantified ALPH1 PP localization by dividing the percentage of ALPH1 fluorescence within a defined circular area at the posterior pole by the percentage of DHH1 fluorescence in the same area (Figure 4B). Any value above one indicates ALPH1 localization to the PP granule, as DHH1 does not localize to the PP. An N-terminal truncation of up to 141 amino acids showed no reduction in ALPH1 PP localization, but rather a significant increase from an average of 4.8 (WT ALPH1) to 7.4 (P = 9.5E-6) and 6.9 (P = 0.0004) for ALPH1(120-734) and ALPH1(142-734), respectively. Removing an additional 21 or more amino acids from the ALPH1 N-terminus leads to near complete loss of PP granule localization, indicating that posterior pole targeting requires amino acids 142 to 162 of the ALPH1 N-terminus.
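The localization score described above is a ratio of two percentages and can be written down directly. The following Python sketch assumes that each percentage is the summed fluorescence inside the posterior-pole region divided by the summed fluorescence of the whole cell; the function name, argument names and intensity values are hypothetical and serve only to illustrate the calculation.

```python
def pp_enrichment(alph1_pp, alph1_cell, dhh1_pp, dhh1_cell):
    """Percentage of ALPH1 signal in the posterior-pole region divided by
    the percentage of DHH1 signal in the same region."""
    alph1_percent = 100.0 * alph1_pp / alph1_cell
    dhh1_percent = 100.0 * dhh1_pp / dhh1_cell
    return alph1_percent / dhh1_percent


# Hypothetical summed intensities for one cell (arbitrary units).
score = pp_enrichment(alph1_pp=240.0, alph1_cell=10000.0,
                      dhh1_pp=50.0, dhh1_cell=10000.0)
print(f"PP enrichment score: {score:.1f}")

# A protein distributed like DHH1 scores 1, i.e. no PP enrichment.
print(pp_enrichment(50.0, 10000.0, 50.0, 10000.0))
```

Normalizing to DHH1 in the same area corrects for cytoplasmic signal that falls within the region by chance, so that only scores above one indicate enrichment at the posterior pole.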
Most RNA-binding proteins, including T. brucei ALPH1 and DHH1, localize to stress granules on starvation (5). Stress granules are aggregates of proteins and RNAs, and localization of a protein to them requires interactions with RNA and/or RNA-binding proteins. We quantified stress granule localization of ALPH1 and ALPH1 N-terminal truncations, using mChFP-DHH1 as a stress granule marker (Figure 4C). The general tendency in PP localization efficiency was unaffected by starvation, albeit there was a significant increase in PP localization (P << 0.0005) for the ALPH1 variants with efficient PP localization (WT, 120-734 and 142-734), indicating that starvation stress enhances PP localization. By contrast, when the fraction of ALPH1 in stress granules was quantified, we found that stress granule localization of most N-terminal ALPH1 truncations was similar to that of the wild type protein. The only exceptions were ALPH1(120-734) and ALPH1(142-734), which had significantly reduced (P << 0.0005) stress granule localization, likely caused by their increased localization to the PP granule.
In conclusion, while the N-terminal domain of ALPH1 is required for efficient localization to the posterior pole, it is required neither for proliferation in culture nor for localization to starvation stress granules, suggesting that it does not contain elements essential for function.

The C-terminus is required for efficient localization of ALPH1 to RNA granules
The C-terminus of ALPH1 is more conserved among Kinetoplastida than the N-terminus, is predicted to be α-helical and is required for ALPH1 dimerization (Figures 1 and 2). C-terminally truncated ALPH1 failed to efficiently localize to the posterior pole, similar to ALPH1 truncated at the N-terminus by 162 amino acids or more (Figure 4D and E). In contrast to N-terminal ALPH1 truncations, stress granule localization of ALPH1ΔC was also significantly reduced (P << 0.0005) (Figure 4F), indicating loss of interaction with RNA and/or RNA-binding proteins. Additional removal of the N-terminus caused a complete loss of both posterior pole localization and stress granule localization (Figures 4D-F). The C-terminus is thus required for ALPH1 localization to any type of granule, suggesting a major role in mediating ALPH1 interactions.

ALPH1 interacts with multiple partners
Dcp2, the canonical eukaryotic mRNA decapping enzyme, is part of a multisubunit complex. The N- and C-terminal regions of Dcp2 engage in multiple protein-protein interactions essential for recruitment of mRNAs and adoption of an active conformation (52). To investigate whether ALPH1 is similarly part of a larger complex, we examined the ALPH1 interactome by proximity labelling, fusing ALPH1 to TurboID biotin ligase by endogenous tagging. To gain domain resolution of interaction sites, we additionally carried out TurboID experiments with truncated versions of ALPH1 (ΔN2, ΔC, cat; compare Figure 4), which required overexpression. As a control, we included TurboID analysis of overexpressed full-length ALPH1. All proximity labelling experiments were controlled by parental, i.e. untagged, cells. In addition, cells expressing an eYFP-TurboID fusion served to identify proteins biotinylated by the TurboID enzyme in a non-specific manner. All fusion proteins also contained an HA-epitope tag.
All generated cell lines were verified by Western blotting using streptavidin to detect biotinylated proteins and anti-HA to detect the bait (Figure 5A). Biotinylated proteins were readily detectable in all lines except for the parental control. For all fusion proteins, the bait protein was among the most abundant biotinylated proteins. Cells expressing full-length ALPH1 from either the endogenous locus or via overexpression had a similar pattern of biotinylated proteins: the apparent band for the 116 kDa ALPH1 fusion protein was more dominant in the overexpresser line and, as expected, absent from cells expressing truncated ALPH1 versions. Cells expressing ALPH1ΔN2 and ALPH1ΔC exhibited distinct patterns of biotinylated proteins. In contrast, cells expressing the catalytic domain of ALPH1 alone had distinctly higher intensity and more extensive labelling. The eYFP-TurboID control showed considerable biotinylation, despite the absence of specific interactions, and was used in the following as a stringent control to define potential bystander labelling (53).
Next, all cell lines were subjected to streptavidin affinity purification followed by LC-MSMS analysis. Proteins enriched by BioID, compared to controls, were grouped into confidence intervals (SigA, SigB), based on statistical analysis in Perseus (42) (Supplementary Table S1A). To further eliminate false positives from low-level bystander labelling, all quantified protein groups were filtered to require at least [...] (Supplementary Table S1B).
Endogenously tagged ALPH1-TurboID yielded 70 significantly enriched proteins (16 in SigA, 54 in SigB). Overexpressed ALPH1-TurboID yielded 150 significantly enriched proteins (95 in SigA (red in Figure 5 [...] Table S1B); most were undetected with the eYFP control. The larger number of ALPH1cat candidate interactors compared to full-length ALPH1 is likely a result of a less defined localization of ALPH1cat (Figure 4).

A high-confidence list of ALPH1 interacting partners
To construct a high-confidence list of candidate interacting proteins for ALPH1 and its truncations, we included only proteins (i) present in both BioID experiments with the full-length ALPH1 (endogenous and overexpression) with at least SigB (67 protein groups, Supplementary Table S1C), (ii) absent from both SigA and SigB of the TurboID-eYFP control (47 of these 67 proteins), and (iii) either with a clear connection to mRNA metabolism (experimentally proven or possessing a classical RNA binding domain; 21 of these 47 proteins) or absent from at least one of the ALPH1 truncations, suggesting a specific interaction with a domain (a further five of these 47 proteins). Additionally, we retained Tb927.7.3980 (Tc40-antigen like) because, despite detection with the TurboID-eYFP control (SigA), it has both posterior pole localization (44) and a connection to mRNA metabolism (44, 54, 55) and hence is potentially a genuine interactor. Moreover, we excluded a WD domain G-beta repeat protein, Tb927.4.960, as this protein is identified in many BioID experiments with unrelated proteins (e.g. Mex67 and NUP158 (53)) and is enriched in the eYFP control (albeit below the significance interval), indicating a likely common BioID contaminant. The final list contains 26 proteins, including ALPH1 (Figure 6A and Supplementary Table S1D).
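The selection criteria above can be read as a chain of filters followed by two manual curation steps. The following is a minimal Python sketch of that logic, not the pipeline actually used for the analysis: the `Protein` record, its field names, the `high_confidence` function and all significance values in `TOY` are hypothetical and chosen only for illustration (only the SigA assignment of Tb927.7.3980 in the eYFP control is taken from the text).

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set

SIGNIFICANT = {"SigA", "SigB"}


@dataclass(frozen=True)
class Protein:
    """Hypothetical per-protein summary of the BioID results."""
    gene_id: str
    sig_endogenous: Optional[str]     # group with endogenously tagged full-length bait
    sig_overexpressed: Optional[str]  # group with overexpressed full-length bait
    sig_eyfp_control: Optional[str]   # group in the eYFP-TurboID control
    mrna_linked: bool                 # clear connection to mRNA metabolism
    absent_from_a_truncation: bool    # missing with at least one truncated bait


def high_confidence(
    proteins: Iterable[Protein],
    keep_despite_control: Set[str] = frozenset(),
    exclude: Set[str] = frozenset(),
) -> List[str]:
    """Apply criteria (i)-(iii) plus manual retention and exclusion lists."""
    selected = []
    for p in proteins:
        if p.gene_id in exclude:  # manually excluded likely contaminant
            continue
        # (i) at least SigB with both full-length baits
        in_both = (p.sig_endogenous in SIGNIFICANT
                   and p.sig_overexpressed in SIGNIFICANT)
        # (ii) not significant in the eYFP control, unless manually retained
        clean = (p.sig_eyfp_control not in SIGNIFICANT
                 or p.gene_id in keep_despite_control)
        # (iii) mRNA connection or domain-specific interaction
        specific = p.mrna_linked or p.absent_from_a_truncation
        if in_both and clean and specific:
            selected.append(p.gene_id)
    return selected


# Toy input; significance values are illustrative, not measured data.
TOY = [
    Protein("toy_rbp", "SigA", "SigB", None, True, False),
    Protein("Tb927.7.3980", "SigA", "SigA", "SigA", True, False),
    Protein("Tb927.4.960", "SigB", "SigB", None, False, True),
    Protein("toy_bystander", "SigA", "SigA", "SigB", True, False),
    Protein("toy_single_hit", "SigA", None, None, True, False),
    Protein("toy_unrelated", "SigB", "SigB", None, False, False),
]

curated = high_confidence(
    TOY,
    keep_despite_control={"Tb927.7.3980"},
    exclude={"Tb927.4.960"},
)
```

In this sketch the manual retention only overrides criterion (ii); a retained protein must still satisfy criteria (i) and (iii). That is one possible reading of the text rather than a documented rule.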

ALPH1 BioID hits with localization to the posterior pole
ALPH1 is distinct from most other trypanosome RNA-binding proteins by its localization to the posterior pole. Among the high-confidence ALPH1 BioID hits were five proteins with a posterior pole localization (44): XRNA, a CMGC-family protein kinase Tb927.10.10870, hypothetical proteins Tb927.11.3490 and Tb927.9.12070 and the Tc40 antigen-like protein.
XRNA has a known co-localization with ALPH1 to the posterior pole (5, 31) and its function in the 5′-3′ decay pathway is well established (3, 4). Interestingly, the ALPH1 BioID approach identified XRNA as a significant interactor, but exclusively with full-length ALPH1 or ALPH1ΔN. For ALPH1cat and ALPH1ΔC, XRNA enrichment is below the SigB threshold, at 10% for ALPH1ΔC and at 6% for ALPH1cat compared to full-length ALPH1 (Figures 5B and 6A). These data indicate an interaction between ALPH1 and XRNA dependent on the C-terminus of ALPH1.
Tb927.10.10870 encodes a CMGC-family protein kinase with no known connection to mRNA metabolism. Endogenous expression of a HALO-tag fusion protein of the kinase in a cell line expressing ALPH1-eYFP from the endogenous locus showed co-localization of both proteins to the posterior pole, most prominently after heat shock (Figure 6B). On starvation, an eYFP-fusion of the kinase co-localized with PABP2 to starvation stress granules, in addition to the posterior pole granule, a strong indication of a function in mRNA metabolism (Figure 6B). The domain interaction pattern of the CMGC kinase with ALPH1 resembles that of XRNA, i.e. significantly enriched with ALPH1ΔN as bait, but dramatically reduced with ALPH1ΔC and absent with ALPH1cat (Figure 5B). In contrast to XRNA, the kinase is also less enriched with ALPH1ΔN (SigB instead of SigA), suggesting that both ALPH1 termini are important for the interaction, but with a more significant C-terminal contribution.
The 87 kDa protein encoded by Tb927.11.3490 has no predicted domains, while the 109.0 kDa protein product of Tb927.9.12070 has a C-terminal predicted ATP-dependent RNA helicase domain. Both proteins co-purify with oligo(dT) beads, indicating mRNA association (55). We expressed these proteins fused to a C-terminal eYFP tag and confirmed heat-shock inducible relocalization to the posterior pole with ALPH1 and co-localization with PABP2 to starvation stress granules (Figure 6C). Both proteins are enriched in BioID isolations with all ALPH1 truncations, indicating that they likely bind the ALPH1 catalytic domain.
Tc40-antigen-like protein is the T. brucei ortholog of the Tc40 antigen from the related parasite T. cruzi, an immunodominant antigen in chronic Chagas disease patients (56). In T. brucei, many lines of evidence support a function for this protein in mRNA metabolism: Tc40 is enriched in purified starvation stress granules (57), localizes to nuclear periphery granules, a special RNA granule type of trypanosomes that forms at the cytoplasmic site of nuclear pores when trans-splicing is inhibited (54), and is captured by oligo(dT) beads (55).

ALPH1 interactors specific to defined ALPH1 regions and domains
Most proteins identified with the BioID experiments are enriched with all ALPH1 truncations, including the isolated catalytic domain, but eight proteins, including XRNA and the CMGC-family kinase (discussed above), are absent or significantly less enriched with the ALPH1 catalytic domain when compared to full-length protein, indicating domain-specific interactions (Figure 6A). Of this cohort, two proteins are also absent from potential ALPH1ΔN interactions, indicating an interaction requiring the N-terminal domain: the Rtr1/RPAP2 family protein (Tb927.10.2180) and a nuclear LIM interactor (NLI)-interacting factor-like phosphatase (Tb927.9.14120). Rtr1/RPAP2 has not been identified by any mRNA-related screen and localizes to the cytoplasm with both N- and C-terminal tagging (44). In contrast, the NLI interacting factor-like phosphatase contains a CCCH motif (58), has been identified as a post-transcriptional repressor (55, 59), localizes to a trypanosome-specific RNA granule type at the nuclear periphery (54) and thus is implicated in mRNA metabolism. The protein localizes to granules (44); we expressed it as a C-terminal eYFP fusion together with the stress granule marker PABP2 fused to mChFP and confirmed that these granules are starvation-induced stress granules (Supplementary Figure S5A).
Figure 6. [...] Table S1D). The color scheme indicates whether a protein was identified with the respective ALPH1 BioID bait and in which significance group (green/red), whether the protein is involved in mRNA metabolism (light brown), and whether the protein is known to localize to either the posterior pole (PP, light blue) or to starvation stress granules (SGs, dark blue) as judged by TrypTag (44) or own data (published here or elsewhere). None of the proteins was detected or enriched with the eYFP control, with the exception of the Tc40-antigen like protein. (B) Tb927.10.10870 was expressed as a HaloTag® fusion and stained with TMR in a cell line also expressing ALPH1-eYFP from the endogenous locus. Representative images of one untreated and one heat-shocked cell (2 h at 41 °C) are shown (Z-stack projection, sum slices of 5 slices of 140 nm); note that colors are switched for clarity. Tb927.10.10870 fused to eYFP was also co-expressed with PABP2-mChFP (a marker for starvation stress granules (31)) and starvation was induced by incubation in PBS for 2 h. Images are presented as deconvolved Z-stack projections (sum slices). (C) Tb927.9.12070 and Tb927.11.3490 were expressed fused to a C-terminal eYFP tag from the endogenous locus in a cell line also expressing ALPH1-mChFP or the stress granule marker protein PABP2-mChFP from the endogenous locus. Fluorescent images of representative cells are shown under untreated conditions, after 2 h heat shock at 41 °C and after 2 h of PBS starvation as projections (sum slices) of deconvolved Z-stacks. Note that the autofluorescence of the lysosome is visible in the red channel, because the ALPH1-mChFP fluorescence is very weak.
Of the four proteins potentially interacting with both the C- and N-terminal region of ALPH1, but not the catalytic domain, two have clear connections to mRNA metabolism: the pumilio protein PUF2 is essential (60), coprecipitates with oligo(dT) and acts as a co-transcriptional repressor (55). The DEAD box RNA helicase DHH1 plays a role in life-cycle-dependent mRNA regulation (61). The remaining two proteins appear unrelated to mRNA metabolism: the serine threonine phosphatase 2A (Tb927.3.1240) has not been localized by TrypTag and, in our C-terminal endogenous tagging, we saw no evidence for localization to the posterior pole granule or starvation stress granules (Supplementary Figure S5B and S5C). The Roadblock/LC7 domain containing protein Tb927.11.16540 has axonemal localization (44) and thus a potential function in flagellar transport. We conclude that these latter two proteins are likely mis-identifications and not components of the decapping complex.

The trypanosome mRNA decapping complex
Two ALPH1 BioID interactors were selected for confirmation: the CMGC-type protein kinase and XRNA. XRNA functions downstream of ALPH1, colocalizes with ALPH1 to the posterior pole, and the yeast ortholog XRN1 is a known Dcp2 interactor (62, 63): an interaction is therefore likely. The CMGC-type protein kinase resembles XRNA in its localization to the posterior pole and in its interaction being confined to the ALPH1 C-terminal domain; its function is unknown.
The CMGC kinase interactome was determined with BioID, expressing the protein from its endogenous locus as a C-terminal TurboID-HA fusion. Biotinylated proteins were readily detectable on a Western blot probed with streptavidin (Supplementary Figure S6) and mass spectrometry identified 216 protein groups (Supplementary Table S2A). Of these, 54 protein groups were enriched in comparison to the control BioID (parental cells) and identified in at least two of three replicates (Figure 7A; Supplementary Table S2B). From these 54 proteins we excluded all proteins also identified with the eYFP control in SigA or B, with the exception of the Tc40 antigen-like protein (see above). From the remaining 44 proteins, we excluded all proteins that were less than 2.5-fold enriched. The final list of putative CMGC-type kinase interactors contains 22 proteins (Supplementary Table S2C). Only five proteins of this list fall into the SigC group (FDR = 0.05, s0 = 2.0): besides the bait protein, these are XRNA, the two ALPH1 catalytic domain interactors Tb927.11.3490 and Tb927.9.12070 and the Tc40-antigen-like protein. The other 17 proteins fall below the SigC significance interval, but include seven additional proteins in the final ALPH1 BioID list, including ALPH1. The lower significance of the latter protein cohort can be explained by low abundance of the protein kinase.
To investigate the XRNA interactome, we used immunoprecipitation. XRNA was expressed fused to a C-terminal mNeonGreen tag from its endogenous locus (31). Cells were broken by cryo-milling and the XRNA complex captured by mNeonGreen affinity immunoisolation. Mass spectrometry detected 1484 protein groups in the eluate (Supplementary Table S3A), out of which 223 were enriched in comparison to parental cells and had at least two quantification values within the four bait samples (Supplementary Table S3B). Of these, 14 were considered significantly enriched (FDR = 0.05, s0 = 2.0; Supplementary Table S2C), including the bait. This cohort of 13 high-confidence XRNA interactors included five proteins that were also in the ALPH1 BioID list: ALPH1 itself, SCD6, RNA helicase Tb927.4.3890 and the two RNA binding proteins Tb927.11.3490 and Tb927.9.12070 (Figure 7B, left). Notably, the CMGC-kinase was not identified: one possible explanation is a transient or less stable interaction. The Tc40 antigen like protein was enriched, but below the chosen significance threshold.
Altogether, these data provide compelling evidence for an mRNA decapping complex that contains ALPH1, XRNA and the CMGC-family kinase. Of 22 putative CMGC-family kinase interactors, 12 were also identified for ALPH1 and five for T. brucei XRNA. Of the 13 T. brucei XRNA interactors, five were in both the ALPH1 and CMGC-kinase BioID isolations. To define a robust core of the putative T. brucei mRNA decapping complex, we included only proteins identified with all three bait proteins. A so-defined mRNA decapping complex consists of six proteins, all with posterior pole localization: ALPH1, XRNA, the CMGC-family kinase, the RNA binding proteins Tb927.11.3490 and Tb927.9.12070 and the Tc40-antigen-like protein (coloured in Figure 7C). Two further proteins are potential complex components, but with less robust support (dashed circles in Figure 7C): the RNA helicase Tb927.4.3890 and SCD6 do not localize to the posterior pole and are below significance in the CMGC-kinase BioID. The ortholog of the helicase is also absent from the T. cruzi XRNA pulldown. A summary of all ALPH1 interactions discovered in this work is shown in Figure 7D.
Figure 7. [...] Overlapping proteins from interactors of ALPH1, CMGC-type protein kinase and XRNA are shown as the most likely components of the ALPH1 core complex. Proteins with slightly less robust evidence for being complex subunits are shown in dashed circles. Note that the dimerization of ALPH1 via the C-terminus observed in vitro is not shown here for clarity. Moreover, we cannot exclude that binding sites of either ALPH1 domain are used in competition rather than simultaneously. (D) Summary of the ALPH1 interactome. ALPH1 is depicted as a schematic homodimer and all interacting proteins identified in this work are shown connected to the respective ALPH1 domain(s). Localization of proteins to the posterior pole and/or to stress granules is indicated at the left. (E) Evolution of distinct components in decapping complexes in Kinetoplastida. Coulson plot representation of subunit presence layered onto a simplified eukaryotic phylogeny, to emphasize subunit losses and replacements of subunits between Metamonads and Discoba. In teal are canonical subunits including Dcp1/2 and Edc, while in magenta are the subunits reported here as associated with ALPH1. Significantly, XRNA/XRN1 (orange) is retained by all lineages. The selective pressure and order of events, such as by a gradual or more catastrophic change, is unknown, but parallels several additional systems such as the lamina and kinetochore (78, 79). Significantly, in Metamonada, Heterolobosea and Euglenida subunit retention is sparse and may reflect the presence of an additional set of divergent components remaining to be identified. For a list of gene IDs for decapping complex component homologs see Supplementary Table S5.

Direct immunoprecipitation of ALPH1 delivers a potential sub-complex
While we confirmed the BioID-derived composition of the ALPH1 decapping complex by reverse experiments targeting the CMGC-type kinase and XRNA, we also attempted a direct affinity capture of ALPH1 from T. brucei and T. cruzi, under the same respective conditions applied for XRNA. With a C-terminal eYFP fusion, a tag that is well tolerated in an ALPH1 knockout background (5), most interactors, including XRNA, were absent and only Tb927.11.3490 and the Tc40-antigen like protein were captured (Supplementary Figure S10A, B; Supplementary Table S6). Fusing eYFP N-terminally to TbALPH1 led to similar results, delivering the latter two proteins and Tb927.9.12070, albeit at low enrichment (Supplementary Figure S10C; Supplementary Table S6). It is tempting to speculate that the ALPH1 fusion tag is fully buried in the rather large decapping complex, thus inaccessible to the nanobody and, as a consequence, only a complex with partial composition precipitates. The presence of such a subcomplex, co-existing in a spatially (or temporally) distinct manner, is conceivable and possibly biologically relevant.

The ALPH1 mRNA decapping complex is Kinetoplastida specific
ALPH1 is conserved among kinetoplastids, with the exception of the free-living Bodonid Bodo saltans (6), which does however encode orthologues of Tb927.11.3490, Tb927.9.12070, the Tc40 antigen-like protein and the CMGC-family kinase, indicating incomplete data as a potential reason for ALPH1 absence. Indeed, a tblastn search readily detected a genomic region (CYKH01001277 position 14 169-15 402) with a partial coding sequence homologous to T. brucei ALPH1 (R272-A551 (E-value: 2e-58); G607-L662 (E-value: 5.8)) that appears to be syntenic, as judged by the upstream presence of the evolutionarily conserved gene encoding the RRP45 exosome subunit (BSAL 92565 and Tb927.6.670) (64). Orthologs of the core interactors with PP localization are present in most Kinetoplastida genomes (Figure 7E). Tb927.11.3490 and the Tc40 antigen-like protein are not detected in any genome beyond the kinetoplastids and the other two interactors have unique sequence stretches: the N-terminal 390 residue serine/threonine kinase domain of the CMGC-family kinase is flanked by a C-terminal region of variable length within Kinetoplastida (Supplementary Figure S7A). Tb927.9.12070 shares partial similarity with the RNA helicase Tb927.4.3890 (Supplementary Figure S7B, C), an ALPH1 interactor which does not localize to the PP, but conditionally, to stress granules. This RNA helicase appears to be universally distributed, while Tb927.9.12070, again, has Kinetoplastida-specific regions (Supplementary Figure S7B). Interestingly, the shared sequence of approximately 520 residues between the two Kinetoplastida helicases includes the helicase C-terminal domain but excludes the helicase ATP-binding domain (Supplementary Figure S7C). XRNA and SCD6 are universally distributed (Supplementary Figure S7D) and are known Dcp2 interactors in animals and fungi (62).
Surprisingly, neither ALPH1 nor the four core interactors could be detected in Heterolobosea, Metamonada or Euglenids (only represented by the draft genome/transcriptome of Euglena gracilis). E. gracilis encodes an ApaH-like phosphatase, but it consists of the catalytic domain only and therefore is likely not functioning in mRNA decapping, but rather the orthologue to the non-cytoplasmic T. brucei [...] Figure S7E).

The posterior pole is a highly dynamic structure anterior of the microtubules plus end that is not enriched in mRNAs
All five proteins of the ALPH1 core complex have behaviours that distinguish them from other RNA metabolism proteins by virtue of posterior pole localization, indicating a potential function of this structure in mRNA decapping. We therefore characterized this structure in greater detail.
First, localization of overexpressed ALPH1-eYFP was monitored during the cell cycle (Figure 8A). The cell cycle stage of an individual cell can be determined from a DAPI-stained image according to the number and position of kinetoplast(s) (the DNA-containing structure within a single mitochondrion) and nuclei (65, 66). Kinetoplast duplication takes place prior to nuclear division, resulting in three main cell cycle stages: 1K1N (one kinetoplast and one nucleus), 2K1N (two kinetoplasts and one nucleus) and 2K2N (two kinetoplasts and two nuclei). After division, the posterior nucleus moves between the two kinetoplasts and the longitudinally occurring cytokinesis produces a posterior and an anterior sibling. The majority of cultured cells are 1K1N cells and all have a spot-like, ALPH1-positive posterior pole granule. In 2K1N cells, we observed gradual elongation of the ALPH1 spot with the progressing cell cycle, indicated by an increasing distance between the two kinetoplasts. In very late 2K1N cells and all 2K2N cells, ALPH1 was visible as a punctate string starting at the posterior pole. In late 2K2N cells this string terminated proximal to the kinetoplast of the anterior sibling. After cell division, the string remained visible in a fraction of 1K1N cells that were most likely posterior siblings from a recent cell division, as they were of a small size and had a non-dividing kinetoplast.
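The staging rule described above (kinetoplast and nucleus counts per DAPI-stained cell) is simple enough to express directly. The sketch below is illustrative only: the function names and the example counts are hypothetical, and the `"other"` category for unexpected K/N combinations is an assumption of this sketch, not a class used in the text.

```python
from collections import Counter
from typing import Iterable, Tuple

# The three main stages named in the text, keyed by (kinetoplasts, nuclei).
STAGES = {(1, 1): "1K1N", (2, 1): "2K1N", (2, 2): "2K2N"}


def cell_cycle_stage(kinetoplasts: int, nuclei: int) -> str:
    """Map a kinetoplast/nucleus count to a stage label, or 'other'."""
    return STAGES.get((kinetoplasts, nuclei), "other")


def tally_stages(cells: Iterable[Tuple[int, int]]) -> Counter:
    """Count how many scored cells fall into each stage."""
    return Counter(cell_cycle_stage(k, n) for k, n in cells)


# Hypothetical counts scored from DAPI images: (kinetoplasts, nuclei) per cell.
scored_cells = [(1, 1), (1, 1), (2, 1), (2, 2), (1, 1), (1, 2)]
stage_counts = tally_stages(scored_cells)
```

Note that the counts alone do not separate early from late 2K1N or 2K2N cells; in the text this is judged from the distance between the two kinetoplasts, which is not modelled here.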
This dynamic localization of ALPH1 in dividing cells is reminiscent of that previously described for the microtubule plus end marker protein XMAP215 (44, 67), as the plus ends of the subpellicular microtubule array lie at the posterior end of the cell (68). We examined whether XMAP215 and ALPH1 co-localize, with XMAP215 expressed as an N-terminally tagged mChFP fusion protein from the endogenous locus in an inducible ALPH1 overexpression cell line. Both XMAP215 and ALPH1 localized to a spot at the posterior pole with ALPH1 slightly more anterior than XMAP215 (Figure 8B). In 63% of 1K1N cells (26/41) ALPH1 was unequivocally anterior to XMAP215, in 33% (14/41) ALPH1 appeared adjacent to or co-localized with XMAP215, and in only 2% of the cells (1/41) ALPH1 appeared to be slightly posterior to XMAP215 (Supplementary Figure S8A-C). Both XMAP215 and ALPH1 localize to a similar, but not entirely identical string-like structure in dividing cells (Supplementary Figure S8D). The data indicate that ALPH1 localizes to a structure that is associated but not identical with the microtubule plus ends, or perhaps to an anterior subcomplex of the structure.
Figure 8. [...] The PP-granule contains no mRNAs. Cells expressing ALPH1-4Ty1-eYFP were probed for total mRNA by oligos antisense to either the miniexon sequence (left) or the poly(A) tail (right). ALPH1 was detected by immunofluorescence using anti-Ty1 (BB2). Both untreated and heat-shocked cells were used, the latter to increase the amount of ALPH1 at the posterior pole. Note that the miniexon probe also recognizes the nuclear-localized SL RNA next to total mRNAs; heat shock reduces total mRNA levels, but not SL RNA levels, and thus causes an increase in the nuclear signal (31). (F) XRNA-eYFP was expressed from the endogenous locus in either wild type cells or ALPH1ΔN/- cells. XRNA localization was monitored after 70 minutes of heat shock. An arrow points to XRNA-eYFP localized at the posterior pole.
The presence and size of most RNA granules depends on the state of mRNA metabolism. The percentage of ALPH1-eYFP at the posterior pole granule was quantified for both endogenously expressed and overexpressed ALPH1-eYFP in cells subjected to various treatments affecting mRNA metabolism (Figure 8C and D). As a control, each cell line also expressed the cytoplasmic RNA granule marker mChFP-DHH1 from the endogenous locus. When ALPH1-eYFP was expressed from the endogenous locus, on average, 2.6% of ALPH1-eYFP was at the posterior pole. This fraction significantly increased to 9.8% and 6.7% when cells were heat-shocked or starved, respectively. No significant changes were observed when cells were treated with the trans-splicing inhibitor sinefungin, the transcription inhibitor actinomycin D, or the translation inhibitor cycloheximide. When ALPH1-eYFP was overexpressed, an average of 11.6% of ALPH1-eYFP was at the posterior pole and this fraction increased to 22.9% and 21.1% when cells were heat-shocked or starved. Treatment with actinomycin D, sinefungin or cycloheximide caused a significant decrease in ALPH1 at the posterior pole to 6.6%, 5.7% and 6.3%, respectively. In both expression systems, the fraction of DHH1 at the posterior pole was unchanged. DHH1 was undetectable at the posterior pole granule, with the exception of occasional minor accumulation in ALPH1 overexpression cells. In conclusion, ALPH1 localization to the posterior pole increases with stresses that dissociate polysomes (heat shock, starvation) and thus increase free cytoplasmic mRNAs, but decreases with treatments that reduce free cytoplasmic mRNAs (sinefungin, actinomycin D, cycloheximide), suggesting a dynamic localization dependent on free mRNA concentration.
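To make the size of these shifts easier to compare between the two expression systems, the reported mean percentages can be converted to fold changes relative to the untreated condition. The short Python sketch below does this with the values quoted in the paragraph above; using a plain ratio of means is an illustrative choice of this sketch and not necessarily how the data were analysed, and the dictionary and function names are hypothetical.

```python
from typing import Dict

# Mean percentage of ALPH1-eYFP at the posterior pole, as quoted in the text.
UNTREATED = {"endogenous": 2.6, "overexpressed": 11.6}
TREATED = {
    "endogenous": {"heat shock": 9.8, "starvation": 6.7},
    "overexpressed": {
        "heat shock": 22.9,
        "starvation": 21.1,
        "actinomycin D": 6.6,
        "sinefungin": 5.7,
        "cycloheximide": 6.3,
    },
}


def fold_changes(system: str) -> Dict[str, float]:
    """Ratio of treated to untreated PP fraction for one expression system."""
    baseline = UNTREATED[system]
    return {
        treatment: round(percent / baseline, 2)
        for treatment, percent in TREATED[system].items()
    }


endogenous_fc = fold_changes("endogenous")
overexpressed_fc = fold_changes("overexpressed")
```

With these inputs, the stress treatments give ratios above 1 in both systems and the three inhibitors give ratios below 1 in the overexpression line, matching the direction of the changes described in the text.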
To determine if mRNA decapping occurs at the PP granule, we asked whether the granule is enriched in mRNAs using fluorescence in situ hybridization (FISH) (Figure 8E). We probed untreated or heat-shocked cells with fluorescently labelled oligonucleotides, either antisense to the miniexon (present at the 5′ end of trypanosome mRNAs) or antisense to the poly(A) tail, and monitored ALPH1 localization by immunofluorescence in parallel. While ALPH1 was clearly visible at the posterior pole, with the strongest signal in heat-shocked cells, no accumulation of mRNAs at the posterior pole was detected with either FISH probe, even in heat-shocked cells (Figure 3D), consistent with a lack of mRNA decay intermediates at the posterior pole reported previously (69). Absence of both total mRNA and mRNA decay intermediates strongly argues against a function of the PP granule in mRNA degradation or storage.
Next, we asked whether ALPH1 localization to the posterior pole is required to recruit other members of the complex. We used the ALPH1ΔN/- cell line, which has no ALPH1 at the posterior pole (Figures 3C and 4), to test if the localization of XRNA at the posterior pole depends on ALPH1. XRNA-eYFP was expressed in wild type cells and in ALPH1ΔN/- cells from the endogenous locus and XRNA localization to the PP was monitored by fluorescence microscopy in response to heat shock. XRNA remains present at the PP in ALPH1ΔN/- cells, indicating localization independent of ALPH1 (Figure 8F). Vice versa, we asked whether XRNA is required for ALPH1 localization to the posterior pole. We expressed a C-terminal HALO-tag fusion of ALPH1 in a previously characterized XRNA RNAi cell line (70). ALPH1 localization to the posterior pole was reduced within 24 h of RNAi induction, both in untreated and in heat-shocked cells (Supplementary Figure S9). These data suggest that XRNA localization to the posterior pole may be required to recruit ALPH1. An alternative explanation is that the reduction in ALPH1 localization to the posterior pole is caused by the change in mRNA metabolism that results from the RNAi depletion of XRNA.
In conclusion, the posterior pole is unlikely to be a site of mRNA decapping as it is enriched neither in mRNAs nor in mRNA metabolism products; this is consistent with findings that ALPH1 localization to the PP is nonessential for cellular survival. However, ALPH1 localization to the PP is highly dynamic and may serve to regulate overall decapping activity.

DISCUSSION
Here, we define a decapping complex with a composition that appears unique to Kinetoplastida, suggesting a distinct mechanism and evolutionary origin underpinning this central process within this lineage (6). Whilst we find some similarities to the canonical Dcp2 decapping complex of animals and fungi, for example an interaction with the 5′-3′ exoribonuclease XRN1 and with SCD6, the core complex is largely composed of lineage-specific proteins and decapping relies on an ApaH-like phosphatase unique to Kinetoplastida. Moreover, there is great potential for exploiting this complex with central function in mRNA metabolism as a drug target against diseases caused by Kinetoplastida, as the entire ApaH-like phosphatase family is absent from mammals (6-8).

The ALPH1 N-terminal domain is dispensable, while the C-terminus is an interaction hub
The vast majority of ApaH-like phosphatases consist of just a catalytic domain and do not function in mRNA decapping in vivo (6). Kinetoplastida ALPH1 is hence a rare, if not exclusive, exception (6). The distinguishing feature of Kinetoplastida ALPH1 within the ApaH-like phosphatase family is the presence of unique N- and C-terminal extensions/domains.
The N-terminal extension is not required for growth in culture, although ALPH1ΔN/- cells have a mild proliferation phenotype. This is consistent with low conservation between the Kinetoplastida ALPH1 N-terminal sequences, with 14.6% sequence identity across the lineage, and the lack of N-terminal extensions in the orthologs of Trypanosoma grayi and Leptomonas pyrrhocoris. The only determined role of the N-terminal domain is in mediating efficient localization to the PP granule.
In contrast, the C-terminal domain is predicted to be partly structured, is present in all Kinetoplastida ALPH1 orthologues and is better conserved, at 31.9% sequence identity across the taxa. This domain is required for efficient localization to starvation stress granules, indicating involvement in interactions with RNA-binding proteins and/or RNA. Further, the C-terminus mediates ALPH1 dimerization and is required for interactions with the 5′-3′ exoribonuclease XRNA and a CMGC-family kinase; both are high-confidence interaction partners. These data indicate a fundamental role for the C-terminus in ALPH1 function and/or regulation. The CMGC-family kinase represents a candidate regulator and, by analogy, phosphorylation contributes to regulation of decapping in opisthokonts (71, 72).

The role of the PP granule remains elusive
ALPH1 N / -cells are devoid of ALPH1 localization to the PP granule but viab le, e xcluding a function for the PPgranule as the major loca tion dedica ted to the essential process of mRN A deca pping, a conclusion supported by the absence of mRNAs or mRNA decay products at the PP. An alternati v e function for this structure would be to regulate ALPH1 access to mRNA substrates and hence overall mRN A deca pping activity. In fact, localization of ALPH1 to the PP is altered when mRNA metabolism is experimentally manipulated: the fraction of ALPH1 at the posterior pole increases when mRNAs are released from polysomes (hea t shock, starva tion), w hile blocking RN A synthesis (actinomycin D, sinefungin) or mRNA recruitment to ribosomes (CHX) causes a reduction in PP localization. Hence, cells may attempt to adapt substrate-exposed ALPH1 le v els (ALPH1 that is not a t the PP) to substra te abundance (non-polysomal mRNAs) in response to environmental cues. Howe v er, the proportion of ALPH1 at the PP is extremel y low: onl y 2-3% of ALPH1 molecules are located proximal to the PP in steady state conditions and this is only increased ∼3 fold by heat shock or starvation, with > 90% of ALPH1 elsewhere (Figure 8 C). These rather moderate changes argue against a function of the PP granule in regulating the concentration of acti v e cytoplasmic ALPH1. It is, howe v er, possib le that the fraction of ALPH1 at the PP granule is massi v ely increased in the quiescent, non-dividing life cycle stages of trypanosomes, contributing to the overall reduction in mRNA turnover observed in these stages. This has not yet been investigated. A further possibility is that ALPH1 needs to be recruited to the PP granule in order to be activated and is then released back into the cytoplasm: spatial separation of activation and decapping activity would add a stringent le v el of regula tion. 
To investigate this, it is particularly important to define which components of the ALPH1 complex localize to the PP independently and which depend on other complex members. We show that XRNA can localize to the posterior pole independently of ALPH1 (Figure 8F).

A kinetoplastida-specific mRNA decapping complex
Our analyses define an ALPH1 mRNA decapping complex, minimally consisting of ALPH1, XRNA, a CMGC-type kinase, the Tc40-antigen-like protein and two RNA-binding proteins, Tb927.9.12070 and Tb927.11.3490 (Figure 7C). All components of this core complex share localization to the PP. Several further proteins are likely additional components, most notably SCD6 and an RNA helicase (Tb927.4.3890) (Figure 7C-D). These proteins do not localize to the PP but, conditionally, to stress granules. The RNA helicase shares considerable sequence similarity with Tb927.9.12070, albeit excluding the helicase active site (Supplementary Figure S7B, C), indicating possible mutually exclusive binding to the complex. Evidence for the existence of ALPH1 sub-complexes is also provided by the fact that the ALPH1 IP experiments consistently lacked certain complex members. The likely reason is that the affinity tag was not accessible to the nanobody in a certain ALPH1 complex formation, favouring exclusive purification of another sub-complex (Supplementary Figure S10). However, more experimental work is required to confirm the existence and resolve the distribution of ALPH1 sub-complexes.
Importantly, this complex composition is shared with T. cruzi (Figure 7B and Supplementary Figure S10B), and, with the exception of the universal XRNA, is Kinetoplastida specific (Figure 7E). The ALPH1 decapping complex appears to be an innovation occurring early within the Metakinetoplastida (Supplementary Table S5). While homologues to ALPH1 and the four core components are absent outside Kinetoplastida, Dcp2 orthologs are present in all remaining eukaryotic lineages, with the exception of Euglenozoa (Figure 7E). Loss of Dcp2, together with the absence of prototypic Dcp2 complex subunits in Discoba and Metamonada, suggests the presence of Dcp2 in the last eukaryote common ancestor, consistent with our earlier study (73). Taken together, this indicates a significant divergence in mRNA degradation mechanisms in Kinetoplastea, based on the evolution of a novel mRNA decapping complex, relying on a distinct enzyme family for decapping activity, but recruited to the conserved exoribonuclease XRNA/Xrn1.

Convergent evolution of lineage-specific decapping complexes
Decapping flags mRNA molecules for degradation and is a powerful ultimate determinant in mRNA turnover requiring stringent control. For Dcp2, the major decapping enzyme of animals and fungi, multiple regulatory factors share the same Dcp2 binding sites and likely compete (74,75). A variety of regulatory mechanisms involves autoinhibition and interactions with activators, such as Dcp1, Edc1, Edc3 and Pat1 (52,62). Dcp2 and these latter factors are all absent from Kinetoplastida or of sufficient divergence as to be unidentifiable (Figure 7E). Even though there is no known evolutionary relationship between ALPH1 and the canonical nudix domain mRNA decapping enzyme Dcp2, there are some interesting similarities. ApaH/ApaH-like phosphatases and nudix hydrolases have a broad substrate range (76) and both depend on their N- and C-terminal extensions for specificity in mRNA decapping. Many factors specifically interact with the N- and C-termini of Dcp2 (62,63), several of which are shared between the trypanosome ALPH1 complex and Dcp2 (DHH1, SCD6, XRNA/XRN1). mRNA decapping has essentially the same requirements in all eukaryotes, but these can be met by different mechanisms and protein families. We propose that the opisthokont Dcp2 complex and the ALPH1 complex in kinetoplastids have converged towards providing the same function, namely controlled mRNA decay, and provide a further striking example of convergent evolution in kinetoplastids sitting alongside the kinetochore and nuclear lamina, underscoring the ability of many functions to be achievable with distinct mechanics.

DATA AVAILABILITY
All proteomics data have been deposited at the ProteomeXchange Consortium via the PRIDE partner repository (77) with the data set identifiers PXD038550 (all BioID and XRNA affinity capture) and PXD042322 (ALPH1 affinity capture).

SUPPLEMENTARY DATA
Supplementary Data are available at NAR Online.