The eukaryome of African children is influenced by geographic location, gut biogeography, and nutritional status

Abstract
Eukaryotes have historically been studied as parasites, but recent evidence suggests they may be indicators of a healthy gut ecosystem. Here, we describe the eukaryome along the gastrointestinal tract of children aged 2–5 years and test for associations with clinical factors such as anaemia, intestinal inflammation, chronic undernutrition, and age. Children were enrolled from December 2016 to May 2018 in Bangui, Central African Republic and Antananarivo, Madagascar. We analyzed a total of 1104 samples representing 212 gastric, 187 duodenal, and 705 fecal samples using a metabarcoding approach targeting the full ITS2 region for fungi, and the V4 hypervariable region of the 18S rRNA gene for the overall eukaryome. Roughly half of all fecal samples showed microeukaryotic reads. We find high intersubject variability, only a handful of taxa that are likely residents of the gastrointestinal tract, and frequent co-occurrence of eukaryotes within an individual. We also find that the eukaryome differs between the stomach, duodenum, and feces and is strongly influenced by country of origin. Our data show trends towards higher levels of Fusarium equiseti, a mycotoxin-producing fungus, and lower levels of the protist Blastocystis in stunted children compared to nonstunted controls. Overall, the eukaryome is poorly correlated with clinical variables. Our study is one of the largest cohort studies analyzing the human intestinal eukaryome to date and the first to compare the eukaryome across different compartments of the gastrointestinal tract. Our results highlight the importance of studying populations across the world to uncover common features of the eukaryome in health.


Introduction
The human gastrointestinal (GI) microbiome is a complex community composed of bacteria, archaea, viruses, and eukaryotes (fungi, protists, and helminths). Many studies have mapped the bacterial microbiome across life stages, geographic locations, and health states and demonstrated its importance in health and disease (Yatsunenko et al. 2012, Luan et al. 2015, Castanys-Muñoz et al. 2016, Dominguez-Bello et al. 2019). Despite the clear contributions of some host-associated eukaryotes to human health, eukaryotes have been less often characterized than the bacterial members of the gut microbiome, particularly in large-scale studies (Parfrey et al. 2014, Stensvold and van der Giezen 2018, Mann et al. 2020). Hurdles include the lower biomass of eukaryotes in the gut (eukaryotes constitute < 0.1% of the total biomass of the microbiota; Qin et al. 2010), the smaller community studying eukaryotes, particularly outside of parasitology, technical difficulties arising from less well-curated databases (del Campo et al. 2020), and difficulty separating gut residents from eukaryotes introduced via diet (Suhr and Hallen-Adams 2015, Mann et al. 2020).
Studies using 18S rRNA targeted amplicon sequencing showed that protists and fungi, especially members of the Saccharomycetales, are the dominant eukaryotes in the human GI tract (Parfrey et al. 2014, Scanlan et al. 2018). Protists are a common part of the gut community (Stensvold et al. 2011a, b, 2020, Forsell et al. 2012, Roser et al. 2013, Parfrey et al. 2014, Krogsgaard et al. 2015, 2018, Jokelainen et al. 2017, Stensvold and van der Giezen 2018, Stensvold 2019, Mann et al. 2020) and are regularly detected in the feces of infants and toddlers (Jokelainen et al. 2017). In the past, studies describing the human eukaryome using specific polymerase chain reaction (PCR) primers reported that the protist genera Blastocystis and Dientamoeba (Scanlan and Marchesi 2008, Velasco et al. 2011, Roser et al. 2013, El Safadi et al. 2014, Turkeltaub et al. 2015, Andersen and Stensvold 2016, Jokelainen et al. 2017) are widespread in healthy humans. Amoebae, especially Entamoeba coli, have also been described as commensal members of the human gut microbiome (ten Hove et al. 2007, Bruijnesteijn van Coppenraet et al. 2009, Stensvold et al. 2011b, Krogsgaard et al. 2018). Further, the presence of Blastocystis has been associated with sanitation levels, water source, and contact with animals or other infected humans (El Safadi et al. 2014, Beghini et al. 2017, Scanlan et al. 2018) and is generally reported at higher prevalence in low- and middle-income countries. Fungi are also commonly found in the intestinal microbiota of many different mammals, including humans (El Mouzan et al. 2017, Hallen-Adams and Suhr 2017, Nash et al. 2017, Auchtung et al. 2018, Lavrinienko et al. 2021, Boutin et al. 2021, Sun et al. 2021). In humans, fungi colonize the infant intestinal tract shortly after birth (Schei et al. 2017). Thus, there is consistent detection of eukaryotes in the intestinal microbiota of healthy subjects.
Recent cross-kingdom analyses show that fungi (Huseyin et al. 2017) and microeukaryotes (Stensvold 2019) are active participants in the GI ecosystem and influence health and disease through interactions with each other, other microbes, and the host. Microeukaryotes including Blastocystis, Giardia, Entamoeba (Verweij et al. 2004, ten Hove et al. 2007, Velasco et al. 2011), and a variety of fungi (Luan et al. 2015, Limon et al. 2019, Richard and Sokol 2019) have been linked to GI disease, antibiotic-associated diarrhea, and chemotherapy-induced enteric disorders (Stensvold and van der Giezen 2018), inflammatory bowel disease (IBD; El Mouzan et al. 2017, Sovran et al. 2018, Limon et al. 2019, Richard and Sokol 2019), and asthma (Arrieta et al. 2018, Goldman et al. 2018, Boutin et al. 2021) [reviewed in Huseyin et al. (2017)]. Further, the human mycobiome is changed in obesity (Mar Rodriguez et al. 2015) and fungal-bacterial interactions are perturbed in patients suffering from IBD (Sovran et al. 2018). Other microeukaryotes, such as Giardia intestinalis and Entamoeba histolytica, are pathogens that directly cause substantial morbidity and mortality (Roberts et al. 2013). Last, helminths, which are often considered pathogens, are well known to have an immunoregulatory function (Walsh et al. 2009, Broadhurst et al. 2012, Zaiss et al. 2015, Finlay et al. 2016, Gause and Maizels 2016, Giacomin et al. 2016, Ramanan et al. 2016) and the presence of several helminths has been associated with changes in the bacterial community (Walk et al. 2010, Cantacessi et al. 2014, McKenney et al. 2015, Zaiss et al. 2015, Giacomin et al. 2016). There is thus a clear link between alterations in the fungal and microeukaryotic community of the intestinal tract and disease.
In contrast to these observations, a growing number of studies show that protists such as Entamoeba coli and Blastocystis are common in healthy people (El Safadi et al. 2014, Krogsgaard et al. 2015, Andersen and Stensvold 2016, Beghini et al. 2017, Nieves-Ramirez et al. 2018), suggesting that they might also be important for maintaining proper gut homeostasis (Audebert et al. 2016, Andersen and Stensvold 2016, Beghini et al. 2017) and are thus indicators of a healthy gut ecosystem (Stensvold and van der Giezen 2018). Indeed, some of them, for example Blastocystis (Krogsgaard et al. 2015) or Dientamoeba, are even more common in healthy individuals than in comparison groups with immune-mediated disease such as IBD (Andersen and Stensvold 2016, Beghini et al. 2017) and irritable bowel syndrome [reviewed in Stensvold and van der Giezen (2018)]. Further, colonization with helminths was virtually universal in human populations before the adoption of modern, highly sanitized lifestyles (Goncalves et al. 2003) and their drastic decrease in urban industrial populations has been discussed as a possible contributor to the concomitant rise of autoimmune disease (Rook 2012). Another recent observation is that coinfection with different eukaryotic pathogens can lead to reduced virulence in some conditions [reviewed in Venter et al. (2022)]. In this light, reduced diversity of helminths and other microeukaryotes (Parfrey et al. 2014) in industrialized countries may be altering the gut ecosystem directly through loss of species and indirectly through the associated loss of interactions.
Thus, the role of microeukaryotes in the intestinal tract seems to be complex and to depend on the particular GI ecosystem they inhabit.
There is clear evidence of geographic differences in the bacterial microbiome, mainly mediated through diet (De Filippo et al. 2010). Further, globalization and urbanization have been shown to be associated with a major loss in microbial diversity compared to a more traditional lifestyle (Clemente et al. 2015, Obregon-Tito et al. 2015, Smits et al. 2017, Jha et al. 2018, Pasolli et al. 2019). For the eukaryome, most studies to date have been performed at a single geographic location. A small study in South Africa has shown that the mycobiome is affected by urbanization (Kabwe et al. 2020). A larger study in China, spanning several ethnicities and geographic regions, showed very high variability across different geographic regions (Sun et al. 2021), mainly reflecting dietary habits and the urbanization gradient. For microeukaryotes, there are some geographic differences in the subtypes of Blastocystis (Alfellani et al. 2013). These studies clearly show the need to investigate the overall eukaryome make-up across different geographic locations.
So far, most studies assessing the microbiome focus on fecal samples as an overall read-out of the microbial intestinal community. There is, however, clear evidence for bacterial community changes along the GI tract (Vonaesch et al. 2018); there are likely similar changes in the eukaryotic community. The different segments of the GI tract have different roles and physiology, with nutrient absorption taking place mainly in the small intestine and fecal samples representing an overall read-out of the lower GI tract microbiota.
In this study, we aimed to characterize and compare the intestinal and fecal eukaryome in children aged 2–5 years living in two African countries: Madagascar and the Central African Republic (CAR). We then investigated the relationship between the eukaryome composition and clinical factors such as stunted growth, iron deficiency, and intestinal inflammation.

Study cohort, sample collection, metadata, and biobanking
This study was carried out in the context of the AFRIBIOTA project, a case-control study for stunting in children aged 2–5 years in two different study sites, Bangui, CAR and Antananarivo, Madagascar (Vonaesch et al. 2018). Recruitment took place between December 2016 and March 2018. In the context of this analysis, only children recruited in the community were included. Metadata including age, nutritional status, iron levels, hemoglobin, and socio-economic factors were collected using a standardized questionnaire. Complete blood count, C-reactive protein (CRP), and ferritin levels were measured at the Clinical Biology Center (CBC) of the Institut Pasteur de Madagascar and the Laboratoire d'Analyse Médicale at the Institut Pasteur de Bangui within 4 h after blood collection. Ferritin levels were corrected for systemic inflammation as described in Thurnham et al. (2010). Hemoglobin values were adjusted for altitude as described in Centers for Disease Control (CDC; 1989) and Sullivan et al. (2008), and anemia was defined as less than 110 g/l according to WHO criteria (Onis 2006, OMS 2011). All participants received oral and written information about the study and legal representatives of the children provided written consent to participate in the study. The study protocol for AFRIBIOTA has been approved by the Institutional Review Board of the Institut Pasteur (2016-06/IRB) and the National Ethical Review Boards of Madagascar (55/MSANP/CE, 19 May 2015) and CAR (173/UB/FACSS/CSCVPER/16). Detailed inclusion and exclusion criteria and recruitment procedures are described elsewhere (Vonaesch et al. 2018); importantly, children with severe acute disease were excluded, as were children that had recently taken antibiotics (Vonaesch et al. 2018).
Based on median height of the WHO reference population (Onis 2006, World Health Organization 2007), the children were classified in three groups: severe stunting (height-for-age z-score ≤ −3SD), moderate stunting (height-for-age z-score between −3SD and −2SD), and not stunted (height-for-age z-score ≥ −2SD). Caregivers were instructed to collect feces in the morning before coming to the hospital. Gastric and duodenal samples were collected using a pediatric nasogastric tube (Vygon, France), and were only collected for stunted children (ethical constraints). The procedure was put in place for studying the bacterial community of the small intestine, which was suspected, and later shown, to be implicated in the pathophysiology of stunting (Vonaesch et al. 2018, 2022). We thus had the unique opportunity to analyze the intestinal eukaryome from these samples. Once the gastric, duodenal, or fecal samples were collected, they were aliquoted, frozen at −20 °C and transferred the same day to a −80 °C freezer (Bangui), or directly snap-frozen in liquid nitrogen and then transferred to a −80 °C freezer (Antananarivo). DNA extraction was performed on site in Antananarivo and Bangui and extracted DNA was shipped on dry ice. Biobanking and sample distribution were performed by the Unité de Bactériologie Expérimentale, Institut Pasteur de Madagascar, the Laboratoire d'Analyse Médicale, Institut Pasteur de Bangui, and the Clinical Investigation and Access to BioResources Platform (ICAReB) at the Institut Pasteur, Paris. The STORMS checklist for this study is included as File S1 (Supporting Information).
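The classification rules stated above (stunting groups by height-for-age z-score, anemia by altitude-adjusted hemoglobin) can be written down as a small sketch. The function names are hypothetical and are not taken from the study's code. How a score of exactly −2SD is handled is an assumption: the text assigns "≥ −2SD" to the not-stunted group, so the sketch does the same.

```python
def classify_stunting(haz: float) -> str:
    """Classify a child by height-for-age z-score (HAZ, in SD units).

    Thresholds follow the text: <= -3 severe, between -3 and -2 moderate,
    >= -2 not stunted. Assumption: exactly -2 counts as not stunted.
    """
    if haz <= -3.0:
        return "severe stunting"
    if haz < -2.0:
        return "moderate stunting"
    return "not stunted"


def is_anemic(hemoglobin_g_per_l: float) -> bool:
    """Anemia as defined in the text: adjusted hemoglobin below 110 g/l.

    The altitude adjustment itself is not reproduced here; the input is
    assumed to be already adjusted.
    """
    return hemoglobin_g_per_l < 110.0
```

For instance, `classify_stunting(-2.5)` falls in the moderate group, while a child with an adjusted hemoglobin of exactly 110 g/l is not counted as anemic under this reading.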

DNA extraction and sequencing
Gastric, duodenal, and fecal samples were extracted by commercial kits (QiaAmp cador® Pathogen Mini or cador® Pathogen 96 QIAcube® HT Kit, Qiagen, which are kits using the same chemistry for manual or automatic extraction) following the manufacturer's recommendations with an additional bead-beating step to increase mechanical disruption as described in Vonaesch et al. (2018). DNA extraction was compared between the two sites using bacterial ZymoBiomics community standards (Zymobiomics, D6300) and DNA contamination was assessed using parallel processed negative controls (molecular grade water). Samples were stored at −80 °C until sequencing. Extracted DNA samples were shipped on dry ice to a commercial provider where library generation and sequencing were performed (Microbiome Insights, Canada).
For a subset of the data, a second library using the same 18S rRNA V4 amplicon primers (TAReuk454FWD1 and TAReukREV3) and a peptide nucleic acid primer to block amplification of mammalian sequences (5′-TCTTAATCATGGCCTCAGTT-3′) (Mann et al. 2020) was prepared at the University of British Columbia, Vancouver, Canada using identical PCR conditions. The mammalian blocking primer was designed to reduce the amplification of human reads and thus increase sequencing depth for nonhuman eukaryotic reads (Mann et al. 2020). Sequencing of this second data set was performed at Dalhousie University on a MiSeq Illumina sequencer using 300 + 300 bp paired-end V3 chemistry as described in Kozich et al. (2013).
Demultiplexed reads were obtained from the two sequencing facilities and were processed into amplicon sequence variants (ASVs) using the DADA2 pipeline (Callahan et al. 2016) with a minimum sequence length of 150 and maximum expected error of 8. Overall, for the 18S rRNA dataset using no blocking primer, we obtained 694 978 clean sequences after running the DADA2 pipeline and filtering out any nonmicroeukaryotic reads such as vertebrates or plants. On average, we had 2632 sequences per sequencing-positive sample (i.e. with any sequences amplified; median: 1638; minimum: 0; maximum: 21 327). Taxonomy for the 18S dataset was assigned in a multistep process. First, taxonomy was assigned using DADA2 and the integrated tool SINA (Pruesse et al. 2012) against the SILVA database (version 128), after which any ASV that could not be assigned a taxonomy was compared to the PR2 database (version 4.11.1) (Guillou et al. 2013). ASVs present in only one sample and at a relative abundance of less than 0.1% of the total dataset were removed. Samples with fewer than 5000 reads were removed from downstream analyses.
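The two filtering rules at the end of this paragraph can be illustrated with a minimal Python sketch; it is not the study's pipeline, which was run with DADA2 in R. The function name, the nested-dictionary layout, and the choice to compute sample depth after ASV removal are all assumptions made for illustration.

```python
def filter_asv_table(counts, min_sample_reads=5000, min_rel_abundance=0.001):
    """Filter a count table given as {sample_id: {asv_id: read_count}}.

    Step 1: drop ASVs seen in only one sample AND below 0.1% of all reads.
    Step 2: drop samples left with fewer than `min_sample_reads` reads.
    Assumption: depth is evaluated after the ASV filter has been applied.
    """
    total_reads = sum(sum(asvs.values()) for asvs in counts.values())
    asv_totals, asv_samples = {}, {}
    for asvs in counts.values():
        for asv, n in asvs.items():
            if n > 0:
                asv_totals[asv] = asv_totals.get(asv, 0) + n
                asv_samples[asv] = asv_samples.get(asv, 0) + 1

    dropped = {
        asv for asv, n in asv_totals.items()
        if asv_samples[asv] == 1 and n / total_reads < min_rel_abundance
    }

    filtered = {}
    for sample, asvs in counts.items():
        kept = {a: n for a, n in asvs.items() if n > 0 and a not in dropped}
        if sum(kept.values()) >= min_sample_reads:
            filtered[sample] = kept
    return filtered
```

Passing `min_sample_reads=500` would mimic the lower depth threshold that the text reports for the blocking-primer dataset.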
An average of 30 203 (minimum: 289; maximum: 267 972; total: 9 393 270) reads were generated for the 18S dataset using a blocking primer. Overall, for the 18S rRNA dataset using the blocking primer, we obtained 3 415 973 clean sequences after running the DADA2 pipeline and filtering out any nonmicroeukaryotic reads such as vertebrates or plants. On average, we had 6222 sequences per sequencing-positive sample (i.e. with any sequences amplified; median: 983; minimum: 2; maximum: 84 623). ASVs for this dataset were generated in an identical manner to the full 18S dataset and samples with fewer than 500 reads were removed from downstream analyses. 18S sample processing files can be found at https://github.com/Parfreylab/afribiota.
Taxonomy was refined by cross-referencing with the other databases and by placement in phylogenetic trees for key taxa as described below. Suspected taxonomic misannotations were checked using BLASTn and the NCBI NT database (McGinnis and Madden 2004, Ye et al. 2006, Johnson et al. 2008) and the taxonomic string assignments for Entamoeba and Saccharomycetales, which are erroneous in the SILVA database, were corrected manually. All corrections to the original taxonomy file are listed in Table S1 (Supporting Information). Code for the bioinformatic analysis is available at https://github.com/Parfreylab/afribiota.

ITS2 dataset
An average of 57 170 (minimum: 24, maximum: 339 136) sequences per sample were generated for the ITS2 dataset. ASVs were generated using the DADA2 pipeline with a minimum sequence length of 50 and maximum expected error set to 6 and 8 for the forward and reverse reads, respectively. After dereplication, ASVs comprised of fewer than 50 reads or those with less than 0.1% overall relative abundance in the full dataset were removed. On average, we obtained 40 576 clean sequences per sample for the ITS2 dataset after running the DADA2 pipeline (minimum: 0; maximum: 217 506). Samples with fewer than 5000 reads were removed from downstream analyses. Taxonomy was assigned using the UNITE database (version 8.0) (Koljalg et al. 2005). Fungal ASVs that could not be assigned beyond the kingdom level using the UNITE database (36.7% of the total dataset) were run through BLAST against the NCBI NT database, but no further taxonomic information could be resolved and the sequences were thus assigned only at kingdom level. All code for sample processing is available at https://github.com/parfreylab/afribiota. Rarefaction curves for all datasets are given in Figure S1 (Supporting Information). All raw sequencing data are deposited in ENA, accession number PRJEB57073.

Annotation of taxa as environmental, gut-associated, or potentially gut-associated taxa
Taxa were individually assessed and screened for human association using expert knowledge and Google search. Taxa were classified as gut-associated if the taxon has been associated with human infection or repeatedly described to be part of the eukaryotic microbiome in peer-reviewed scientific journals. Taxa were classified as possibly gut-associated if the taxon has been associated with human infection, but in very few reports or cases. Taxa were classified as environmental if they are known plant parasites, are known to inhabit aquatic environments or soil, or are known to be associated with nonhuman eukaryotes such as insects. If there was no hit for the taxon in the Google search or if the taxonomic level did not allow distinguishing the taxon enough to make a claim about the most likely origin or association, taxa were classified as of unknown origin.

Construction of phylogenetic trees
Taxonomy was refined by phylogenetic analysis for key eukaryotes by constructing backbone trees and then placing ASVs within these trees. A total of five backbone trees were generated, one each for Blastocystis, Entamoeba, Trichostomatia, Diplomonada, and Trichomonada. For each backbone tree, we retrieved available sequences from the SILVA (v.132), PR2 (v.4.12.0) and NCBI databases. Sequence accession numbers used for the backbone trees are presented in Table S2 (Supporting Information). We also used out-group taxa selected based on the literature to root the trees. Sequences were first sorted and clustered at 99% similarity using usearch (v.8.1.1831_i86linux32) or vsearch (v.2.15.1) (Rognes et al. 2016). Sequences were then aligned using mafft (v.6.814b or v.7.475) (Katoh et al. 2009) and trimmed using trimAl (v.1.2rev59 or v.1.4.1) (Capella-Gutierrez et al. 2009), and the alignment was inspected using aliview (v.1.26) (Larsson 2014). A maximum-likelihood phylogenetic tree was then constructed using a GTR model and 100 bootstrap replicates with RAxML (v.8) (Stamatakis 2015, Kozlov et al. 2019). ASVs retrieved from the Afribiota dataset were then placed into the resulting backbone tree using SINA (v.1.7.1) (Pruesse et al. 2012). The annotation of 102 ASVs was then updated based on the backbone trees (Table S1, Supporting Information). Blastocystis ASVs were further curated based on the alignment and tree. Two ASVs fall outside of known subtypes (DALASV146 and DALASV723); DALASV146 matches at 100% similarity with accession OM057456.1, which was sampled from a hoolock gibbon. Subtypes ST1, ST2, and ST3 were represented by many ASVs and were further collapsed into subclusters if sequences were > 97% similar and formed a clade in the phylogenetic tree (Table S1 and Figure S4, Supporting Information). Subclusters did not show one-to-one correspondence with Blastocystis allele designations because the sequence fragment here is shorter than the fragment used for allele designation and is missing informative sites; for this reason we refer to subclusters by the ASV# of the ASV with highest relative abundance rather than by allele. The most similar allele for each ASV within a subcluster was determined using the Blastocystis Public databases for molecular typing and microbial genome diversity (Stensvold et al. 2012) and reported along with the ASV in Table S1 (Supporting Information). Conda environment and scripts used to generate the Entamoeba tree are available at https://github.com/aemann01/afribiota and the ones to generate the Blastocystis, Trichostomatia, Diplomonada, and Trichomonada trees at https://github.com/parfreylab/afribiota.

Measurement of fecal markers of inflammation
Fecal calprotectin concentration was assayed in duplicate by a 'sandwich' type enzyme-linked immunosorbent assay, which uses a polyclonal antibody system (Calprest; Eurospital, Italy). The assay was performed according to the manufacturer's instructions, yielding a measurement range of 15–5000 μg/g. Fecal α1-antitrypsin was measured using an immunonephelometric method adapted on the BN ProSpec system (Siemens, Germany). Briefly, stool samples were diluted 1:5 in 0.15 M NaCl and then shaken vigorously using a vortex until complete homogenization. The homogenate was centrifuged at 10 000 g for 15 min at 4 °C and the supernatant was used for analysis, which was performed at two different dilutions (1:5 and 1:500 final dilutions) to avoid any prozone phenomena. Using this method, the range of measurement was 0.01–20 mg/g (Rodriguez-Otero et al. 2012).

Identification of microeukaryotes by microscopy
The identification of parasites was performed on a subset of subjects who provided enough stool sample, according to methods reported previously (Habib et al. 2021). In short, fecal samples were examined microscopically using the Merthiolate-Iodine-Formaldehyde (MIF) and Kato-Katz (KK) techniques for helminths and the direct smear method (mounting without colouring) for protozoans according to standard techniques. The analyses were performed at the Medical Center and the Experimental Bacteriology Unit of the Institut Pasteur de Madagascar and the Institut Pasteur de Bangui and validated by a medical doctor for diagnostic purposes.

Biostatistical analysis
The final, filtered dataset with > 5000 sequences per sample comprised the following: for the 18S rRNA dataset without human blocking primers, 33 gastric samples, 53 duodenal samples, and 464 fecal samples; for the 18S rRNA dataset with human blocking primer, 23 duodenal samples and 241 fecal samples; and for the ITS2 dataset, 158 gastric samples, 145 duodenal samples, and 315 fecal samples. Biostatistical analyses were performed using R, version 3.4.1 and the R packages Phyloseq (version 1.22.3) (McMurdie and Holmes 2013), microbiome (version 1.0.2) (Callahan et al. 2016), vegan (version 2.4-6) (Dixon 2009), DESeq2 (version 1.18.1) (Anders and Huber 2010, Anders et al. 2013), and ggplot2 (version 3.3.0) (Journal of Statistical Software, 2010). For the 18S dataset, taxa without annotation beyond the kingdom level were excluded from the final analysis because many were misannotated bacterial sequences. Alpha diversity was quantified using the combined Shannon index using rarefied data including singletons. All other analyses were performed on datasets with singletons filtered out. Beta diversity was quantified using the Bray-Curtis dissimilarity index for relative abundance data and the Jaccard index for presence-absence datasets (Bray and Curtis 1957). The Jaccard index was preferred over the Sorensen index to have equal weight for all taxa regardless of their prevalence across the dataset. Differences in diversity between samples were tested using nonparametric multivariate analysis of variance (PERMANOVA) with the function 'adonis' in the R package vegan (Dixon 2009) or using an iterative logistic regression. P-values were corrected for multiple comparisons using the Benjamini-Hochberg procedure.
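As a reading aid for the diversity measures named above, the following sketch writes out the Shannon index, the Bray-Curtis dissimilarity on relative abundances, and the Jaccard distance on presence-absence data in plain Python. It is not the study's code, which used R packages; the function names are hypothetical, and converting counts to proportions inside `bray_curtis` is an assumption about how "data of relative abundance" was prepared. Rarefaction is not shown.

```python
import math


def shannon_index(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two samples after converting
    each to relative abundances (assumption, see lead-in)."""
    if len(u) != len(v):
        raise ValueError("samples must cover the same taxa in the same order")
    su, sv = sum(u), sum(v)
    if su <= 0 or sv <= 0:
        raise ValueError("each sample needs at least one read")
    pu = [x / su for x in u]
    pv = [x / sv for x in v]
    # Proportions sum to 1 in each sample, so the denominator is 2.
    return sum(abs(a - b) for a, b in zip(pu, pv)) / 2.0


def jaccard_distance(u, v):
    """Jaccard distance on presence-absence: 1 - |shared taxa| / |all taxa|."""
    if len(u) != len(v):
        raise ValueError("samples must cover the same taxa in the same order")
    shared = sum(1 for a, b in zip(u, v) if a > 0 and b > 0)
    union = sum(1 for a, b in zip(u, v) if a > 0 or b > 0)
    return 0.0 if union == 0 else 1.0 - shared / union
```

Two samples with no taxa in common give a Bray-Curtis value of 1, and two samples with identical proportions give 0, whatever their sequencing depth.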
Multivariate analyses of differentially abundant taxa as well as the presence of given taxa were performed on combined samples from both countries as well as on data from each country independently. Multivariate models were corrected for gender, age (in months), and country of origin and stratified on sample type, then on country of origin. For fecal samples, the multivariate models were also corrected for calprotectin levels as a measure of intestinal inflammation. Metadata, ASV tables and taxonomy tables can be found in the Appendix (Supporting Information) and the code is available at https://github.com/parfreylab/afribiota.
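The Benjamini-Hochberg correction used for the P-values in this section can be sketched in a few lines. This is a generic illustration of the procedure with a hypothetical function name, not the study's code, and the example P-values in the note below are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order.

    Each P-value is scaled by m / rank (rank 1 = smallest), then a running
    minimum from the largest rank downward keeps the adjusted values
    monotone and capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

With the illustrative inputs 0.01, 0.04, 0.03, and 0.005, the adjusted values come out at roughly 0.02, 0.04, 0.04, and 0.02.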

Description of study population and the presence of eukaryotic reads
We analyzed a total of 1104 biological samples from a cohort of children from Madagascar and CAR with two primer sets to investigate the fungal (ITS2 primers) and microeukaryotic (18S rRNA gene primers) components of the gut microbiome. The sample filtering workflow is summarized in Figure S3 (Supporting Information). The overall characteristics of the children included in the study are given in Table S3 (Supporting Information) and the distribution of the main clinical variables in Figure S2 (Supporting Information).
Host, dietary, and environmental taxa were frequently amplified with 18S primers, as is common for studies of the mammalian gut (Mann et al. 2020). Removing sequences corresponding to plants or vertebrates from the 18S dataset left 464 fecal samples and 53 duodenal samples that had microeukaryotic reads (66% and 28%, respectively) (Figure S3A, Supporting Information). We also amplified a subset of 312 samples with 18S primers plus a mammalian blocking primer; 241 fecal samples (92%) and 23 duodenal samples (72%) were above the threshold of 500 sequences set by the background (negative control; Figure S3B, Supporting Information). After filtering out plant and vertebrate reads, the microeukaryotic read counts were very low in the duodenal and gastric samples. Without a blocking primer, the median was 99 reads in duodenal samples and nine reads in gastric samples, compared to 988 reads in fecal samples; with a blocking primer, the median was 49 reads in duodenal samples and 2886 reads in fecal samples. Thus, for the two 18S datasets, the statistical analyses were only performed for the fecal samples. Amplification of ITS and 18S was uneven across samples, so we asked whether samples with reads above threshold levels were correlated with country of origin or clinical variables. There were significantly more microeukaryotic-positive samples in Madagascar compared to Bangui (18S dataset with a blocking primer: P < .001 and without a blocking primer: P = .014). Further, there was a weak trend to more microeukaryotic-positive samples in older children compared to younger children (18S dataset with a blocking primer: P = .38; 18S dataset without a blocking primer: P = .026). Anemia in the fecal 18S dataset using a human blocking primer was the only clinical variable significantly associated with the presence of microeukaryotic reads. The dataset generated with a blocking primer yielded higher diversity and more eukaryotes likely to be gut residents (Figure S5, Supporting Information). We thus focus our analysis on this dataset and present results without the blocking primer in the supplement.
In the fungal ITS2 dataset, 95% of the gastric samples sequenced (158 samples), 77% of the duodenal samples sequenced (145 samples), and 45% of the fecal samples sequenced (315 samples) had ITS2 (fungal) reads above the threshold of 5000 reads set by the negative control (Figure S3C, Supporting Information). Gastric and duodenal samples had a higher prevalence of fungi-positive samples compared to feces (P < .0001). The prevalence of samples with fungal reads in Madagascar (43%) was significantly higher compared to Bangui (33%; P = .03) and there were also more fungal reads per sample in Madagascar compared to Bangui (P = .03). The presence of fungal reads was not significantly associated with any of the clinical parameters measured in the duodenum or feces.
Thus, our data show that most of the human samples contain eukaryotic reads and that the presence of an intestinal eukaryome is not strongly influenced by clinical parameters but is influenced by the children's country of residence and age.

The fecal eukaryome of African children is dominated by Blastocystis and fungi
Characterizing the eukaryome reveals a handful of fungi and microeukaryotes that are prevalent across populations, while the overall eukaryome has low diversity and the distribution of most taxa is patchy across individuals and populations (Figure 1A and B; Figure S8A, Supporting Information). After filtering for low-abundance and low-prevalence taxa, we recovered 219 ASVs in the ITS dataset, of which 99 (45%) are unassigned at species level and 48 (22%) at phylum level.
After filtering for low-prevalence taxa (present in less than 50 sequences and less than 0.01% of the samples) and removing plants and vertebrates (18S rRNA dataset), we recovered 57 ASVs in the 18S rRNA dataset without using a mammalian blocking primer, and 127 ASVs in the reduced 18S rRNA dataset using a mammalian blocking primer.
As expected, for the ITS2 dataset, most of the taxa belong to the phylum Ascomycota or Basidiomycota. Overall, we identified two different fungal classes, 23 different orders, 41 different families, 60 different genera, and 78 different species (Tables S5-S10, Supporting Information). We identified a suite of eukaryotes spanning a broad taxonomic range that includes many protists and helminths that are familiar from parasitology textbooks (Roberts et al. 2013), such as Entamoeba, Giardia, Trichuris, and Ascaris (Figure 3). The list of taxa detected is very likely an incomplete catalog of the microeukaryotic diversity in this cohort due to rather low sequencing depth and because many microeukaryotes are found at low frequency. We assessed whether using a mammalian blocking primer consistently increased the recovered diversity across microeukaryotic taxa by analyzing a subset of 100 fecal samples that were amplified with and without the mammalian blocking primer and comparing taxon prevalence (Figure S5, Supporting Information). The mammalian blocking primer yielded higher sensitivity and diversity for protist genera, but depressed detection and prevalence of fungal and helminth genera (Figure S5, Supporting Information). We detected higher diversity of ASVs overall using the mammalian blocking primer (Shannon Index, P < .001; Figure S5, Supporting Information). This was mediated by both higher richness (Chao1, P < .001) and higher evenness of the community structure (Inverse Simpson, P < .001). Thus, the rest of the analysis is focused on the dataset using the mammalian blocking primer, as it is most representative of the diversity in the gut eukaryome, particularly for protists. Results on the slightly larger dataset not using mammalian blocking primers are presented in the supplementary data (Figures S6, S8-S10, S13, Tables S8, S9, and S19, Supporting Information). We found two genera to be present in at least 50% of samples in the 18S dataset: Blastocystis and Entamoeba. Using the mammalian blocking primer, which allows for preferential detection of protists, we detected Blastocystis in 75% of all fecal samples analyzed. Stratified by country, 50% of all samples from Madagascar amplified Blastocystis and Entamoeba, and 50% of all samples from CAR amplified Blastocystis.
In the dataset using a blocking primer, there were no consistent trends to co-occurrence or coexclusion of any of the eukaryotes with each other at lower taxonomic level (Figures S6A-C, Supporting Information). However, there were clear negative correlations between fungi and protozoa/helminths at higher taxonomic level (Figures S6D-G, Supporting Information). The same trends were observed in the dataset not using a mammalian blocker (data not shown). Several co-occurrences and coexclusions were observed in the fungal ITS2 dataset, yet often with weak associations (Figure S7, Supporting Information). There was also extensive coexistence of different Blastocystis subtypes within a single individual (Table S4, Supporting Information).
In summary, our data show that the eukaryome of African children has high intersubject variability and is dominated by the protists Blastocystis and Entamoeba and different fungi.

The mycobiome of African children is varied and dominated by members of Saccharomyces
The fecal mycobiome was largely dominated by Ascomycota (Figure 2) and more precisely by members of the group Saccharomycetales (average rel. abundance 74%; Tables S5 and S6, Supporting Information). There were very few fungi conserved on the lower taxonomic level. We detected five fungal genera to be present in at least 50% of all fecal samples and with a relative abundance of at least 0.001%: Humicola, Malassezia, Cladosporium, Candida, and [...]
[Figure caption: Samples were considered to be positive for a given genus if they had at least a single read relating to this genus. Genera indicated in green are of probable environmental origin. Groups were compared using the Pearson chi-squared test and Benjamini-Hochberg correction for multiple testing. The colour code illustrating the degree of significance of the association is given at the bottom of the figure.]
Thus, our data highlight a dominance of Ascomycota in the GI tract of African children, with members of the Saccharomycetales as the main constituents of the fecal mycobiome.

The fecal eukaryome of children is strongly influenced by the country of residence
Overall, our data reveal that the fecal eukaryome of African children is influenced by country of residence (Figures 1C-4; Figures S8B, S9B-D, S10, Supporting Information). For all amplicon datasets, country of origin significantly contributes to the overall beta diversity when assessing both relative abundance (Bray-Curtis dissimilarity index; Figure 4D) and presence-absence (Jaccard index; Figures S9B-D, Supporting Information). Further, the prevalence of several taxa was significantly different between the two countries in a bivariate analysis (Figures 2 and 3; Figure S7B, Supporting Information). The relative abundance of several of these taxa remained significantly associated with the country of origin in a multivariate model correcting for sequencing run, total fungal reads, and intestinal inflammation (Figure S10, Supporting Information).
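For readers less familiar with the two beta-diversity measures named above, the following sketch shows how each is computed for a single pair of samples. It is an illustration only: the function names and the toy count vectors are assumptions, and the study's own analysis was presumably run with dedicated ecology software on the full ASV tables.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors (same taxon order)."""
    total = sum(u) + sum(v)
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(a - b) for a, b in zip(u, v)) / total


def jaccard_distance(u, v):
    """Jaccard distance on presence/absence: 1 - shared taxa / taxa in either sample."""
    present_u = {i for i, a in enumerate(u) if a > 0}
    present_v = {i for i, b in enumerate(v) if b > 0}
    union = present_u | present_v
    if not union:
        raise ValueError("both samples are empty")
    return 1 - len(present_u & present_v) / len(union)


# Toy read counts for four taxa in two samples (invented numbers).
sample_a = [10, 0, 5, 5]
sample_b = [5, 5, 0, 10]

bc = bray_curtis(sample_a, sample_b)        # weighs differences in abundance
jd = jaccard_distance(sample_a, sample_b)   # ignores abundance, uses presence only
```

Bray-Curtis responds to shifts in relative abundance even when the same taxa are present, whereas Jaccard changes only when taxa appear or disappear, which is why the two indices are reported side by side.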
[Figure caption: Samples were considered to be positive for a given genus if they had at least a single sequence relating to this genus. Groups were compared using the Pearson chi-squared test and Benjamini-Hochberg correction for multiple testing. The colour code illustrating the degree of significance of the association is given at the bottom of the figure.]
In the 18S dataset without a mammalian blocker, prevalence of Ascaris, Trichuris, Entamoeba, and Blastocystis was slightly higher in Madagascar, while Candida, Pichia, two different groups of unassigned Saccharomycetales, and one unknown enteromonad were more prevalent in CAR (Figure S8B, Supporting Information).
When correcting for sequencing run, total microeukaryotic reads, and intestinal inflammation (alpha-antitrypsin levels and calprotectin levels), the two genera of unassigned Saccharomycetales, Pichia, Entamoeba, and Candida remained significantly associated with the country of origin in the full dataset of fecal samples.
In the reduced dataset using a mammalian blocking primer, we further found Entamoeba polecki to be significantly more prevalent in Madagascar compared to CAR (FDR = 0.007) in a multivariate model correcting for anemia status, intestinal inflammation, age, and sequencing depth. Differences in alpha diversity according to the country of origin were visible in the 18S dataset but not in the ITS2 dataset (Figure 1C). The differences in microeukaryotes across taxa were further confirmed on a subset of samples and microeukaryotes using microscopy-based approaches (Table S19, Supporting Information). Thus, our data show a clear influence of the country of origin on the eukaryotic community of fecal samples.
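The FDR values quoted in this section come from Benjamini-Hochberg correction of per-taxon P values. A minimal sketch of that adjustment is given below; the function name and the example P values are invented for illustration and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P values for four taxa tested against country of origin.
raw_p = [0.01, 0.04, 0.03, 0.20]
fdr = benjamini_hochberg(raw_p)
```

Each raw P value is scaled by the number of tests divided by its rank, so a taxon is only called significant if its P value is small relative to how many taxa were tested.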

High genetic diversity of Blastocystis and frequent mixed colonization
In the dataset sequenced using a mammalian blocking primer, 78% of all fecal samples had at least one subtype of Blastocystis detected (189/241). The most prevalent subtype detected was ST3 (53% prevalence; 128/241), closely followed by ST1 (46%; 111/241) and ST2 (33%; 79/241; Tables S6-S9, Supporting Information). Of the Blastocystis-positive samples, 49% had only one Blastocystis subtype detected by 18S amplicon sequencing and 51% displayed mixed colonization (Table S4, Supporting Information). Overall, there was a slightly higher prevalence of Blastocystis ST3 in Madagascar compared to CAR (FDR = 0.05). Blastocystis subtypes were further divided into subclusters using phylogenetic analysis (Figure 3B; Table S1, Supporting Information). Several of the subclusters showed a different distribution according to the country of origin of the children: there was a higher prevalence of subclusters ST3-DALASV6 and ST3-DALASV94 in feces from Malagasy children compared to CAR children. We further observed a lower prevalence of subcluster ST2-MIASV87 in Madagascar compared to CAR (Figure 3B).
Together, these results indicate that children frequently have mixed colonization of distinct Blastocystis subtypes and that genetic variability is high within Blastocystis subtypes. Some genetic variants (subclusters) show country-specific differences in prevalence.
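The prevalence and mixed-colonization figures above are simple proportions over per-sample subtype calls. The sketch below shows one way to derive them; the data layout (a mapping from sample identifier to the set of subtypes detected), the function name, and the toy samples are assumptions, while the final check uses only the counts reported in the text.

```python
def subtype_summary(samples):
    """Summarize subtype calls.

    samples: dict mapping sample id -> set of Blastocystis subtypes detected
    (hypothetical layout; an empty set means no Blastocystis reads).
    """
    n = len(samples)
    positive = [subtypes for subtypes in samples.values() if subtypes]
    mixed = [subtypes for subtypes in positive if len(subtypes) > 1]
    counts = {}
    for subtypes in positive:
        for st in subtypes:
            counts[st] = counts.get(st, 0) + 1
    return {
        "positive_fraction": len(positive) / n,
        "mixed_fraction_of_positive": len(mixed) / len(positive) if positive else 0.0,
        "subtype_prevalence": {st: c / n for st, c in counts.items()},
    }


# Toy data: four samples, one of them with two subtypes (invented).
toy_samples = {"s1": {"ST1", "ST3"}, "s2": {"ST3"}, "s3": set(), "s4": {"ST2"}}
summary = subtype_summary(toy_samples)

# Counts reported in the text, converted to rounded percentages of 241 samples.
reported_counts = {"any subtype": 189, "ST3": 128, "ST1": 111, "ST2": 79}
reported_percent = {k: round(100 * v / 241) for k, v in reported_counts.items()}
```

Because one sample can carry several subtypes, the subtype prevalences need not sum to the overall positive fraction; in the reported data ST3, ST1, and ST2 together exceed 78% for exactly this reason.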

Different species of Entamoeba are part of the intestinal eukaryome
All sequences from Entamoeba were placed in a phylogenetic tree (Figure S4, Supporting Information) to refine the classification at species level. We detected five Entamoeba species: E. coli, E. dispar/histolytica (the ASVs detected there are all more similar to the nonpathogenic E. dispar, though it is not possible to confidently distinguish these species by amplicon sequencing), E. polecki, E. bovis, and E. hartmanni. Entamoeba polecki, previously known as Entamoeba chattoni, was detected in 8% of samples (20/242). We detected E. dispar/histolytica in 21% of all samples (50/242), E. coli in 22% (54/242), E. hartmanni in 22% (53/242), E. polecki in 8% (20/242), and E. bovis in 10% (25/242). Roughly one-fourth of the Entamoeba-positive samples had more than one Entamoeba species present. Further, for both Blastocystis and Entamoeba, we saw a negative correlation with the relative abundance of fungi and a positive correlation between Blastocystis and Entamoeba (Figure S6, Supporting Information). In conclusion, our data show the presence of different species of Entamoeba in fecal samples with pronounced country-specific differences in the prevalence of specific subtaxa. Most samples show only a single Entamoeba species at a time.

Stunting is associated with altered abundance of certain members of the eukaryome
We next assessed if different clinical factors, including anemia, stunting status, and environmental enteric disease (measured through the inflammatory markers fecal calprotectin and alpha-1-antitrypsin), are associated with significant changes in the eukaryome composition in the GI tract (Figures 2 and 3; Figure S7B, Supporting Information). Overall, clinical factors contributed only marginally to the beta diversity (Figure 4D). Further, there was little influence of these clinical parameters on alpha diversity (Figure 1C), except for anemia and calprotectin levels, which contributed marginally to the alpha diversity of the 18S dataset using mammalian blockers. In a bivariate model, prevalence of Ascaris, Trichuris, and Saccharomycetales was associated with anemia (Figure S8, Supporting Information). However, no specific taxa were consistently associated with anemia, alpha-1-antitrypsin, or calprotectin levels in a multivariate analysis assessing the relative abundance of the reads in either the 18S or the ITS2 datasets (data not shown).
We then tested for associations between taxa prevalence and relative abundance and stunted growth in the ITS and 18S datasets (Figure S13, Supporting Information). In the ITS dataset, the relative abundance of one taxon, Fusarium equiseti, was associated with stunted growth in CAR in a bivariate analysis assessing stunted growth as a categorical variable (corrected P = .03) and in a multivariate analysis (DESeq2) that corrected for age, sequencing run, and total fungal reads (P = .001). Fusarium equiseti was also found with higher prevalence in stunted children compared to nonstunted controls in a multivariate logistic regression model (P = .03) correcting for country of origin, calprotectin levels, sequencing run, total fungal reads, and gender. Together, these data suggest that there might be more F. equiseti in stunted children compared to nonstunted controls.
In the 18S dataset using the mammalian blocking primers, there was a trend of reduced Blastocystis relative abundance in stunted children compared to nonstunted controls. This was observed in the overall dataset (including both study sites) performing a bivariate correlation analysis (FDR = 0.07). Stratifying the analysis by country, this trend was observed in bivariate analysis using stunting either as a categorical variable in Antananarivo (FDR = 0.08) or as a continuous variable (FDR = 0.016), but not in CAR. The association with Blastocystis was not significant in a multivariate model correcting for intestinal inflammation, anemia, age, and gender, which are known modulators of the microbiota. The same trends were observed in the 18S dataset not using a mammalian blocking primer. Further, in both 18S datasets, several members of the Saccharomycetales were consistently associated with stunting in Bangui, CAR.
Thus, our data indicate that there are consistent trends for lower Blastocystis and Saccharomycetales levels and higher F. equiseti levels in stunted children in both study sites. The associations are independent of intestinal inflammation and/or anemia.

The human eukaryome differs along the GI tract
We examined the mycobiome and eukaryome along the GI tract. Data for the eukaryome of the upper GI tract are sparse; most of the microeukaryotic reads in the gastric and duodenal samples belong to host, dietary, and environmental sources, while gut residents, such as Entamoeba and Blastocystis, were sporadically observed. After filtering out vertebrate, arthropod, and plant reads (which are likely of dietary origin), there were only few gastric and duodenal samples with microeukaryotic reads with and without the mammalian blocking primer (Figure S3A and B, Supporting Information).
The mycobiome (assessed through ITS2 sequencing) of the upper and lower GI tract differed significantly in its alpha- and beta-diversity. Fecal samples had a higher number of observed taxa, but lower evenness and thus a lower overall diversity as measured by the Shannon index (Figure 5A; P < .001). Further, the gastric and duodenal samples showed a similar composition (Figure 5C), and both differed significantly from the fecal samples when assessing relative abundance (Bray-Curtis index, Figure 5B; P = .006) and presence/absence (Jaccard index; Figure S9A, Supporting Information; P = .002). The ratio of Basidiomycota/Ascomycota was significantly higher in gastric and duodenal samples compared to fecal samples (Figure 5C; P = .003). Several members of Saccharomycetales were found at higher levels in the feces compared to the upper GI tract, including Saccharomyces, Kazachstania, Debaryomyces, Wickerhamomyces, and Meyerozyma, while Humicola is more abundant in the upper GI samples (Figure 5D). Further, these fungal genera remained significantly different between fecal and duodenal samples in a multivariate analysis correcting for inflammatory status, sequencing run, and total fungal read count in the combined dataset (Figure 5E), and when the ITS dataset was stratified by country of origin (Figure S11, Supporting Information). The same trends were also observed in a reduced sample set including only subjects with samples for all three GI tract locations (Figure S12, Supporting Information).
Our data thus clearly show that the eukaryome differs between different sites of the GI tract.
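The observation that fecal samples have more observed taxa yet lower Shannon diversity follows from how the index weighs evenness. The sketch below illustrates this with invented count vectors; the function names and numbers are assumptions for the example and do not reproduce the study data.

```python
import math


def observed_richness(counts):
    """Number of taxa with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def inverse_simpson(counts):
    """Inverse Simpson index 1 / sum(p^2)."""
    total = sum(counts)
    return 1 / sum((c / total) ** 2 for c in counts if c > 0)


def pielou_evenness(counts):
    """Shannon index divided by its maximum, ln(richness)."""
    richness = observed_richness(counts)
    return shannon(counts) / math.log(richness) if richness > 1 else 0.0


# Invented examples: an even four-taxon sample and a seven-taxon sample
# dominated by a single taxon.
even_sample = [25, 25, 25, 25]
dominated_sample = [94, 1, 1, 1, 1, 1, 1]
```

Here `dominated_sample` has the higher richness (7 versus 4 taxa) but the lower Shannon index, because one taxon accounts for most reads. The same pattern would be expected for a community dominated by a single group, as described above for Saccharomycetales in feces.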

Discussion
Eukaryotes are important members of the GI tract, contributing to the consumption of nutrients and vitamin production (Schei et al. 2017) and directly modulating both the bacterial microbiota composition (Morton et al. 2015, Zaiss et al. 2015, Audebert et al. 2016, Nieves-Ramirez et al. 2018, Stensvold and van der Giezen 2018) and the immune system (Underhill and Iliev 2014, Zaiss et al. 2015, McFarlane et al. 2017, Richard and Sokol 2019). While the field of parasitology is well-established, the diversity and ecology of the human intestinal eukaryome and its variability across individuals remain poorly studied. We describe the human eukaryome and mycobiome in a large cohort of children living in Africa without diarrhea or acute GI symptoms. We find a diverse community of eukaryotes, which does not seem to be strongly influenced by clinical factors. The results here are consistent with the emerging viewpoint that many eukaryotes present in the human GI tract may be minimally harmful, commensal, or of potential benefit (Parfrey et al. 2014, Andersen and Stensvold 2016, Stensvold and van der Giezen 2018). Pathogenic protists such as E. histolytica and Cryptosporidium sp. are major sources of morbidity and mortality among children in resource-poor settings (Turkeltaub et al. 2015), and have rightly garnered much research attention. These taxa are rare here, likely because this study excluded children with acute GI disease (Vonaesch et al. 2018). Further, many carriers of E. histolytica and G. intestinalis do not show (severe) symptoms and their virulence is determined by a complex interplay between parasite, host, and microbiota composition (Marie and Petri 2014). Last, it has also been shown that the eukaryotic microbiota composition is associated with the bacterial microbiota composition (Morton et al. 2015) and that the bacterial community has in turn an influence on Entamoeba virulence in the human GI tract. Indeed, there is increased awareness that not only bacteria, but the whole microbial community inhabiting the GI tract, including protists, helminths, fungi, archaea, and viruses, plays an important role in the overall ecosystem as well as in cross-talk with the host and regulation of virulence [reviewed in Hirt (2019), Rowan-Nash et al. (2019), and Ungaro et al. (2019)]. Integrated studies across all domains of life are therefore increasingly needed to understand the real implication of the human microbiome in health and disease.
To our knowledge, this is the first description of the eukaryotic community comparing different GI compartments and the first sequencing-based study to assess the eukaryotes in the human stomach and small intestine. Few gastric and duodenal samples yielded microeukaryotic reads in the 18S datasets, so we cannot compare the community across the GI tract. However, similar taxa were observed in the upper GI tract and in feces, with Blastocystis most prevalent. Our data reveal that there is a clear compartmentalization in the communities found along the GI tract for fungi. Fungal composition was very similar between the gastric and duodenal samples, and both are distinct from the fecal composition, similar to the bacterial community (Vonaesch et al. 2018, 2022). While this might be expected, given the vastly different environments encountered at each site, this has so far never been reported. Earlier research described few acid-tolerant fungi like Candida and Phialemonium (von Rosenvinge et al. 2013) [reviewed in Hallen-Adams and Suhr (2017)] in the stomach, and culture-based approaches identified Candida in the small intestine (Heyworth and Brown 1975, Minoli et al. 1981). In our study, we assessed the fungal composition of 196 duodenal samples from two different countries and detected a vast diversity of strains, most of which are probably transients. Overall, the two most frequently detected taxa were H. grisea (70% in Madagascar, 84% in CAR) and M. restricta (52% in Madagascar, 71% in CAR). Humicola grisea is likely an environmental transient rather than a true colonizer of the gut: it is a well-known thermophilic, soil-dwelling fungus and was not part of the eukaryotic community described in earlier studies assessing the feces in urban industrialized countries (Suhr and Hallen-Adams 2015, Hallen-Adams and Suhr 2017). Malassezia and Cladosporium, the two other most prevalent taxa in the upper GI tract, which are also frequently found in the feces here and in urban industrialized countries (Nash et al. 2017, Auchtung et al. 2018), show similar prevalence in all three compartments, suggesting that they could be true colonizers, but further experimental work is necessary (Hallen-Adams and Suhr 2017).
Malassezia restricta is also known to be a natural member of the human skin microbiota (Vijaya Chandra et al. 2020). Malassezia has also been associated with IBD in a recent patient study and its direct implication has been validated in subsequent experiments in mice (Limon et al. 2019).
In line with previous results from urban industrialized countries (Stensvold et al. 2011b, Parfrey et al. 2014), our data reveal that the fecal eukaryome of African children is dominated by fungi and protists, especially Blastocystis and Entamoeba. The patterns that there are few taxa in any given individual and high variability across the population are perhaps surprising when evaluated in the framework of the bacterial microbiome. Indeed, the fecal prokaryome harbours overall more taxa and a much higher proportion are widely shared across individuals. However, these patterns have been previously observed in humans and other mammals (Parfrey et al. 2014, Bachmann et al. 2015). The communities observed in this cohort are more diverse than in industrialized countries both in terms of the broader taxonomic groups represented (e.g. Parabasalids, Blastocystis) and greater diversity within taxonomic groups. At the individual level this means there are more so-called mixed colonizations, e.g. multiple subtypes and strains of Blastocystis and Entamoeba were detected in ∼50% of individuals that harbour these taxa. We did not detect Chilomastix, Dientamoeba, or Cryptosporidium in our 18S datasets, despite the fact they are widespread in the human fecal eukaryome (Barratt et al. 2011, Roser et al. 2013, Turkeltaub et al. 2015, Jokelainen et al. 2017, Greigert et al. 2018). There is evidence that D. fragilis is more prevalent in westernized compared to traditional communities (Barratt et al. 2011), which could explain the absence of D. fragilis in our study. Another hypothesis is that the low sequencing depth was insufficient to detect D. fragilis and Chilomastix, which are of low prevalence in the microscopy-based analysis. Last, we identified three primer mismatches of the primers for Parabasalids (Dientamoeba) and for Fornicata (including Chilomastix), which could explain the missing reads. Cryptosporidium is mainly associated with diarrhoea (Checkley et al. 1998, Costa et al. 2011, Mekonnen et al. 2019), so it was likely not detected in our cohort due to the choice of children included in the study. In line with this hypothesis, Cryptosporidium was also not detected through microscopy.
Last, recent evidence suggests that there are age-dependent changes not only in the bacterial microbiota community but also in the eukaryotes. Indeed, two studies in the USA and Ireland showed Blastocystis to be less prevalent in children compared to adults (Scanlan et al. 2016, 2018). This contrasts with the high prevalence of protists in the cohort of children described here. This difference might be due to a very high exposure to contaminated drinking water in these areas (Habib et al. 2021, Vonaesch et al. 2021), favouring the colonization by these microorganisms in exposed children. This would be in line with a previous study in children from Mexico, showing a high prevalence of Blastocystis already in early life (Partida-Rodriguez et al. 2021). More work is needed to understand the early life dynamics of eukaryotes within the human GI tract.
The fecal mycobiome of African children resembles the mycobiome reported in westernized countries [reviewed in Richard and Sokol (2019)]: in most studies, Candida (particularly Candida albicans), Saccharomyces (particularly S. cerevisiae), Penicillium, Aspergillus, Cryptococcus, Malassezia (particularly M. restricta), Cladosporium, Galactomyces, Debaryomyces, and Trichosporon were detected in decreasing prevalence. In addition to these species, we frequently detected Humicola, Kazachstania, and Wickerhamomyces. Humicola grisea is an environmental fungus (Wang et al. 2019). Kazachstania humilis, also called Candida humilis, and K. exigua are normally found in fermented food (Garcia-Ortega et al. 2022). Wickerhamomyces anomalus, also called Saccharomyces anomalus or Pichia anomala, is often associated with spoilage or processing of food (Masneuf-Pomarede et al. 2015) and has also been found in the GI tract of insects (Cappelli et al. 2020). Very few eukaryotic taxa were present in the feces of at least 50% of children, including H. grisea and M. restricta in the ITS2 dataset. Additional studies describing the mycobiome of the food and environment in conjunction with the gut mycobiome, ideally also including source tracking and/or repeated sampling, would be needed to determine where gut fungi originate, and which are true colonizers of the gut (Lavrinienko et al. 2021). Here and in other studies, fungal diversity within a sample (alpha diversity) is relatively low, Ascomycota and Basidiomycota are the dominant phyla, and many of the same genera are found (Hoffmann et al. 2013, Chehoud et al. 2015, Richard et al. 2015, Nash et al. 2017, Richard and Sokol 2019). The Ascomycota/Basidiomycota ratio was previously associated with gut health and has been shown to be lowered in adults (Sokol et al. 2017) as well as children (Chehoud et al. 2015) suffering from IBD. However, in our study set, there was no association between this ratio and either stunting, anemia, or intestinal inflammation. There was, however, a clear difference in the ratio by country of origin, with higher Basidiomycota levels in Madagascar compared to CAR.
Which of the eukaryotes truly colonize the human gut and which are only transient is a fundamental but unanswered question and one that is increasingly debated for fungi (Hallen-Adams and Suhr 2017, Fiers et al. 2019). Recent evidence suggests that fungal colonization of the intestinal tract of healthy individuals is minimal (Auchtung et al. 2018), and raises the possibility that a small minority of the fungi detected in the gut mycobiome are residents of the gut (exemplified by Candida albicans; Fiers et al. 2019). More than two-thirds of all species reported in two previous studies were found only in a single sample, suggesting that they come from environmental sources (Suhr and Hallen-Adams 2015, Hallen-Adams and Suhr 2017). We hypothesized that true residents are likely to be shared across several individuals, while food contaminants might be distributed less consistently. We thus tried to address this point by filtering the taxa according to their prevalence. Of note, however, even commonly detected gut fungi, such as Debaryomyces hansenii or Penicillium, do not grow at 37 °C (Fiers et al. 2019). Similarly, several of the microeukaryotes detected, such as the rotifer Rotaria, are common inhabitants of freshwater and therefore likely environmental contaminants. Even within known gut taxa we detect strains that are likely transient in humans and true residents of other mammals, such as E. bovis. These results reiterate the importance of critically examining the eukaryotic community rather than assuming all sequences obtained are members of the eukaryome.
Transient fungi might contribute to intestinal disturbances, especially if they produce mycotoxins (Smith et al. 2012); F. equiseti, Penicillium, Fusarium, and Aspergillus are among the mycotoxin producers detected here (Goswami et al. 2008, Munkvold 2017). Fusarium equiseti is a plant pathogen of cereals, field weeds, durian, and goji berries, among others, and is found in both tropical and temperate regions (Goswami et al. 2008, Munkvold 2017). Further, colonization with F. equiseti has been associated with a vegetarian diet (Hallen-Adams and Suhr 2017). Interestingly, we found a consistent trend towards higher levels of F. equiseti in stunted children compared to healthy controls. This association might be due to a diet low in meat and high in plant matter consumed by stunted children (Vonaesch et al. 2021), favouring the colonization by F. equiseti (Hallen-Adams and Suhr 2017). It is also tempting to speculate that this fungus is contributing to the pathophysiology underlying stunted growth; however, mycotoxin profiling as well as experimental data are needed to establish a causal relationship between these toxins and intestinal disturbances.
The diversity of protists and nematodes detected here resembles earlier reports from humans living in Sub-Saharan Africa with high prevalence of Blastocystis, Entamoeba, Trichomonads, and yeasts (Parfrey et al. 2014, Greigert et al. 2018). The catalogue of gut eukaryotes observed here is comparable to other parts of the world, albeit with higher diversity than typically found in Europe (Forsell et al. 2012, Scanlan et al. 2018, Lhotska et al. 2020). Nematodes (especially Ascaris and Trichuris trichiura) were mainly found in Madagascar and only rarely in the CAR, possibly related to deworming medicine or lifestyle choices. However, local hotspots of nematode colonization were previously reported from several places in Africa and are not associated with deworming campaigns (Moser et al. 2017, Schulz et al. 2018).
Diversity within common gut protists is high: diverse Entamoeba species colonize the intestines of humans and nonhuman primates (Stensvold et al. 2011b, Elsheikha et al. 2018, Stensvold 2019, Dos Santos Zanetti et al. 2021), and in this cohort we frequently detected E. polecki, E. dispar/E. histolytica, E. hartmanni, and E. coli. Entamoeba bovis is most often associated with ungulates and is a potential transient here. Entamoeba histolytica, a true pathogen, is detected less frequently and is difficult to distinguish from E. dispar by 18S sequencing (a specific qPCR is needed to make the molecular distinction). The diversity of Entamoeba detected here is high, and half of individuals positive for Entamoeba harbor multiple species and/or subtypes.
We find a trend towards greater Entamoeba presence in nonstunted compared to stunted children. Two studies assessing a possible correlation between infection with specific pathogens and stunted growth are currently ongoing within the Afribiota project.
Blastocystis is the most common protist within this cohort and widespread in humans and other mammals. Overall, Blastocystis ST1, ST2, and ST3 are predominant here, as in other human populations (Stensvold et al. 2020, Forsell et al. 2012), and ST4, which is common in Europe, is absent. The diversity of Blastocystis is similar between countries, though the prevalence of several strains (subclusters) differs between countries. In contrast to two previous studies in Europe (Flemish Gut Project and Twins UK) (Tito et al. 2019), more than half of the children included in our study showed concomitant presence of several Blastocystis subtypes in their feces. Our results suggest that different Blastocystis subtypes and Entamoeba species often coexist. This confirms a recent study in Cameroon in which it was shown that some subtypes of Blastocystis and Entamoeba species show a positive correlation in their occurrence (Even et al. 2021).
One surprising finding that emerges from our study is the low correlation between the eukaryome and clinical variables.
In previous studies, particular fungal taxa were associated with increased inflammation in the context of IBD and/or colorectal cancer (Ye et al. 2006). In industrialized countries, lower prevalence of Blastocystis has been reported in patients with inflammatory diseases including colorectal cancer or Crohn's disease compared to control individuals (Beghini et al. 2017, Tito et al. 2019). Indeed, these microaerobic protists thrive in low oxygen levels, while higher oxygen levels are a hallmark of gut inflammation, and Blastocystis could therefore be a marker of a 'healthy gut environment', meaning a gut environment that remains largely anaerobic (Audebert et al. 2016, Stensvold et al. 2020). Entamoeba is also reported to be inversely correlated with inflammatory diseases (Morton et al. 2015). Here we see trends of lower prevalence for E. coli in stunted children compared to nonstunted controls, though the association was not significant after correcting for multiple testing. While these results might be confounded by the fact that we had only limited sequencing depth and a very high variability of taxa for fungi, likely decreasing statistical power, our results suggest the eukaryotic community, at least in our two study sites, is not dramatically reorganized by the clinical variables measured. More research is needed to better elucidate the role of eukaryotes in the pathophysiology associated with a dysbiotic microbiota.
Overall, our results from two sites in Sub-Saharan Africa show little correlation between stunting and the eukaryome, or individual eukaryote prevalence. This could suggest that eukaryotes are less directly involved in the pathophysiology of chronic childhood undernutrition, as suggested also in a targeted analysis of the parasites in the Madagascar study site of Afribiota (Habib et al. 2021). Our results could also reflect that eukaryotes are less influenced by the altered gut environment compared to bacteria (Vonaesch et al. 2018). Further studies are, however, needed to corroborate this point and exclude any technical bias or site-specific effects. Earlier reports showing a direct role of helminths in undernutrition led to the WHO recommendation for systematic antihelminthic treatments. However, our observation is consistent with a recent meta-analysis of 80 studies showing that helminths are not directly involved in undernutrition (Raj et al. 2022). In line with this observation, a recent meta-analysis found very little impact of antihelminthic treatments on stunting or development (Taylor-Robinson et al. 2019). However, the cross-sectional design of our study does not allow us to capture events that initiate the pathophysiology of stunting, and we therefore cannot rule out a role for eukaryotes in leading to long-term undernutrition. Further, as many children harbor several eukaryotes at the same time, it is possible that effects of individual taxa are masked because, e.g., different eukaryotes might have opposite effects on the immune system that cancel each other out.
This work provides a foundation for future studies assessing the eukaryome in disease contexts and for longitudinal studies on the establishment and role of the human eukaryome. Future studies should also assess interactions between prokaryotes and eukaryotes in the gut microbiome to reveal cross-kingdom community structures and dynamics potentially influencing gut homeostasis and disease, and assess how these interactions are shaped by diet and subsistence, which were previously shown to play a role in a study on nonhuman primates (Sharma et al. 2022) as well as in several studies in humans (Morton et al. 2015, Rowan-Nash et al. 2019).
One strength of our study lies in the fact that we combine the analysis of the 18S rRNA gene for the overall eukaryome, with and without an additional primer blocking amplification of human DNA, and of the internal transcribed spacer (ITS) region for targeted analyses of the mycobiome, and that we confirm the presence of given eukaryotes on a subset of samples using microscopy. We show that the use of a mammalian blocking primer altered the observed community structure: higher diversity was detected within a sample, especially of protists, and microeukaryotes were detected in a higher proportion of samples. This shows a limitation of amplicon-based profiling and calls for standardized primers and protocols to allow for comparisons between different studies and locations. Combining the ITS and 18S rRNA gene approaches, we were able to show that fungal and microeukaryotic diversity in the gut of African children is only marginally correlated with clinical factors, yet strongly shaped by geographic location, most likely through diet and other environmental exposures.
Our study has a few limitations. A single sample was taken from each child and we included only children aged 2-5 years. Therefore, the study does not allow assessing dynamic changes, nor does it allow any assumptions about early life succession. Further, storage protocols differed slightly across the two sites, and DNA was extracted with the same kit but in two different geographic locations and by different experimenters. Thus, we might have overestimated geographic differences. The overall impact of geography on the microeukaryome was, however, confirmed on a subset of samples using microscopy. Further, given that we only included two study sites in our project, the results about the overall diversity of eukaryotes might be specific to our study context and might not apply to other LIMC settings.
Further, since the implementation of our study, new amplicon-based sequencing methods have been developed, allowing for the detection of a broader range of eukaryotes within the human microbiome (Popovic et al. 2018). Last, PCR-based approaches used for the taxonomic profiling of a given microbiota are at best semiquantitative in nature and typically do not integrate various taxonomic groups. This hurdle can to some extent be overcome by metatranscriptomic approaches, which can provide a more global, unbiased and quantitative approach, as exemplified in a recent publication on idiopathic chronic diarrhea in macaques (Westreich et al. 2019). Thus, as with any sequencing-based study, the data presented here are likely not representative of the true diversity of eukaryotes found in the feces of these children.
Nevertheless, by comparing the eukaryome of almost 1000 children, of whom roughly half showed fecal read counts for the 18S rRNA gene and/or the ITS2 region, in two different geographic locations of Sub-Saharan Africa, our data contribute valuable insights about the human eukaryome and set the stage for more targeted analyses of eukaryome dynamics and of the role of the eukaryome in health and disease.
In conclusion, our study clearly shows that African children harbour a specific eukaryome, which is compartmentalized along different sites of the GI tract and is strongly influenced by country of residence.

Figure 1. Composition of the Afribiota samples as revealed by ITS2 sequencing targeting fungi (A), the broader 18S primers using a mammalian blocker (B), as well as association of the overall diversity with different clinical outcomes (C). Significant associations with the Shannon index as a measure of alpha diversity were assessed using a Wilcoxon rank-sum test. The colour code illustrating the degree of significance of the association is given at the bottom of the figure.

Figure 2. Differences in the fecal mycobiome in relation to geographic location and different clinical outcomes. Samples were considered to be positive for a given genus if they had at least a single read relating to this genus. Genera indicated in green are of probable environmental origin. Groups were compared using the Pearson chi-squared test and Benjamini-Hochberg correction for multiple testing. The colour code illustrating the degree of significance of the association is given at the bottom of the figure.
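The prevalence comparison named in this caption (a Pearson chi-squared test per genus, followed by Benjamini-Hochberg correction) can be illustrated with a minimal sketch. This is not the authors' analysis code. The function names are hypothetical, the test is restricted to 2x2 presence/absence tables without continuity correction, and any counts passed to it are for illustration only.

```python
import math


def chi2_2x2_pvalue(a, b, c, d):
    """Pearson chi-squared p-value for the 2x2 table [[a, b], [c, d]].

    Rows are groups (e.g. country 1 / country 2); columns are samples
    positive / negative for a genus. No continuity correction is applied.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 1.0  # degenerate table: no evidence of a difference
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-squared distribution with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2.0))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In use, one p-value would be computed per genus from its presence/absence counts in the two groups, and the full list of p-values would then be passed once to `benjamini_hochberg`.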

Figure 3. Differences in the fecal microeukaryome in relation to geographic location and clinical variables. (A) All genera as detected by 18S sequencing using a mammalian blocking primer and (B) split in different Blastocystis clusters or Entamoeba species based on phylogenetic trees. Samples were considered to be positive for a given genus if they had at least a single sequence relating to this genus. Groups were compared using the Pearson chi-squared test and Benjamini-Hochberg correction for multiple testing. The colour code illustrating the degree of significance of the association is given at the bottom of the figure.

Figure 4. Differences in the microeukaryome (18S dataset) based on country of origin. PCoA plot based on the normalized Bray-Curtis dissimilarity index (log10) of the fecal dataset iteratively rarefied to 1000 microeukaryotic reads (A) with the ITS2 primers targeting fungi (CAR: n = 100, Madagascar: n = 202), (B) with a mammalian blocker using 18S primers (CAR: n = 54, Madagascar: n = 99), (C) without a mammalian blocker using 18S primers (CAR: n = 110, Madagascar: n = 154). Samples from CAR are coloured in blue, samples from Madagascar in red. (D) Association of different clinical factors with beta diversity using a PERMANOVA analysis on dispersion. The colour code is given in the figure.

Figure 5. Differences in the mycobiome along the GI tract. (A) Alpha diversity in the different compartments as measured by the Shannon index. (B) PCoA plot based on the normalized Bray-Curtis dissimilarity index (log10) of the dataset iteratively rarefied to 5000 fungal sequences (gastric: N = 148, duodenal: N = 132, and feces: N = 299). (C) Relative abundance of the different phyla according to sample type. (D) Differences in the mycobiome in relation to sampling location along the GI tract. Samples were considered to be positive for a given genus if they had at least a single sequence relating to this genus. Groups were compared using the Pearson chi-squared test and Benjamini-Hochberg correction for multiple testing. * P < .05; ** P < .01; *** P < .005; comparisons without an indication are nonsignificant. (E) Fungal genera showing significant differences in their relative abundance between duodenal and fecal samples in a DESeq2 model correcting for sequencing depth.
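For readers unfamiliar with the two diversity measures named in the captions, the following minimal sketch computes the Shannon index for one sample and the Bray-Curtis dissimilarity between two samples from raw count vectors. It is an illustration only. The function names are hypothetical, and the sketch does not reproduce the iterative rarefaction or the log10 normalization described in the captions.

```python
import math


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two count vectors of equal length.

    0 means identical composition; 1 means no shared taxa.
    """
    if len(u) != len(v):
        raise ValueError("count vectors must cover the same taxa")
    denominator = sum(u) + sum(v)
    if denominator == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / denominator
```

A pairwise matrix of `bray_curtis` values across all samples is the kind of input a PCoA ordination takes, while `shannon_index` yields one alpha-diversity value per sample.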
°C and might thus not be true residents of the gastrointestinal tract (Auchtung et al. 2018). Other common fungi may be transients that originate from food (e.g. Saccharomyces cerevisiae), are plant pathogens (such as Fusarium, Alternaria, and Botrytis), or derive from fungal communities of other parts of the body (Malassezia) or from the environment (Aspergillus) [reviewed in Auchtung et al. (2018)]. It is impossible to determine without experimental evidence if fungi are transient or resident members of the microbiota