-
PDF
- Split View
-
Views
-
Cite
Cite
Maryia Khomich, Håvard Kauserud, Ramiro Logares, Serena Rasconi, Tom Andersen, Planktonic protistan communities in lakes along a large-scale environmental gradient, FEMS Microbiology Ecology, Volume 93, Issue 4, April 2017, fiw231, https://doi.org/10.1093/femsec/fiw231
- Share Icon Share
Abstract
Despite their obvious importance, our knowledge about the eukaryotic microbial diversity of inland waters is still limited and poorly documented. We applied 18S rDNA amplicon sequencing to provide a comprehensive analysis of eukaryotic diversity in 74 low-productivity lakes along a 750 km longitudinal transect (5.40-18.52°E) across southern Scandinavia. We detected a wide diversity of pelagic microbial eukaryotes, classified into 1882 operational taxonomic units (OTUs). The highest OTU richness was found in traditional phytoplankton groups like Dinoflagellata, Chrysophyceae, Chlorophyta and Cryptophyta. A total of 53.6% OTUs were primarily autotrophic, while 19.4% of the heterotrophic OTUs belonged to putative parasitic taxa. Except for a longitudinal trend in the relative influence of mixotrophs, there were no significant associations between major functional groups (autotrophs, heterotrophs and parasites) and spatial and environmental variables. Community dissimilarity increased significantly with increasing geographical distance between lakes. In accordance with earlier, microscopy-based surveys in this region, we demonstrate distinct gradients in protistan diversity and community composition, which are better explained by spatial structure than local environment. The strong association between longitude and protistan diversity is probably better explained by differences in regional species pools due to differences in landscape productivity than by dispersal limitation or climatic constraints.
INTRODUCTION
Freshwater environments contain only around 0.01% of the world's water and comprise <1% of the land surface, but contain a disproportionately high fraction of the global biodiversity (e.g. one-third of global vertebrate diversity) (Dudgeon et al.2006; Strayer and Dudgeon 2010). However, knowledge of the total diversity of inland waters is incomplete, in particular among invertebrates and microbes (Dudgeon et al.2006). Freshwater habitats are of vital importance to global element cycling (Cole et al.2007) and other ecosystem services (Aylward et al.2005). ‘Lakes are particularly appealing subjects for ecological study in that they are self-contained ecosystems, discrete and largely isolated from others’ (Pianka 1974). Given their discrete nature, lakes may be good model systems for evaluating the effects of local dispersal and regional productivity on metacommunity properties (Leibold et al.2004; Ptacnik et al.2010; Hortal et al.2014). There is increasing awareness of how dispersal in spatial metacommunities contributes to maintaining biodiversity in freshwater ecosystems (De Meester et al.2005; Heino 2013). Dispersal mechanisms are better understood in fish and macroinvertebrates (Shurin, Cottenie and Hillebrand 2009) than in small microbial eukaryotes (protists) and bacteria. However, recent studies support dispersal limitation rather than cosmopolitan distribution among microbes (Martiny et al.2006; Heino et al.2010; Soininen et al.2011; Stomp et al.2011; Soininen 2012; Bates et al.2013).
Protists are key components in aquatic food webs both as producers of organic matter and as major consumers of bacterial biomass (Šlapeta, Moreira and López-García 2005). Recent molecular studies from marine (Logares et al.2012; Massana et al.2014; de Vargas et al.2015) and freshwater (Triadó‐Margarit and Casamayor 2012; Mangot et al.2013) ecosystems suggest that these environments represent a wide range of ecological niches and harbour enormous eukaryotic diversity. Phytoplankton monitoring has often focused on harmful blooms caused by anthropogenic nutrient pollution (Padisak et al.2006; O’Neil et al.2012), while relatively fewer studies have systematically investigated the general protistan diversity of oligotrophic lakes.
High-throughput sequencing (HTS) allows exploration of biodiversity in complex microbial communities in greater detail than hitherto possible (Johnson and Martiny 2015). Recently, HTS has extensively been applied in various aquatic microbial diversity surveys focusing on compositional changes along trophic or salinity gradients (Balzano, Abs and Leterme 2015), as well as seasonal fluctuations in microbial communities (Nolte et al.2010; Simon et al.2015). Freshwater eukaryotic assemblages have also been described in meromictic lakes (Charvet et al.2012), high mountain lakes (Kammerlander et al.2015), saline lakes (Wang et al.2014) and lakes of differing trophic status (Mangot et al.2013).
A set of lakes in a large region (e.g. boreal lakes) is an ideal model to study compositional variation from a perspective of multiple communities connected by dispersing organisms (Leibold et al.2004; Hortal et al.2014). Here, we investigate planktonic protistan communities over the west-east biodiversity gradient through southern Norway and Sweden (Ptacnik et al.2008b, 2010). Lakes for this study were carefully selected to be as similar as possible with respect to properties other than spatial position and local productivity (Fig. 1, Table S1, Supporting Information). Ptacnik et al. (2008b, 2010) found distinct regional trends in unicellular plankton diversity across Scandinavian lakes. Moreover, there are clear indications that plankton diversity actually affects functional aspects of lake ecosystems, such as resource use efficiency (Ptacnik et al.2008b). The Scandinavian diversity gradient is complex and not fully resolved as it coincides both with major changes in altitude, soil depth and landscape productivity, as well as the main post-glacial dispersal routes for freshwater organisms. Recurring glaciations in boreal areas can be considered an important, though neglected, historical climatic factor influencing biota (Soininen 2012). Recolonisation of Fennoscandia (Norway, Sweden, Finland and Karelia), which was entirely glaciated until ∼15 000 years ago, could occur either from the east (Russia) or from the south (Kontula and Väinölä 2001). Molecular studies show that fish (Perca fluviatilis) and crustacean (Gammarus lacustris) populations have invaded Fennoscandia from both directions (Refseth et al.1998; Vainio and Väinölä 2003).

A map of sampled lakes (n = 77) coloured by annual mean air temperature (°C) based on the WorldClim database (Hijmans et al.2005). Contour lines represent altitude below 600 m. The mountain ridge extends S-N around 8°E. Three lakes indicated by black colour were not included in the analysis due to glacial influence or high/low pH.
Local environmental effects must be controlled to resolve spatial patterns in lake biodiversity. The concentration of total phosphorus (TP) is often the primary limiting factor for primary production in lakes (Schindler 1977), especially in areas with high atmospheric deposition of inorganic nitrogen such as Southern Scandinavia (Elser et al.2009; Hessen et al.2009). The concentration of total organic carbon (TOC) affects the balance between heterotrophic and autotrophic processes in plankton communities by being both a carbon source for heterotrophic bacteria and a modulator of the underwater light availability (Jansson et al.2000; Thrane, Hessen and Andersen 2014). Biotic factors (predation, host availability for parasites, viral infection) also influence protistan diversity (Lepère et al.2006; Rasconi, Jobard and Sime-Ngando 2011; Zhao et al.2011; Triadó‐Margarit and Casamayor 2012).
The phytoplankton of lakes in this region is well studied by traditional microscopy (Rosén 1981; Brettum 1989; Brettum and Andersen 2004; Willén 2007; Ptacnik et al.2010), but this is the first comprehensive HTS survey of their protistan communities. Our objectives were as follows: (i) to analyse taxonomic and functional composition of pelagic protistan communities across a known biodiversity gradient, (ii) to determine what factors govern protistan community differentiation across lakes in this gradient and (iii) to estimate the relative roles of local and regional factors in influencing community structure.
MATERIALS AND METHODS
Site description
Lakes were selected from the ‘Rebecca’ (Solheim et al.2008) and ‘Nordic lake survey 1995’ (Henriksen et al. 1998) data sets on Norwegian and Swedish lakes to create a subset fulfilling the following criteria: longitude 5–18 °E, latitude 58–62°N, altitude <600 m, surface area >1 km2, TP < 30 μg L−1, TOC < 30 mg L−1 and pH > 5. These lakes represent a subset of boreal lakes with best possible coverage and orthogonality with respect to gradients of TP, TOC and longitudinal position. The former two represent two major effects on aquatic productivity (Thrane, Hessen and Andersen 2014), while the latter reflects the regional diversity gradient (Ptacnik et al.2010). This means that longitude will also be aligned with the regional productivity gradient, which Ptacnik et al. (2010) have shown can be represented by a distance-weighted average of TP in nearby lakes from an independent data set (TPreg). Longitude and TPreg were closely correlated in our study lakes with Pearson correlations from 0.93 to 0.97, for interpolation ranges from 100 to 500 km (Fig. S1, Supporting Information). The three gradient variables were split in two factor levels (high/low), giving eight combinations of TP, TOC and longitude. A total of 12 lakes were randomly sampled from each combination. Sampling was performed by hydroplane in July to August 2011 (Thrane, Hessen and Andersen 2014). Because of unfavourable weather conditions during sampling, the number of sampled lakes was eventually reduced to 77 (Fig. 1).
Sampling programme
Water samples were collected from the lake epilimnion (0–5 m) using an integrating water sampler (Hydro-BIOS, Germany) in the central part of each lake during daytime. For DNA analysis, up to 15 L of water was pre-screened on 100 μm mesh to remove large non-protistan organisms and filtered onto 47 mm 2 μm Isopore TTTP membrane filters (Millipore Corp., MA, USA) taken in 3×3 replicates. The filters were stored at –20°C in cryovials until DNA extraction. Samples for nutrients were collected as described in Thrane, Hessen and Andersen (2014). Concentrations of TP, TOC and total nitrogen were determined using standard techniques (for details, see Thrane, Hessen and Andersen 2014).
Chemical characteristics of the water (e.g. nutrients, pH and ionic strength) are the most relevant environmental factors determining changes in phytoplankton community composition. TOC and TP were chosen as proxies to represent regional environmental gradients and local nutrient supply variability, respectively. The third variable, conductivity, is directly related to the concentration of ions in dissolved salts, and serves as an indicator of soil depth and landscape productivity that is less affected by local pollution than TP (Ryder 1982). Since there is a close relationship between pH and conductivity in this data set (R2 = 0.65; p < 0.00001), conductivity can also be considered a proxy for pH. It is important to take into account that not all predictor variables are completely independent (Fig. S2, Supporting Information).
DNA extraction, amplification and 454-sequencing of the V4 SSU
DNA was extracted from the filters using NucleoSpin® Plant II Kit (Mackerey-Nagel, Düren, Germany) according to the manufacturer's instructions and quantified using Nanodrop (NanoDrop Technologies Inc., DE, USA). The hypervariable V4 region (∼380 bp) of the eukaryotic 18S rRNA (Stoeck et al.2010; Logares et al.2012) was amplified using universal primers. It is the gene's longest variable region and has relatively high taxonomic resolution (Dunthorn et al.2012). Fusion pyrosequencing primers were designed according to Roche specifications and included adaptors (adaptor A (5΄–3΄): CCATCTCATCCCTGCGTGTCTCCGACTCAG, adaptor B (5΄–3΄) CCTATCCCCTGTGTGCCTTGGCAGTC), key (TCAG) and 10-bp unique tags (MIDs in Roche technical bulletin 005-2009) and the V4 primers. PCR amplifications were performed on a PTC-200 DNA Engine Cycler (BioRad, USA) in 20 μl reaction volumes containing 4 μl DNA template (diluted 1:10), 1x Phusion HF buffer, 0.2 mM dNTPs, 0.25 μM of each fusion primer, 0.02 U/μl Phusion HotStart II polymerase (Finnzymes, Vantaa, Finland), 3% DMSO and 1 mg ml−1 BSA (New England BioLabs, Auckland, New Zealand). The amplification programme was as follows: 30 s at 98°C, followed by 30 cycles of 10 s at 98°C, 30 s at 53°C and 30 s at 72°C, with a final extension step at 72°C for 5 min before storage at –20°C. Amplification was verified on 1% agarose gels. PCR products were cleaned with a Wizard® SV Gel and PCR Clean-Up System (Promega, Madison, WI, USA), quantified using a Sequalprep Normalization Plate (96) Kit (Invitrogen, Paisley, UK) and pooled into equimolar amplicon libraries. Pooled libraries were quantified using Qubit dsDNA BR Assay Kit (Invitrogen). The 454 Titanium sequencing of the tagged amplicons was performed using GS FLX Titanium (Lib A chemistry kit) at the Norwegian Sequencing Centre at the University of Oslo (Norway) on 1/2 of a 454 FLX Titanium sequencing plate (454 Life Sciences, Branford, CT, USA). The raw 454 data with corresponding mapping files were deposited in Dryad (doi:10.5061/dryad.7s6s8).
Bioinformatics analyses for 454 reads
A total of 526 390 sequence reads from 87 samples were quality-filtered, denoised and processed using QIIME v. 1.5.0 (Caporaso et al.2010) on the Abel cluster at the University of Oslo unless otherwise indicated. All reads with mismatched forward and/or reverse tags were removed to avoid false positives (Carlsen et al.2012). Sequences with length <200 bp and >550 bp, average Phred quality score of <25, errors in the tags, homopolymers exceeding 6 bp, ambiguous base calls >1 and >1 mismatch in the primers were discarded. Additionally, a 50-bp sliding window (average quality score >25) was used to identify regions of low-sequence quality and sequences were truncated to the last good window. The resulting sequences (414 679) were denoised using DeNoiser v. 091 (Reeder and Knight 2010), and clustered into operational taxonomic units (OTUs) with UCLUST v.1.2.22 (Edgar 2010) with a 99% similarity threshold, –max_accepts = 20, and –max_rejects = 500. A high clustering threshold was used to allow for inclusion of highly related but distinct taxa (e.g. among ciliates, haptophytes) (Worden 2006; Doherty et al.2007; Egge et al.2013; Santoferrara et al.2014), and because the V4 region is characterised by rapid rates of evolution, and the data set was denoised (Logares et al.2014). OTUs globally represented by a single sequence were considered sequencing errors and removed (Quince et al.2009; Kunin et al.2010; Tedersoo et al.2010). Taxonomic assignments were made by comparing the most abundant (representative) sequence of each OTU against reference databases SILVA v111 (Quast et al.2013) and PR2 (Guillou et al.2013) using BLAST (Altschul et al.1990) with threshold e-value = 10−5. As taxonomy was consistent across both databases, the SILVA assignments were used in downstream analyses. Unwanted OTUs (e.g. Metazoa, Embryophyta) were removed. Chimeras were detected by using ChimeraSlayer (Haas et al.2011), as implemented in mothur v.1.26.0 (Schloss et al.2009), and subsequently discarded. The final curated data set comprised 334 858 reads (64% of reads), including 10 technical replicates to check for sequencing consistency. The 10 technical replicates were significantly more similar with respect to OTU composition than between sample comparisons (Fig. S3, Supporting Information), demonstrating little influence of biases introduced during PCR and sequencing.
Statistical analyses
One glacially influenced lake (temperature <10°C) and the two with low/high pH (<6 and >8) were omitted from the final set of lakes as potential outliers leaving 74 lakes with pH 6.3-8.0. OTUs with <10 reads or occurring in <2 samples were removed. A total of 281 571 sequences (54% of initial raw reads) that clustered into 1882 OTUs for the 74 lake samples were used in statistical analyses. The OTU table was rarefied to a common sampling depth of 1000 reads/sample to calculate OTU-based diversity measures. To minimise the effect of abundance measure inconsistencies, ordinations were conducted on presence/absence data as well as by-site normalised read abundances. Downstream statistical analyses were performed in R version 3.1.0 (R Development Core Team 2014) using vegan (Oksanen et al.2013) and MASS (Venables and Ripley 2002) for multivariate and species richness analyses unless otherwise noted. Species accumulation curves (SAC; calculated using the analytical version of the specaccum function) were used to assess sampling effort. Rarefaction curves were constructed using rarecurve. Alpha diversity indices (observed richness, Shannon diversity, Simpson diversity) were calculated using the function diversity.
Ordinations by detrended correspondence analysis (DCA) (Hill and Gauch 1980) and non-metric multidimensional scaling (NMDS) ordinations (Minchin 1987) were used to describe patterns in eukaryotic species composition. In addition, NMDS ordinations were conducted on a subset of a matrix representing 10 technical replicates to confirm that sequencing-induced variation was smaller than biological variation in the samples. Kendall's rank correlation coefficients τ and procrustes correlations (protest function in vegan) between pairs of DCA and NMDS axes with two and three dimensions were calculated. Permutation-based significance tests by the envfit function were used to fit spatial and environmental gradient variables to the NMDS ordination. Since OTU richness is a count variable, we used generalised linear models (GLMs) of the quasi-poisson family with permutation-based significance tests to fit rarefied richness (both total and for individual taxonomic groups) to the NMDS ordination axes. A standard Mantel test to investigate correlation between community composition and geographical distance between lakes was run using Bray-Curtis distances between communities and 999 permutations.
Linear models (LMs) related OTU richness (observed, rarefied) and diversity (Shannon, Simpson) to the major gradient variables (longitude, latitude, altitude, TOC, TP, conductivity; the latter three log transformed). Variance partitioning by redundancy analysis (RDA), using function varpart in R package vegan (Oksanen et al.2013) on Hellinger transformed, normalised abundance data was used to estimate the variance fractions of eukaryote community composition that could be explained by the local environment, spatial gradients or shared between them. The local environment was represented by concentrations of TOC, TP and conductivity (all log transformed), while the spatial gradients were represented by longitude, latitude and altitude. Univariate variance partitionings with the same predictor variables were done using LMs on richness and diversity indices, as well as NMDS site scores on two axes.
Spatial variance structures were investigated with the spdep package for R (Bivand and Piras 2015). Spatial analysis depends on defining a neighbourhood relationship between sites, which unfortunately can be done many ways (Bivand et al.2008). We used two conceptually different neighbourhood definitions (Gabriel and Relative neighbour) which nevertheless generally produced similar results. Spatial patterns in model residuals were assessed with Moran's I coefficient for spatial autocorrelation (Moran 1950), using the moran.test function from the spdep R package.
Relationships between functional traits and environment were investigated by so-called fourth corner analysis (Legendre, Galzin and Harmelin-Vivien 1997) where the site by OTU matrix is pre- and post-multiplied with matrices representing environment by site and OTU by trait. We used the same six environmental variables in Fig. S2, and autotrophic, phagotrophic or parasitic nutritional modes as functional traits based on Simon et al. (2015). Significance tests used the combination of row- and columnwise permutations of the site by OTU matrix (Dray and Legendre 2008), as implemented in the fourthcorner function of the ade4 package (Dray and Dufour 2007).
Nestedness is usually defined as a biogeographical pattern where species-poor communities form nested subsets of the richer ones. While the nestedness concept is old, quantitative methods for detecting such patterns are more recent and still developing (Ulrich, Almeida-Neto and Gotelli 2009). Nestedness indices are usually tested against null model distributions generated by randomisations of the site by OTU presence/absence matrix. Since we are basically interested in whether the distribution of species across sites are random or not given that some species are common and some rare, we used the RANDNEST algorithm of Jonsson (2001), which is a non-sequential randomisation that preserves OTU frequencies across sites but not row and column totals. Nestedness analysis was performed with oecosimu function in the vegan package, using the discrepancy index of Brualdi and Sanderson (1999) and the NODF index of Almeida‐Neto et al. (2008). We also used the same vegan function to compute the relative beta diversity contributions from nestedness and spatial turnover according to Baselga (2010). Presence/absence matrices were based on rarefied reads using the same random seed as for calculating rarefied richness and diversity measures.
RESULTS
Overall protistan community composition and taxonomic distribution
Non-protistan sequences amplified by the general eukaryotic primers (7.19% of OTUs), mainly crustaceans and rotifers, were excluded. A total of 1882 OTUs were recovered from the 281 571 high-quality denoised reads. An average of 426 (range: 145–771) OTUs were detected per sample and the mean number of total reads per lake was 3805 (range: 1107 - 6325). Protistan sequences were distributed across the supergroups Amoebozoa, Archaeplastida, Excavata, Opisthokonta, SAR (Stramenopiles, Alveolata and Rhizaria) as well as across several lineages of uncertain phylogenetic placement like Apusomonadida, Centroheliozoa, Cryptophyta, Haptophyta, Katablepharida and Telonemia (Fig. 2, Table S2, Supporting Information). Although the relative abundance of taxonomic groups varied between lakes, the majority of sequences in all samples belong to Alveolata, Cryptophyta and Stramenopiles.

Distribution of OTUs and reads (log scale) of all detected groups and lineages of unresolved phylogenetic placement. Each point represents a group/lineage, and point colour reflects taxonomic affiliation (supergroup): yellow = Archaeplastida, red = Alveolata, green = Opisthokonta, blue = Stramenopiles.
Alveolata was the most diverse supergroup (38.9% of reads, 40.28% of OTUs). Among alveolates, dinoflagellates represented the biggest fraction in terms of reads (24.02%) and OTUs (20.40%). Ciliates (11.85%), Perkinsea (5.37%) and other alveolates (2.66%) constituted a significantly smaller proportion of OTUs. Interestingly, Cryptophyta sequences were consistently abundant, representing up to 26.60% of the total reads, but not very diverse (7.12% of OTUs). Three OTUs affiliated to the Cryptomonadales family were present in all lakes. Finally, Stramenopiles was the third most abundant group in our study (18.27% of reads), of which Chrysophyceae (11.10% of reads) were present in all samples. Chrysophyceae comprised the second most diverse group (12.86% of OTUs) after dinoflagellates.
Opisthokonts were dominated by fungi (6.64% of OTUs), followed by choanoflagellates (1.65% of OTUs) and ichthyosporeans (0.53% of OTUs). Chlorophyta, the most abundant Archaeplastida group, were present in all lakes accounting for 3.26% of reads and 7.12% of OTUs. Charophyta, the second most abundant archaeplastidan group, represent 0.19% of total reads and 0.53% of OTUs. Haptophytes occurred at low proportions (0.20% of the total reads and 0.43% of OTUs) and were assigned to freshwater Pavlovophyceae and Prymnesiophyceae or undefined haptophyte sequences. Rhizaria (3.81% of the total reads) were exclusively composed of cercozoans (7.49% OTUs). Katablepharid reads comprised 1.28% of the reads and 0.58% OTUs. A small proportion of the OTUs were assigned to Centroheliozoa (3.35%), Excavata (0.32%), Amoebozoa (0.21%), Telonemia (0.21%) and Apusomonadida (0.05%). Finally, four OTUs (0.21%) could not be assigned with confidence to any of the above-mentioned groups.
Classical morphospecies taxa like Cryptophyta, Katablepharida and Synurophyceae were ‘supradiverse’, containing a larger than average fraction of the total reads and a lower than average fraction of the OTUs (Fig. 2, Table S2). In contrast, largely heterotrophic groups such as Fungi, Bicosoecida and Excavata, which are either hard to identify or neglected in classical phytoplankton microscopy, were ‘superdiverse’ maintaining higher than average OTU richness despite lower than average read abundance (Fig. 2, Table S2).
The 20 most frequent OTUs, representing 38.40% of reads (Fig. 3), were affiliated to Cryptophyta, Stramenopiles (Chrysophyceae, Synurophyceae, Raphidophyceae), Alveolata (Dinoflagellata), Centroheliozoa and Archaeplastida (Chlorophyta). The 1765 most infrequent OTUs (93.80% of all OTUs) represented only 34.30% of reads.

Rank-abundance curve for the total 1882 OTUs detected in 74 lakes. The relative abundance of top 20 protistan OTUs is shown in the insert. The identity number of the respective OTU is shown below the bars. Colours represent the supergroups: yellow = Archaeplastida, red = Alveolata, dark green = Centroheliozoa, blue = Stramenopiles, dark blue = Cryptophyta.
Richness across samples
The 74 lake ecosystems differed in richness and diversity (Table S3, Supporting Information). Rarefaction curves of OTU richness (Fig. 4) for each lake indicated that the total eukaryotic diversity was not recovered in any of the lakes. However, the overall SAC based on the progressive addition of samples show that the gamma diversity in the studied area has been fully recovered (Fig. 4, insert). Shannon diversity varied greatly across samples (range: 1.88 - 5.15) (Table S3) and OTU richness increased towards the east (Fig. 5A).

Rarefaction curves for 74 samples. Species accumulation curve (obtained by randomising 74 samples) for OTUs against sampling effort is presented in the insert. Dashed line represents the rarefaction subsampling level (1000 reads per sample).

NMDS ordination (stress = 0.195), based on Bray–Curtis dissimilarities between protistan communities. Each point represents a lake, with point size scaled by the rarefied OTU richness of the site. (A) Arrows represent fitted gradient vectors for spatial (longitude, latitude and altitude) and environmental (log-transformed TOC, TP and conductivity) variables. (B) Arrows represent corresponding gradient vectors for total and group-specific OTU richness, based on GLMs of the quasi-poisson family. Taxonomic group names are only shown for vectors with permutation-based, FDR-adjusted significance probabilities <0.05.
Environmental factors influencing eukaryotic community composition
NMDS ordination axes based on Bray–Curtis distances from site-normalised read abundance data were highly correlated with the corresponding DCA axes (P < 0.0001, Table S4, Supporting Information). Similar results by the two methods strongly suggest that a reliable gradient structure has been found. Protistan community composition was significantly related to spatial and environmental gradients (P = 0.001 on 999 permutations). Vectors representing longitude and TOC (the latter log transformed) pointed in the same direction (Fig. 5A), while being orthogonal to vectors reflecting local environment (TP and specific conductivity; both log transformed) and shorter spatial gradients (latitude and altitude). Community dissimilarity increased significantly with geographical distance (Mantel correlation = 0.37, P = 0.001 on 999 permutations). Thus, we infer that longitude is the strongest spatial factor influencing eukaryotic community composition in the studied lakes.
Vectors representing GLM fits between NMDS axis scores and rarefied richnesses of the most abundant taxonomic groups are shown in Fig. 5B, with the corresponding permutation-based significance probabilities in Table S5 (Supporting Information). Total OTU richness was associated with both NMDS axes which altogether explained 47% of the total deviance, while this fraction was generally lower for individual taxonomic groups. A total of 17 of the 25 groups were significantly related to NMDS scores at an overall false discovery rate (FDR) <5%. Of these, Chrysophyceae and Dinoflagellata had the closest associations with NMDS1, while Oomycota, Excavata and Charophyta had the closest associations with NMDS2. The other groups were to a larger extent associated with both axes, and many of them closely aligned with the longitude vector (especially Raphidophyceae, Synurophyceae, Chytridiomycota, Choanozoa and Cercozoa). OTU richnesses of Cryptophyta, Kathablepharida, Chlorophyta and Bacillariophyceae were not significantly related to any ordination axis.
Functional groups’ distribution
To investigate the distribution pattern of different eukaryotic groups, we classified the OTUs in four functional traits on the basis of their taxonomy and mode of nutrition: autotrophs, heterotrophs, parasites and unclassified (Adl et al.2012; Simon et al.2015). Some of these groups are considered obligate autotrophs (e.g. diatoms, chlorophytes, charophytes, dictyochophytes, raphidophytes and synurophytes), while others are putative mixotrophs (e.g. chrysophytes, cryptophytes and haptophytes) (Jones 2000; Adl et al.2012). Heterotrophic protists were represented by bicosoecids, centroheliozoans, choanozoans, ciliates and saprotrophic fungi. Putative parasites included members of Cercozoa, Chytridiomycota (Gleason et al.2008), Cryptomycota (Gleason et al.2012), Ichthyosporea (Glockling, Marshall and Gleason 2013) and Perkinsea (Bråte et al.2010; Mangot, Debroas and Domaizon 2011). The fourth corner analysis showed no significant associations between functional traits and any of the indicator variables for spatial position or local environment (all FDR-corrected p-values > 0.34 on 999 permutations). Visual inspection of distribution patterns of functional traits (Fig. S4, Supporting Information) supported the same conclusion. However, performing the fourth corner analysis with mixotrophy as an additional trait (i.e. identifying cryptophytes, chrysophytes and haptophytes as putative mixotrophs) indicated significant relationships between longitude and both autotrophy and mixotrophy (adjusted p-values on 999 permutations = 0.0315 and 0.0138, respectively). These results demonstrate that distinguishing between autotrophs and mixotrophs enhanced the resolution of our analyses.
Autotrophs and mixotrophs comprised 53.6% of the OTUs, while 19.4% of the heterotrophic OTUs belonged to putative parasitic groups. In terms of relative abundance, autotrophs dominated (71.21% total reads) with heterotrophs and parasites constituting 16.61% and 8.30% of all reads, respectively. In other words, heterotrophic and parasitic groups were represented by fewer reads per OTU than primary producers.
Mean observed richness of autotrophs (and mixotrophs), heterotrophs and parasites were 249 (range = 83–454; SD = 68), 85 (range = 20–157; SD = 29) and 61 (range = 14–112; SD = 22) OTUs per lake, respectively. The corresponding beta diversities ( = total number of OTUs / average number of OTUs per lake; often interpreted as the number of community turnovers along the main gradient) were 4.1, 4.2, and 6.0 for autotrophs (and mixotrophs), heterotrophs and parasites, respectively.
Variance partitioning and spatial autocorrelation
Variance partitioning by RDA on Hellinger transformed relative read abundances was used to assess the contributions of environmental and spatial gradients to the compositional variation. Both environmental factors and spatial gradients were significant in explaining parts of the protistan community composition (5.2% and 3.6% adjusted of total variation, respectively, not including the shared effect between local and spatial components). Approximately 90% of the community variance remained unexplained by environmental and spatial gradient indicators (Table S6, Supporting Information). To further investigate the contribution of environmental versus spatial predictors, we chose alpha diversity measures (rarefied OTU richness, Shannon, Simpson) and NMDS site scores (k = 2) as dependent variables using the same spatial and environmental factors as predictor variables. Rarefied OTU richness, Shannon diversity and Simpson diversity were closely correlated (Spearman's ρ = 0.86–0.96, Kendall's τ = 0.69–0.86), indicating they capture the same aspects of diversity. Local environment and spatial factors explained 8% and 33% of OTU richness variation independently and 13% in combination (Table S6). Spatial factors alone explained 42% and 40% of total variation in Shannon and Simpson diversity in our lakes, respectively (P < 0.001) (Table S6). NMDS1 axis has stronger effects of local environment (33% of variance explained, Fig. S5, Table S6, Supporting Information). In contrast, NMDS2 axis has variance partitioning pattern with slight dominance of both local and spatial predictors in comparison to spatial parameters only (15% and 13% of the variance explained, respectively) (Table S6). The Moran's I test (Moran 1950) showed no residual spatial autocorrelation in protistan community richness (rarefied OTU richness, Shannon and Simpson diversity) beyond what could be explained by the spatial predictors (longitude, latitude, altitude). Both parametric and resampling-based probabilities were non-significant.
Nestedness analysis
Both the tested nestedness indicators (BS-discrepancy and NODF) showed highly significant results (P = 0.01 on 99 randomisations), with standardised effect sizes of –25 and 78, respectively. Partitioning beta diversity according to Baselga (2010) indicated an overwhelming (>99%) contribution from spatial turnover. In other words, while nestedness was a statistically significant feature of our dataset, it accounted for <1% of the total beta diversity, irrespective of whether the analysis was based on Jaccard and Sørensen indices.
DISCUSSION
Microbial diversity in freshwater habitats is still poorly described and undersampled (Lefèvre et al.2008), although an increasing number of HTS surveys is improving the situation. Many freshwater HTS studies reported so far have been seasonal studies in single locations (Lepère et al.2013; Mangot et al.2013) or from a small number of lakes with special properties like high salinity or altitude (Wang et al.2014; Kammerlander et al.2015). Our study is the first to cover an extensive spatial gradient on a set of lowland lakes carefully selected to be as homogeneous as possible with respect to factors other than local productivity and spatial position. By choosing a synoptic sampling strategy, we deliberately emphasise the comparability between lakes at the expense within-lake representability of a single snapshot sample. Despite its limitations, our HTS study clearly confirms of the longitudinal biodiversity gradient in Southern Scandinavia that has earlier been inferred from microscopy (Hessen et al.2006; Ptacnik et al.2010).
Spatial factors (longitude, latitude, altitude) explained a higher fraction of the protistan community variation than environmental factors (TOC, TP and conductivity), although a substantial fraction could not be resolved between the two sources of variation. A larger fraction of total variance (40%) was explainable in LMs of OTU richness than in RDA models of community composition (9%). The level of explained RDA variance is comparable to some microscopy-based phytoplankton community studies (Beisner et al.2006), while others have reported higher fractions. For example, the metastudy of lentic diatom communities in Eurasia, Africa and Antarctica by Verleyen et al. (2009) found environmental factors to account for most of the community variation (21%) while spatial factors related to dispersal (5.5%) were less important.
Longitude had stronger influence on the eukaryotic community composition in Scandinavian lakes than other spatial variables, as observed by Stomp et al. (2011) in North American lakes. By constraining lake size and productivity in our sampling design, we think that our study has less chance of being confounded by large-scale patterns in lake shape and local productivity compared to Stomp et al. (2011). Overall, surveys conducted in lakes point to the significance of both spatial and environmental predictors as drivers of community structure, with the relative effect of these factors is likely dependent on environmental gradients, spatial extent and dispersal ability (Heino et al.2015).
Plankton community composition patterns
Microscopy-based phytoplankton surveys (Watson et al.1997; Ptacnik et al.2008a) have found that the relative biomass contributions of chrysophytes and cryptophytes decrease while diatoms, chlorophytes, and cyanobacteria increase with increasing eutrophication (for which TP is the most common proxy). The high read abundances of chrysophytes and cryptophytes in our study concurs with previously reported pattern. Chrysophyceae dominance has been frequently related to the extreme oligotrophic conditions, and their disappearance indicates eutrophication (Ptacnik et al.2008a). Supplementing nutrient uptake by phagotrophy of bacteria and small phytoplankton may be an important adaptation to nutrient-poor conditions in chrysophytes (Jones 2000).
Watson et al. (1997) report relative biomass contributions of dinoflagellates half that of chrysophytes and cryptophytes, which is somewhat in contrast with the dominance by alveolate and dinoflagellate OTUs in our study. This discrepancy may partly be due to underestimation by fixation losses and identification problems for especially non-thecate dinoflagellates by microscopy, and partly overestimation due to biases in molecular methods. Alveolates are known to have high rDNA copy numbers (Zhu et al.2005; Massana 2011; Gong et al.2013) which could lead to overestimation of cell abundance, but not necessarily of biomass. The well-documented relationship between cell size and genome size across all eukaryotes (Prokopowich, Gregory and Crease 2003) implies that OTUs with high relative read abundance due to high rDNA copy number would also be expected to have high contribution to total biomass, as has been recently shown for haptophytes (Egge et al.2013). While rRNA genes in other protists are generally thought to evolve in a strictly conserved manner such that all repeats will be identical (Dover 1982), this seems to not always the case for ciliates (Gong et al.2013). Hence, some of the diversity of alveolate rDNA could be attributed to intragenomic variability that is especially valid for larger cells with high number of rDNA copies.
Fungi (saprotrophs and parasites) were less diverse in our study (6.64% OTUs) than in other recent reports (Lefranc et al.2005; Lepère, Domaizon and Debroas 2008; Mangot et al.2013). Chytridiomycota are commonly found on a large variety of substrates in freshwater and decompose chitin, starch and cellulose in detrital organic materials (Gleason et al.2008), such that the observed positive association with TOC could reflect direct utilisation of dissolved organic matter. Conversely, chytrids are also commonly described as phytoplankton parasites that respond chemotactically to photosynthesis exudates, and frequently parasitise diatoms (Kagami, Amano and Ishii 2012; Kagami, Miki and Takimoto et al.2014), which is supported by their proximity in the NMDS ordination. The longitudinal alignment and eastward increase in Raphidophyceae richness is consistent with indications of ongoing dispersal by the nuisance raphidophyte Gonyostomum semen towards western Scandinavia (Hongve, Lovstad and Bjorndalen 1988; Lepistö, Antikainen and Kivinen 1994; Rengefors, Weyhenmeyer and Bloch 2012).
Functional diversity and nestedness
While there were no significant relationships between spatial and environmental gradients and lumped autotrophic, heterotrophic or parasitic traits (Fig. S4, Supporting Information), significant fourth corner relationships were found when the autotrophic trait was differentiated into pure photoautotrophy and mixotrophy. Since both traits were associated with longitude, we infer a longitudinal shift with increasing mixotrophy from west to east. In contrast, heterotrophic and parasitic lifestyles all appear to have essential roles in the organisation, energy transfer and element cycling of aquatic food webs (Gleason et al.2008; Rasconi, Jobard and Sime-Ngando 2011; Rasconi, Niquil and Sime‐Ngando 2012). Beta diversity (Tuomisto 2010) for primary producers and heterotrophs was lower than for parasites indicating a higher species packing and therefore narrower ecological niche widths for parasitic taxa.
The classical nestedness analysis methods have all been developed on pre-HTS data sets with typically at least an order of magnitude fewer species than the number of OTUs in this study. Fayle and Manica (2010) found that inflated type 1 error rates were a general problem for HTS-type data sets, and that all common permutation strategies were affected by this problem. In our data set, the contribution of nestedness to total beta diversity appeared to be very minor even though tests for nestedness per se were highly significant. We think that this illustrates a classical conflict between statistical and biological significance in large data sets (e.g. Yoccoz 1991), and conclude that spatial turnover is probably the most important source of beta diversity in Nordic lakes.
CONCLUDING REMARKS
While improved sampling strategies or increased sequencing depth most likely would have given a more complete picture of local (alpha) diversity, we are confident that the overall pattern in diversity and community composition is well captured by our study. Our analyses indicated that spatial position is more important than the local environment for the composition and diversity of protistan communities in Northern lakes. Altitude and latitude are, in contrast to longitude, established proxies for well-documented climate effects on biodiversity (Gaston 2000). We deliberately constrained the climatic variation in our study by making the longitudinal gradient three times longer than the latitudinal. With this design we find a strong longitudinal signal of the same magnitude as in earlier studies with non-molecular methods (Hessen et al.2006; Ptacnik et al.2010).
The mechanistic explanation for the longitudinal biodiversity gradient in Scandinavia has been elusive, mainly because it is aligned with geographical gradients in climate, soil depth, terrain ruggedness and landscape productivity. While climatic gradients have been designed to be as short as possible in our study, they can only be entirely eliminated at the expense of spatial extent. The study area has longitudinal oceanicity gradient with an eastward increase in the difference between winter and summer temperatures, while there is no longitudinal trend in mean annual temperature (Fig. S6, Supporting Information). We nevertheless feel that the longitudinal biodiversity gradient in Scandinavia cannot just be dismissed as a climatic gradient in disguise.
By being open, interactive habitat patches, lake ecosystems are often considered epitomes of the metacommunity concept, where the opposing forces of dispersal and environmental filtering are the main biodiversity-shaping forces (Leibold et al.2004). The relative importance of dispersal in lentic plankton communities is difficult to quantify directly, but is expected to depend on both the scale and heterogeneity of the landscape in which the lakes are embedded. Data are limited, but reviews by Shurin, Cottenie and Hillebrand (2009) and Jenkins (2014) indicate that dispersal limitation is relevant for lake biota, but perhaps less so for planktonic protists than for larger organisms. Ptacnik et al. (2010) argue that since dissimilarity between planktonic communities tend to increase with productivity, landscapes where productive lakes are common will tend to maintain larger regional species pools than poorer ones. Regional differences in landscape productivity are reflected by the regionally averaged TP (TPreg) of Ptacnik et al. (2010), which is closely related to longitude in our study area (Fig. S1). In this perspective, the longitudinal biodiversity gradient in Scandinavia can be seen as the contrast between species-rich lakes in the productive landscapes of the east with poorer landscapes where productive lakes are rarer in the west.
SUPPLEMENTARY DATA
Supplementary data are available at FEMSEC online.
Acknowledgments
We thank the COMSAT field sampling crew, especially Dag. O. Hessen, Johnny Håll, Marcia Kyle, Robert Ptacnik, and Jan-Erik Thrane, for their efforts. Robert Ptacnik is acknowledged for providing the TPreg data.
FUNDING
This study has been supported financially by the Department of Biosciences, University of Oslo and by the Research Council of Norway (contract Miljø2015/196336 ‘Biodiversity, community saturation and ecosystem function in lakes’ (COMSAT)).
Conflict of interest. None declared.
REFERENCES