A metabarcoding comparison of taxonomic richness and composition between the water column and the benthic boundary layer

Parry, Atkinson, A., P. J., and Lindeque, P. K. A metabarcoding comparison of taxonomic richness and composition between the water column and the benthic boundary layer. – ICES of Science, doi:10.1093/icesjms/fsaa228. Zooplankton monitoring in shelf seas predominantly uses nets that miss the benthic boundary layer (BBL) just above the seabed. However, this boundary between pelagic and benthic assemblages can be faunistically rich, having its own distinct hyperbenthic fauna and acting as a low-light refuge for overwintering or dielly migrating zooplankton. To compare species richness and composition between pelagic and BBL habitats, we sampled a long-term monitoring site in the Western English Channel seasonally. Metabarcoding methods applied to vertical net samples (top 50 m in a (cid:2) 54-m water column) and those from a hyperbenthic sledge generated > 100 000 sequences clustered into 294 opera-tional taxonomic units. Of these, 215 were found in the BBL and 170 in the water column. Some key taxa (e.g. mysids) were native to the BBL; by contrast, other delicate taxa (e.g. ctenophores) seemed to avoid the BBL. The major contrasts in plankton composition related to the seasonal cycle rather than to pelagic-BBL differences, suggesting that the basic dynamics of the site are captured by our ongoing long-term weekly resolution monitoring. Overall, metabarcoding approaches, applied to both water column and BBL, provide an independent view of plankton dynamics, and augment existing traditional methods.


Introduction
Boundaries within the marine environment, for example between water column and seabed, are often sites of high biodiversity and dynamic nutrient exchange (Mees and Jones, 1997;Dauvin and Vallet, 2006). Despite this, both benthic and pelagic ecologists often fail to sample the bottom few metres of the water column, and our ecological and biogeochemical understanding of this interface is weak (Dauvin and Vallet, 2006;Queiró s et al., 2019). The benthic boundary layer (BBL) is defined as that region up to 1.5 m above the seabed (Vallet and Dauvin, 2001) and our relative ignorance of this layer is a recognized problem and critical knowledge gap, for example affecting models that attempt to couple pelagic and benthic systems, such as the European Regional Seas Ecosystem Model (ERSEM, Butenschön et al., 2016). Likewise, most time-series, including those that are used to fulfil policy directives (Bedford et al., 2020), are based on sampling that does not extend to the full depth of the water column.
The main reason for omitting the BBL from consideration stems from the practical difficulty of sampling. The BBL eludes programmes targeting the pelagic (e.g. nets and water bottles used to sample the upper water column) and seabed (trawl, grab or corer samples, photography, or video). Both approaches miss the 1-1.5-m layer above the seabed, home to a distinct, dynamic assemblage including benthic diatoms, protozoans, mero-and V C International Council for the Exploration of the Sea 2020. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/ licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. holozooplankton, mysids, and euphausiids. These in turn are prey to demersal fish, which form the most commercially valuable component of fisheries on continental shelves throughout the world (Brander, 2001). The BBL can also form an important zone of concentration for truly pelagic zooplankton; for example euphausiids may congregate there to feed (Schmidt et al., 2011), diapausing copepods may migrate downwards and concentrate near the seabed (Hirche et al., 2006) or the lower light intensity near the seabed may attract dielly migrating zooplankton in the day time. While less is known regarding the diversity and species composition of the BBL than is known about pelagic and benthic assemblages, it harbours organisms that have a crucial role in the food web and are rarely seen either in the pelagic zone or at the seabed (Mees and Jones, 1997).
Molecular techniques have been shown to be powerful tools for elucidating the diversity and species richness of zooplankton communities (Lindeque et al., 2013). One option to assess the diversity and species composition of the BBL is with metabarcoding, which is the large-scale taxonomic identification of complex samples via analysis of one or a few orthologous DNA regions, called barcodes (Bucklin et al., 2016). High-throughput sequencing (HTS) platforms became widely available over a decade ago (Shendure and Ji, 2008). HTS is an overall term used to describe a number of new technologies that enable the sequencing of millions of DNA fragments in parallel, which is thus more efficient and affordable than traditional Sanger sequencing. This technology has transformed aspects of biological science (Galan et al., 2018), in particular the molecular identification of whole mixed assemblages, and has driven a major acceleration in research and development within this area (Kumar and Kocour, 2017). The ability to extract and sequence DNA from whole community samples has enabled HTS technology to be applied to a variety of ecosystems and organisms from Bacteria and Fungi (Gilbert et al., 2014;Staley et al., 2017) to Eukaryota (Carew et al., 2017;Su et al., 2018). The use of HTS to examine the diversity of zooplankton assemblages has revealed hidden taxonomic richness especially for Copepoda, hard-to-identify meroplankton such as Bivalvia, Gastropoda, and Polychaeta as well as parasites and rare species (Lindeque et al., 2013). One of the most commonly used genes for barcoding with HTS is the small subunit 18S ribosomal RNA (rRNA) gene. 18S rRNA gene barcodes have been widely utilized in a range of published studies with the scientific community converging towards the use of the V1-V2 (39%) and V9 (34%) hypervariable regions (Leray and Knowlton, 2016).
Our study site, named L4 (www.westernchannelobservatory. org.uk), is a pelagic and benthic time-series and marine biodiversity reference site located in the Western English Channel. L4 is a transitionally stratified site, 13 km SSW of Plymouth in water $54 m deep with a seabed of muddy sand [MBA (Marine Biological Association), 1957]. It now has a >30-year time-series of weekly resolution plankton observations as well as a suite of state and rate measurements in both water column and benthos. Zooplankton has been collected at station L4 at weekly resolution since 1988, using vertical hauls from 50 m depth up to the surface with a WP2 net (mesh 200 mm; standardization of zooplankton sampling methods, UNESCO, 1968). As is standard practice with pelagic sampling, the bottom few metres of the water are not sampled to avoid damage to the net.
The central aim of this study is to use metabarcoding to examine whether there are fundamental differences in the hyperbenthic and pelagic assemblages at L4 that are reflected in species richness or taxonomic makeup of molecular material. This allows us to gauge whether plankton seasonality and longer-term trends recorded by sampling in the 0-50-m layer are likely to be representative of those in the whole water column or whether they miss an important fraction of the diversity. Comparisons of plankton diversity between the pelagic and the BBL based on traditional microscope analysis are problematic, since identifications may reflect specific areas of expertise of the analyst, and few have skills in both benthic and pelagic taxonomy . To address this issue, we instead used HTS of 18S amplicons as a more consistent approach, comparing the catches of a hyperbenthic sledge to that of traditional nets.

Sample collection
Sampling was carried out on four dates over an annual cycle from October 2012 until July 2013 at the Western Channel Observatory long time-series station L4 (50 15.00 0 N, 4 13.02 0 W) ( Table 1). In recent years, we have been supplementing the longterm monitoring (using 0-50 m hauls with a pair of 200-mm mesh WP2 nets), with a finer mesh net haul to provide a better representation of the abundant and smaller metazoans (Cornwell et al., 2018). These additional net hauls are made with a 57-cm diameter, 0.25-m 2 mouth area, and 63-mm net lowered to 50 m and retrieved slowly at 0.2-0.3 m s À1 and these hauls form the pelagic comparison with the BBL samples.
To sample the BBL, a parallel series of hauls were made with a hyperbenthic sledge, for comparison with the traditional plankton net hauls. This sledge was originally purpose built for sampling soft seabed around the SW UK but was modified for this study by outfitting it with a 63-mm mesh 14-cm high by 20-cm wide (0.028 m 2 mouth area). This net was mounted 15 cm above the bottom runners of the sledge. The sledge was fitted with a pair of hinged metal flaps at the mouth of the net. While towing through the water column the flaps were sprung closed to prevent water and plankton entering the net mouth. The towing bridle was mounted on the top of the sledge, which thus acted as a depressor. On contact with the seabed a lever protruding from the bottom of the sledge was tilted upwards by the weight of the sledge on the seabed, opening the net flaps and allowing plankton into the sampler only when the sledge was touching the seafloor. Trial "blank" hauls were completed in the water column well away from the seafloor and these confirmed the lack of plankton in the cod end when the net baffles were closed. The volume filtered from both the sledge and the vertical nets was estimated by multiplying the mouth area by the distance of the tow. Distance travelled for the hauls over the seabed was recorded from the ship's GPS (Table 1). All samples from both the vertical nets and the sledge were preserved in 95% ethanol and stored at 4 C.

Molecular identification DNA extraction
Samples were prepared for genomic DNA extraction by centrifugation at 4000g for 4 min to pellet the zooplankton. Excess ethanol was removed, and the samples were resuspended in 40 ml of MilliQ water for overnight rehydration at 4 C. The samples were then centrifuged at 4000g for 4 min to pellet the zooplankton to allow removal of excess water before resuspension in 15 ml of homogenizing solution (400 mM Tris-HCl, pH 8, 60 mM EDTA, 105 mM NaCl, 1% SDS, 0.28 mg/ml RNase). Each sample was physically homogenized, using a 10-ml syringe and 19-G needle, and incubated at 37 C for 30 min. After this, 250 mg/ml proteinase K was added and the samples incubated for a further 30 min at 37 C. 4.28 ml of sodium perchlorate (5 M NaClO 4 ) was added and the samples were then shaken again at room temperature for 20 min. They were then physically homogenized as before and incubated at 65 C for a further 20 min. The homogenate was extracted once with phenol/chloroform, pH 8.0, and once with chloroform and precipitated with 2.5 volumes of 100% ethanol at À80 C for 1 h. The samples were washed with 70% ethanol, airdried overnight, then resuspended in 1.5 ml of DNA Grade water and left at 55 C for 30 min and then at room temperature for 3.5 h. The DNA extractions were analysed to assess the quality and quantity of DNA present using a NanoDrop 1000 Spectrophotometer (ThermoScientific, Delaware, USA).

sequencing
Primers (SSU_F04 and SSU_R22), designed by Fonseca et al. (2010), were chosen for amplicon generation. These primers target a homologous region of the 18S nuclear small subunit rRNAgene and flank a region that is highly divergent (V1-V2).
PCR amplification was performed in triplicate using 1 ml of genomic DNA template (1:10 dilution) in 25 ml reactions containing 5 ml of 5Â buffer, 2.5 ml 2 mM dNTPs, 2 ml 25 mM MgCl 2 , 2.5 ml of primers, and 0.25 ml of GoTaq Flexi (Promega). The PCR conditions involved a 2-min denaturation at 95 C followed by 30 cycles of 1 min at 95 C, 45 s at 55 C, 1 min at 72 C, and a final extension of 10 min at 72 C. No template controls were included. Electrophoresis of pooled triplicate PCR products and negative controls was undertaken on a 1% agarose gel and the 450-bp amplicons were extracted using the QIAquick Gel Extraction Kit (Qiagen). The triplicate amplicons were pooled and sent to MR DNA for sequencing (www.mrdnalab.com, Shallowater, TX, USA) on a Roche 454 FLX platform titanium instruments and reagents, following the manufacturer's guidelines.

Sequence data processing
The 454 sequencing reads were processed using the Qiime (Quantitative Insights into Microbial Ecology, v1.3.0) pipeline (Caporaso et al., 2010) as flowgrams (.sff files). The data were processed using default settings for all parameters, namely an operational taxonomic unit (OTU) threshold of 0.97, 0 primer mismatches, 0 ambiguous bases, a maximum length of homopolymer run of 6, and 200 nucleotides as a minimum sequence length (http://www.qiime.org/tutorials/processing_18S_data.html). The samples were not multiplexed so the barcode area of the mapping file was left blank and the split libraries script was altered accordingly. The data were then de-noised using the de-noiser wrapper within Qiime to remove the sequence errors characteristic of 454 sequencing machines (for review see Gaspar and Thomas, 2013). Chimaeras were identified using ChimeraSlayer and rejected from the dataset before construction of the OTU table. The OTUs were assigned using UCLUST (Edgar, 2010), a de novo OTU picker within Qiime. A representative set of sequences was then generated and these sequences were assigned taxonomy (at the level of 95% homology) using the BLASTN search of the NCBI nonredundant dataset.

Statistical analysis of sequence data
Analyses of the OTUs vs. samples matrix were conducted using Primer v7 (Clarke and Gorley 2015) and Microsoft Excel. Species accumulation and rarefaction values were calculated in Primer and plotted in Excel. Bar plots were generated directly in Primer following a standardization pre-treatment to convert values to percentages. Nonmetric multidimensional scaling (nMDS) ordinations were constructed using Bray-Curtis similarities among samples calculated using transformed (presence/absence) data. In one instance (presence/absence of all identified OTUs), the resulting nMDS ordination was unsatisfactory so the "fix collapse" option was used, mixing in a small proportion (5%) of metric MDS to linearize the Shepard diagram (see Clarke et al., 2014 for details).

Species richness and taxonomic diversity
Adding all four sampling time-points and BBL and water column samples together, 100 088 sequences were returned in total; these sequences clustered into 294 OTUs at the 97% homology (Supplementary Table S1). Of these, 186 OTUs were successfully assigned taxonomy across 22 phyla ( Table 2). The majority of the sequences returned for each sample were taxonomically identified to the genus level, with only 3% of the total sequences (clustering into 107 OTUs) that could not reliably be assigned to a taxonomic group-these are referred to as "unknowns". The 107 unknown OTUs collectively contained a total of 2745 sequences with only 10 of these OTUs containing >9 sequences.
The large majority of OTUs were from the kingdom Animalia (Figure 1a and Supplementary Table S2) with small numbers of Chromista. A single fungal species, Engyodontium album, was detected (two OTUs). The proportion of unknown sequences that could not be assigned taxonomy was highest (12.9%) in the vertical haul from July and the next highest (6.4%) in the vertical haul from October. The proportion of unknowns in BBL samples was always less than in the corresponding vertical hauls.
Nonmetric MDS ordinations (Figure 2) based on the presence/ absence of all OTUs assigned taxonomy, Animalia and Chromista (Figure 2a-c), show a strong tendency for samples collected by different methods at the same time to group together, with the exception of Chromista in April (Figure 2c).
How comprehensively have we sampled OTUs in the BBL and in the water column?
OTU accumulation curves for sledge samples and vertical hauls (Figure 3) show that the BBL sledge consistently captures more OTUs, possibly because of a greater sampling volume (Table 1) and greater number of sequences analysed, as well as greater species richness in October and January. Importantly, however, neither curve suggests that the sampling method has comprehensively sampled the potential pool of OTUs as both continue to rise. Rarefaction curves (Figure 4) relate the number of sequences analysed to the number of OTUs this reveals, providing a comparative index of OTU richness (which we will refer to as diversity in what follows) between sampling methods and time of year. Figure 4 shows that diversity was highest in the BBL in October and lowest in the BBL in April. With the exception of the sledge sample from April, there is a close match in diversity between sledge samples and vertical hauls sampled at different times, when a relatively small number of sequences are analysed. Difference in diversity becomes more obvious as more sequences are analysed. For the October and January samples, the sledge clearly collects more OTUs than the equivalent vertical hauls. In July, both methods collect comparable diversity, while in April, the pattern is reversed with the vertical hauls collecting more diversity for an equivalent sequencing effort.

Discussion
Our appreciation of shelf sea plankton ecology is shaped by how we sample. Long time-series of zooplankton are typically based on net hauls taken well above the seabed, with catches identified under a microscope. This is the case for the multidecadal timeseries of weekly 0-50 m, 200-mm hauls at the L4 site, which provide insights into seasonality and longer-term change Highfield et al., 2010;Atkinson et al., 2015). Here, we discuss some other perspectives on the zooplankton, sampling BBL assemblages with fine mesh nets and using molecular methods to identify organisms. Based on these metabarcoding methods, we examine differences in the taxonomic richness and composition between pelagic and BBL habitats, concluding with remarks on the future outlook, both for epibenthic sampling and for metabarcoding analysis approaches.
How do the taxonomic richness and composition of the BBL differ from the pelagic?
The fauna collected in the BBL contain true epibenthic species residing in the near-bottom environment, species derived from downward extensions of pelagic planktonic populations, which   Unknown  36  16  33  11  10  10  13  11  70  39  Totals  128  92  113  74  35  44  49  54  215  170 The totals (final two columns) represent the number of unique OTUs for the taxon, from either sledge or vertical net sampling across all four dates.
are often seasonal in nature, as well as infaunal species emerging into the water column, often on diel cycles (Dauvin and Vallet, 2006). Our prior expectation, therefore, was that the BBL would be taxonomically richer than the overlying water column because it might contain representatives of both seabed and water column assemblages (Zouhiri and Dauvin, 1996;Mees and Jones, 1997;Dauvin and Vallet, 2006). More OTUs were indeed found in the sledge samples, but this could be because these were based on larger-scale spatial integration ($0.5-1 km horizontally compared to $50 m for the vertical net), and larger volumes of water were sampled. When adjusted for the number of sequences analysed; however, the richness of the pelagic and BBL assemblages was

Comparison of taxonomic richness and composition
broadly similar (Figure 4). While the BBL does form the intersection of pelagic and benthic assemblages, there was a scarcity or absence of some taxa found in the water column, which is only partially compensated by taxa that are confined to the BBL.
While overall richness was similar between pelagic and BBL habitats, our metabarcoding methods revealed some characteristic and important taxa that live near the seabed. Mysids are an important example of these ( Figure 1C and Table 2). These results echo those from Dauvin and Vallet (2006) who made intensive studies on the BBL in the English Channel between 1988 and 1996, taking 460 samples using a Macer-Giroq sledge that sampled the fauna at four levels between 0.1 and 1.45 m above the bottom. Their study revealed the important contribution of mysids to the BBL assemblage in the Western English Channel, with a pronounced underlying seasonality (Zouhiri and Dauvin, 1996;Dauvin et al., 2000Dauvin et al., , 2011Dauvin, 2001, 2004). Our standard 0-50 m sampling of the water column at L4,     performed during daylight hours, clearly under-represents these mysids, which can move up into the water column at night from the BBL (Mauchline, 1980). Unlike the characteristically hyperbenthic mysids, the Cnidaria included component taxa inhabiting either the sea bed or the water column. These were dominated by two genera, Agalma and Bellonella. The former is a colonial siphonophore and, as might be expected, most of its sequences were derived from vertical hauls, whereas the majority of those from Bellonella spp. a soft coral were from the sledge samples. The consistency with which metabarcoding approaches can be applied to different types of samples is well suited to revealing such differences, particularly for rare or hard-to-identify taxa (Lindeque et al., 2013). However, it must be noted that Bellonella is an octocoral that does not occur in the NE Atlantic. Other species in the same family, Alcyoniidae, in particular Alcyonium digitatum, are commonly recorded on the seabed off Plymouth. The database we used for assigning taxonomy to our samples, BLAST, does contain three 18S fragments assigned to A. digitatum; however, these do not cover the same regions of the 18S gene that we sequenced.
Notwithstanding the differences in the faunal composition between the pelagic and BBL, our multivariate analysis (Figure 3) showed that these habitat contrasts were outweighed by seasonal differences in composition. This is an important result for the design of plankton monitoring programmes because it points to the need for highly resolved seasonal sampling to appreciate the extent of plankton diversity in inshore waters. In this study we used fine mesh nets (63 mm) compared to the more typical 200 mm for mesozooplankton because the latter has been shown at L4 to severely under-sample small and delicate metazoans (Cross et al., 2015). Despite using these fine mesh nets at four times of year alongside sledge sampling, our rarefaction and accumulation curves (Figures 3 and 4) still did not reach plateaux, indicating that the true richness of the assemblage at the L4 site was still under-represented.
Why are some species concentrated or rare in the benthic boundary layer?
Unlike traditional taxonomic analyses, our metabarcoding results cannot be used to estimate the concentration of each taxon in the pelagic and BBL habitats. This is due both to analytical limitations and to the variable gene copy number between taxa, which hinders the conversion of numbers of sequences of a taxon to its biomass (Bik et al., 2012). Nevertheless, the taxa varied enormously in their relative number of sequences returned from the two habitats, allowing us a picture of their relative ranking of concentration in one habitat or the other. At one extreme, ctenophore (Pleurobrachia spp.) sequences were found in the water column but were completely absent in the BBL, possibly due to the delicate tentacles of this group requiring avoidance of the seabed. Likewise, larger and more mobile taxa such as Euphausiids and amphipods were present in the water column but absent, or nearly absent from the BBL.
By contrast, at the other end of the ranking, important taxa were strongly concentrated in the BBL. In addition to the abovementioned mysids, the winter sampling showed high sequence concentrations of both fish and chaetognaths in the BBL, relative to the pelagic. The latter group in particular are major predators of copepods at L4 (Bonnet et al., 2005;Maud et al., 2018). Unlike the Macer-Giroq sledge used by Dauvin and Vallet (2006) and Dauvin et al., (2011), the mouth aperture and mesh size of our sledge was small and this would lead to a high degree of escape by larger, more mobile organisms and potentially an underrepresentation of their abundance and diversity. For this reason, we suggest that the BBL hosts an under-represented component of larger carnivorous or omnivorous taxa such as mysids, fish, and chaetognaths, which could migrate upwards and exert strong predation pressure on the zooplankton in the overlying water column.
Copepods form important trophic links (Beaugrand et al., 2003) and at L4 comprised a major fraction of mesozooplankton biomass . This group lay within the middle of our overall ranking of taxa according to relative concentration in the pelagic or the BBL, suggesting no substantial concentrations in either habitat. Previous studies have found both elevated and depressed copepod concentrations close to the seabed, depending on the setting (Wishner, 1980;Vallet and Dauvin, 2004;Hirche et al., 2006;Cornwell et al., 2020). For instance, ontogenetically migrating Calanus hyperboreus can intercept the seabed on their downwards seasonal overwintering migration and aggregate there in high concentrations (Hirche et al., 2006). By contrast a recent seasonal study of the dominant small cyclopoid, Oithona similis at L4 using bottle profiles at 0, 10, 25, and 50 m showed highest densities typically in the mid depth layer, with the relative avoidance of near-surface and near seabed layers (Cornwell et al., 2020). Overall, given the narrowness of the 50-54 m stratum missed by the nets compared to the top 50 m and the lack of strong copepod concentrations in the BBL, we speculate that the standard net monitoring provides a reasonably faithful representation of their seasonal dynamics.

Outlook
Time-series such as L4 provide valuable testbeds for models of plankton seasonality  as well as improved understanding of what is driving inter-annual and multidecadal trends Schmidt et al., 2020). Time-series worldwide are coming under increasing pressure, however, from funding cuts so any effort to streamline them or to use new technologies needs to take into consideration how good each timeseries is already. For example, it is rarely tested how representative the plankton sampling is of the whole water column. While metabarcoding showed that the BBL and pelagic assemblages at L4 differed, the seasonal changes outweighed the differences between the strata. This is important since reducing the frequency of sampling provides one way of saving resources. Instead our results help to underscore the need to maintain high-resolution weekly sampling; not only does this capture the seasonal dynamics of the plankton properly, but it also reveals their true taxonomic composition and richness.
To analyse these time-series, HTS is becoming increasingly attractive. This provides an alternative and complementary way of enumerating plankton, with strengths and weaknesses very different from those of classical microscopy. One key strength is the ability to reveal "hidden" diversity in taxa that are rare or hard to identify such as meroplankton or parasites (Lindeque et al., 2013;Goodwin et al., 2017). However, this technology should be seen as an augmentation of existing long time-series methods and not as a replacement. DNA databases need to keep track with the shrinking costs and greater use of metabarcoding, as the DNA sequences generated are only of value if the databases contain correctly identified specimens, which takes us back to the need for specialized experts.
The use of HTS approaches for plankton is expanding rapidly. In addition to augmenting long time-series, their roles can range from detecting marine invasive species to identifying areas of high species richness, for example to inform Marine Protected Area design and monitoring (Blasiak et al., 2020). In future, it is envisaged that HTS approaches will be used to sample DNA in the environment to quantify the abundance of species, after which a cloud-based machine will automatically reconstruct the ecological networks implicit in the data (Bohan et al., 2017). However, if methods that rely on databases are to become more reliable and useful in, for example monitoring, then we need a much better effort to sequence the organisms that actually occur where the monitoring is done. Some OTU taxonomic assignments in this study do not represent local species but related species from elsewhere in the world; a search of the database revealed that the local species have not been sequenced and uploaded. Improving reference libraries is a substantial undertaking but is an urgent priority for both taxonomists and molecular ecologists to maximize the payback from HTS.

Supplementary data
Supplementary material is available at the ICESJMS online version of the manuscript.

Data availability statement
The data underlying this article will be shared on reasonable request to the corresponding author.