Three novel D-xylose-fermenting yeast species of the Spathaspora clade were recovered from rotting wood in regions of the Atlantic Rainforest ecosystem in Brazil. Differentiation of the new species was based on analyses of the D1/D2 sequences of the gene encoding the large subunit of rRNA and on 642 conserved, single-copy, orthologous genes from genome sequence assemblies of the newly described species and 15 closely related Debaryomycetaceae/Metschnikowiaceae species. Spathaspora girioi sp. nov. produced unconjugated asci with a single elongated ascospore with curved ends; ascospore formation was not observed for the other two species. The three novel species ferment D-xylose with different efficiencies. Spathaspora hagerdaliae sp. nov. and Sp. girioi sp. nov. showed xylose reductase (XR) activity strictly dependent on NADPH, whereas Sp. gorwiae sp. nov. had XR activity that used both NADH and NADPH as co-factors. The genes that encode enzymes involved in D-xylose metabolism (XR, xylitol dehydrogenase and xylulokinase) were also identified for these novel species. The type strains are Sp. girioi sp. nov. UFMG-CM-Y302T (=CBS 13476), Sp. hagerdaliae f.a., sp. nov. UFMG-CM-Y303T (=CBS 13475) and Sp. gorwiae f.a., sp. nov. UFMG-CM-Y312T (=CBS 13472).

INTRODUCTION

Lignocellulosic or second-generation (2G) bioethanol has environmental and social advantages in comparison to first-generation (1G) bioethanol from starch or sugar (Sudiyani, Sembiring and Adilina 2014). The pretreatment of lignocellulosic-based feedstock for ethanol production usually solubilizes the hemicellulose fraction and often produces D-xylose as the main product, a pentose that is not naturally fermented by Saccharomyces cerevisiae strains (Gírio et al. 2010; Nogué and Karhumaa 2014). Despite the existence of D-xylose-fermenting yeast species, such as Scheffersomyces stipitis, Sc. shehatae and Pachysolen tannophilus, as well as recombinant strains of S. cerevisiae (Cadete, Fonseca and Rosa 2014; Harner et al. 2015; Konishi et al. 2015), the specific D-xylose uptake rates and ethanol formation rates displayed by these microorganisms have not reached industrially viable levels (Latimer et al. 2014; Novy et al. 2014; Parreiras et al. 2014; Harner et al. 2015; Konishi et al. 2015). In recent years, efforts have been made to search for novel D-xylose-fermenting yeasts, leading to the description of new species capable of converting D-xylose into ethanol. Some of the most promising yeasts with these traits belong to the Spathaspora clade, including Sp. passalidarum and Sp. arborariae, which has encouraged the development of studies on yeasts of this genus (Nguyen et al. 2006; Cadete et al. 2009, 2012, 2013; Cunha-Pereira et al. 2011; Hou and Yao 2012; Hickert et al. 2013; Martiniano et al. 2013; Lobo et al. 2014; Su, Willis and Jeffries 2015). In fact, Sp. passalidarum is currently considered the best ethanol producer among D-xylose-fermenting species (Hou 2012; Long et al. 2012; Cadete, Fonseca and Rosa 2014). Besides Sp. passalidarum and Sp. arborariae, other species from this genus, such as Sp. brasiliensis, Sp. roraimanensis, Sp. suhii and Sp. xylofermentans, are known for their ability to ferment D-xylose, producing ethanol or xylitol in different amounts (Cadete et al. 2012, 2013).

Nguyen et al. (2006) described the genus Spathaspora from a single isolate of the species Sp. passalidarum, which was associated with wood-boring beetles collected in Louisiana, USA. This species produces asci containing elongated ascospores with curved ends, a unique trait of this genus. Many additional Spathaspora species have since been described, primarily from Brazilian ecosystems (Barbosa et al. 2009; Cadete et al. 2009, 2012, 2013). In addition, Wang et al. (2016) described the new species Sp. allomyrinae, a D-xylose-fermenting yeast species isolated from the gut of the host beetle Allomyrina dichotoma (Coleoptera: Scarabaeidae) collected in China. Based on sequence analysis of the gene encoding the D1/D2 domains of the large subunit of rRNA, Daniel, Lachance and Kurtzman (2014) suggested that this clade consists of a core group that includes the species Candida jeffriesii, C. materiae, Sp. arborariae, Sp. brasiliensis, Sp. passalidarum and Sp. suhii. The species Sp. roraimanensis shares the unique ascospore morphology of the genus and is closely related to Sp. xylofermentans, in which ascospore production has not been observed. These two yeasts show various phylogenetic affinities with several Candida species, such as C. sake, C. alai or C. insectamans. Daniel, Lachance and Kurtzman (2014) suggested that these yeasts could represent independent genera based on the long branch lengths observed in analyses of the LSU rRNA D1/D2 sequences, but multilocus analyses are required to generate a more reliable phylogeny. The reliability of phylogenies can be enhanced by denser taxon sampling, so the discovery of additional clade members could also help to elucidate the correct phylogeny of these yeasts of biotechnological interest.

During a study of yeasts associated with rotting wood in Brazilian ecosystems, three D-xylose-fermenting species were isolated (Morais et al. 2013). Analyses of ITS and D1/D2 sequences showed that these yeasts represent new species belonging to the Spathaspora clade. One species produced ascospores with curved ends and is named here Sp. girioi sp. nov. Sporulation was not observed for the other two species, which are described here as Sp. hagerdaliae f.a., sp. nov. and Sp. gorwiae f.a., sp. nov.

MATERIALS AND METHODS

Yeast isolation and identification

Strains UFMG-CM-Y302, UFMG-CM-Y303 and UFMG-CM-Y312 were obtained from rotting wood samples collected in the Private Natural Heritage Reserve of Serra Bonita. This is an ecological reserve with 2000 ha of Atlantic Rainforest located in Serra Bonita (15°23′S 39°33′W), Bahia state, northeastern Brazil (Amorim et al. 2011). Strain UFMG-CM-Y424 was isolated from a rotting wood sample collected in the Private Natural Heritage Reserve of the Sanctuary of Caraça. The area is an ecological reserve of 11 233 ha located in Serra do Espinhaço (20°05′S 43°28′W), Minas Gerais state, southeastern Brazil (Barbosa et al. 2009). Strains UFMG-CM-Y302 and UFMG-CM-Y303 were obtained from samples cultured on YNB-xylan medium (yeast nitrogen base 0.67%, xylan 1%, chloramphenicol 0.02%). Strain UFMG-CM-Y312 was obtained from a sample cultured on YNB-xylose medium (yeast nitrogen base 0.67%, D-xylose 0.5%, chloramphenicol 0.02%), whereas strain UFMG-CM-Y424 was obtained from a sample cultured on YNB-cellobiose medium (yeast nitrogen base 0.67%, cellobiose 0.5%, chloramphenicol 0.02%, ethanol 2%). The samples were incubated at 25°C and 150 rpm until growth was detected (3–10 days), as previously described (Morais et al. 2013). All isolates were purified by repeated streaking on YMA plates (D-glucose 1%, peptone 0.5%, yeast extract 0.3%, malt extract 0.3%, agar 2%) and preserved at –80°C for later identification. The yeasts were characterized using standard methods (Kurtzman et al. 2011). Species identifications were performed by analysis of the ITS-5.8S region and the D1/D2 variable domains of the gene encoding the large subunit of rRNA (White et al. 1990; O'Donnell 1993; Kurtzman and Robnett 1998; Lachance et al. 1999). The amplified DNA was concentrated, cleaned and sequenced in an ABI 3130 Genetic Analyzer automated sequencing system (Life Technologies, CA, USA) using BigDye v3.1 and POP7 polymer.
Strains UFMG-CM-Y303 and UFMG-CM-Y424 were sequenced using an ABI 3730 automated DNA analyzer (Applied Biosystems, USA) at the Robarts Research Institute, London, Ontario, Canada. The sequences were assembled, edited and aligned with the program MEGA6 (Tamura et al. 2013). The GenBank/EMBL/DDBJ accession numbers for the partial sequences of the ITS region and the gene encoding the LSU rRNA (including D1/D2 domains) of strains UFMG-CM-Y302, UFMG-CM-Y303, UFMG-CM-Y312 and UFMG-CM-Y424 are KC959937 and KC959473; KU556168; KC959938 and KC959474; KR184129, respectively. A neighbor-joining analysis was performed on the sequences of the D1/D2 domains of the gene encoding the large subunit of rRNA.

Genome assembly, annotation and phylogenetic analysis

Genomic DNA (gDNA) was isolated from the type strains of the three new Spathaspora species. For each strain, gDNA was sonicated and ligated to Illumina sequencing adapters as previously described (Hittinger et al. 2010). These paired-end libraries were sequenced on an Illumina HiSeq 2000 or 2500. All sequencing data of the three novel species are available from the NCBI Sequence Read Archive: Bioproject accession PRJNA306688. These Whole Genome Shotgun projects have been deposited at DDBJ/EMBL/GenBank under accessions LQMZ00000000, LQMS00000000 and LQHL00000000. The versions described in this paper are versions LQMZ01000000, LQMS01000000 and LQHL01000000.

Paired-end Illumina reads were used as input to the meta-assembler pipeline iWGS (Zou et al. 2016). Briefly, the pipeline first included a quality control step using the tools Trimmomatic v0.33 (Bolger, Lohse and Usadel 2014) and Lighter (Song, Florea and Langmead 2014), followed by a k-mer length optimization step using KmerGenie v1.6982 (Chikhi and Medvedev 2014). Next, the assembly step included running the following de novo assemblers: ABYSS v1.5.2 (Simpson et al. 2009), DISCOVAR r51885 (Weisenfeld et al. 2014), MASURCA v3 (Zimin et al. 2013), SGA v0.10.13 (Simpson and Durbin 2012), SOAPdenovo2 v2.04 (Luo et al. 2012) and SPADES v3.5.0 (Bankevich et al. 2012). Finally, the quality of obtained assemblies was assessed using QUAST v3.1 (Gurevich et al. 2013), and the best assembly for each newly described species was chosen based on the N50 statistics. Genetic content of the analyzed assemblies was assessed with the MAKER pipeline (Holt and Yandell 2011) using the GeneMark-ES (Ter-Hovhannisyan et al. 2008), Augustus (Stanke et al. 2006) and SNAP (Korf 2004) predictors. Functional annotation of the predicted genes based on the KEGG database was conducted using the KAAS web server (Moriya et al. 2007).
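The N50-based choice among candidate assemblies can be illustrated with a minimal sketch. It is not the code used by iWGS or QUAST; the function names and the assembler labels passed to `pick_best_assembly` are hypothetical, and contig lengths are assumed to be available as plain integers.

```python
def n50(contig_lengths):
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembly size."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


def pick_best_assembly(assemblies):
    """Pick the assembly label with the highest N50.

    assemblies: dict mapping a (hypothetical) assembler label to the
    list of contig lengths it produced.
    """
    return max(assemblies, key=lambda name: n50(assemblies[name]))
```

N50 rewards contiguity only, so a sketch like this says nothing about misassemblies or gene completeness, which would need to be checked separately.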

Phylogenetic placement of the newly described species was obtained using conserved, single-copy orthologs identified in the best genomic assemblies of each newly described species and in 15 published or publicly available genomic assemblies from closely related species belonging to the Debaryomycetaceae and Metschnikowiaceae families: Sp. arborariae, Sp. passalidarum, Sc. stipitis, C. albicans, C. dubliniensis, C. tropicalis, C. parapsilosis, C. orthopsilosis, C. tenuis, Clavispora lusitaniae, Debaryomyces hansenii, Lodderomyces elongisporus, Meyerozyma guilliermondii, M. caribbica, and Metschnikowia fructicola (Hittinger et al. 2015). Lachancea kluyveri was used as outgroup.

Protein sequences of conserved, single-copy orthologs were identified in each individual genomic assembly with the program BUSCO v1.1b1 (Simão et al. 2015) using the ‘fungi’ reference dataset. A total of 642 conserved genes present in all 19 assemblies were then extracted and aligned using MAFFT v7.245 (Katoh and Standley 2013). To obtain gene trees, individual alignments of each gene were used as input for maximum-likelihood (ML) phylogenetic inference with RAxML v8.2.4 (Stamatakis 2014) using the LG model of amino-acid substitution (Le and Gascuel 2008) and substitution rates modeled using the gamma distribution and empirical frequencies (‘PROTGAMMALGF’). In a separate run, all individual alignments were concatenated, and the resulting super alignment of 561 019 sites was used as input into RAxML to infer a global ML tree using the same parameters. Finally, individual gene trees were used to calculate internode certainty scores for each branch of the ML tree, as proposed by Salichos and Rokas (2013). The global ML tree and the 642 single-gene trees can be accessed in TreeBase using the following address: http://purl.org/phylo/treebase/phylows/study/TB2:S18536.
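As an illustration of the internode certainty (IC) score, the sketch below computes IC for one branch from two counts: the number of gene trees that contain the branch and the number that contain its most frequent conflicting alternative. It follows the entropy-based definition of Salichos and Rokas (2013); the sign convention (negative when the alternative is more frequent) is an assumption chosen to match the range of –1 to 1 described for Fig. 2, and the function name is hypothetical.

```python
import math


def internode_certainty(support_count, conflict_count):
    """IC for one branch from counts of gene trees.

    support_count:  gene trees containing the branch of interest
    conflict_count: gene trees containing the most frequent
                    conflicting alternative
    """
    if support_count < 0 or conflict_count < 0:
        raise ValueError("counts must be non-negative")
    total = support_count + conflict_count
    if total == 0:
        raise ValueError("at least one gene tree must be informative")
    # IC = log2(2) + sum(p * log2(p)) over the two competing bipartitions
    ic = 1.0
    for count in (support_count, conflict_count):
        p = count / total
        if p > 0:
            ic += p * math.log2(p)
    # Assumed sign convention: negative when the alternative dominates
    return -ic if conflict_count > support_count else ic
```

With this definition, a branch found in every gene tree scores 1, a branch tied with its alternative scores 0, and a branch absent from all gene trees while its alternative is present scores –1.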

Fermentation assays

Yeasts cultured on YMA plates at 25°C for 48 h were transferred to 100 mL YPX pre-inoculum medium (yeast extract 1%, peptone 2%, D-xylose 2%) in 500 mL Erlenmeyer flasks and incubated at 30°C with continuous shaking (140 rpm) for 24 h. Cells were harvested by centrifugation at 9000 rpm for 10 min, washed twice and resuspended in the fermentation media to a final concentration of 0.5 g L−1. Batch fermentations using complete YP medium (yeast extract 1%, peptone 2%) containing 3% D-xylose or a mixture of D-glucose (3%) and D-xylose (3%) as carbon sources were carried out with 100 mL of medium in 250 mL Erlenmeyer flasks. The flasks were incubated as described above for 72 h. The fermentation of xylose was monitored by taking samples at 0, 6, 12, 24, 36, 48, 60 and 72 h, and the co-fermentation of glucose/xylose by taking samples at 0, 3, 6, 9, 12, 24, 36, 48, 60 and 72 h. Cell concentrations were determined by correlating optical density (OD) measurements taken with a Thermo Spectronic Genesys 20 Model 4001/4 spectrophotometer (Thermo Scientific, Waltham, USA) at 600 nm with a calibration curve (dry weight × OD). Glucose, xylose, xylitol and ethanol levels were determined with a high-performance liquid chromatography system (Merck Hitachi, Darmstadt, Germany) using a refractive index detector (L-7490, Merck Hitachi, Darmstadt, Germany) and an Aminex HPX-87H column (300 × 7.8 mm, Bio-Rad, Hercules, USA). The column was eluted with 5 mM H2SO4 as mobile phase at a flow rate of 0.4 mL min−1 and 50°C.

Enzyme assays

For the assessment of xylose reductase (XR) and xylitol dehydrogenase (XDH) activities, the yeasts were grown in YPX medium, as described above, and after 16 h, the cells were harvested, washed with sterile deionized water and used to prepare crude cell-free extracts with Yeast Protein Extraction Reagent (Y-PER®, Pierce, Rockford, USA) (Fonseca et al. 2007). Protein concentrations in the cell-free extract were determined with the Bicinchoninic Acid Protein Assay Kit (Pierce, Rockford, USA) (Fonseca et al. 2007). Enzymatic activities were determined by following the oxidation or reduction of the coenzymes at 340 nm using a UV-2401 PC UV-VIS recording spectrophotometer (Shimadzu, Kyoto, Japan) at 25°C, with an interval time of 1 s for recording and a total measuring time of 90 s for each reaction. Kinetic parameters of XR for D-xylose reduction were obtained in a reaction mixture containing triethanolamine buffer 200 mM pH 7.0, NAD(P)H 10 mM, D-xylose 2 M, cell-free extract and deionized water. Kinetic parameters of XDH for xylitol oxidation were obtained in a reaction mixture containing glycine buffer 200 mM pH 9.0, MgCl2 500 mM, NAD(P)+ 60 mM, xylitol 2 M, cell-free extract and deionized water. A value of 5.33 mM−1 cm−1 was used for the absorption coefficient of NAD(P)H. One unit was defined as the generation of 1 μmol NAD(P)H per minute. The specific enzyme activities were given in units (U) per milligram protein. This experiment was performed in duplicate.
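The conversion from an absorbance trace at 340 nm to a specific activity can be sketched as follows, using the Beer-Lambert relation and the absorption coefficient stated above (5.33 mM−1 cm−1). The 1 cm path length, the reaction and extract volumes, the protein concentration and the function names are assumptions introduced for illustration; they are not reported in the text.

```python
def slope_per_minute(times_s, absorbances):
    """Least-squares slope of A340 versus time, in absorbance units per minute."""
    n = len(times_s)
    if n < 2 or n != len(absorbances):
        raise ValueError("need two or more paired readings")
    mean_t = sum(times_s) / n
    mean_a = sum(absorbances) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_s, absorbances))
    return 60.0 * sxy / sxx  # readings are in seconds


def specific_activity(delta_a_per_min, total_volume_ml, extract_volume_ml,
                      protein_mg_per_ml, epsilon_mM_cm=5.33, path_cm=1.0):
    """Specific activity in U per mg protein (1 U = 1 umol NAD(P)H per min).

    path_cm = 1.0 is an assumed cuvette path length.
    """
    # Beer-Lambert: rate in mM per min, which equals umol per mL per min
    rate_mM_per_min = abs(delta_a_per_min) / (epsilon_mM_cm * path_cm)
    units_in_cuvette = rate_mM_per_min * total_volume_ml
    units_per_ml_extract = units_in_cuvette / extract_volume_ml
    return units_per_ml_extract / protein_mg_per_ml
```

The absolute value is taken so that the same function serves both the XR assay, where A340 falls as NAD(P)H is oxidized, and the XDH assay, where it rises.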

Analysis of XYL genes in Spathaspora species

Based on the published draft genome sequences of Sp. passalidarum NRRL Y-27907T (=CBS 10155) (Wohlbach et al. 2011) and Sp. arborariae UFMG-HM-19.1AT (=CBS 11463) (Lobo et al. 2014), the gene sequences encoding XR (XYL1), XDH (XYL2) and xylulokinase (XYL3) were used as queries to search the draft genome sequences of the new Spathaspora species. The enzymes encoded by these genes are responsible for the conversion of D-xylose to D-xylulose-5P, which is channeled through the pentose phosphate pathway for ethanol production. XYL1, XYL2 and XYL3 sequences from Sp. girioi sp. nov., Sp. hagerdaliae sp. nov. and Sp. gorwiae sp. nov. were obtained using the BLAST tool (Altschul et al. 1990). The location of the open reading frames (ORFs) and the predicted amino acid sequences were obtained using the software ARTEMIS. The XR (XYL1), XDH (XYL2/XYL2.1 and XYL2.2) and xylulokinase (XYL3) genes and their predicted amino acid sequences have been deposited at DDBJ/EMBL/GenBank under accessions KU668555-KU668558 and KU672526-KU672531. Sequences in FASTA format were aligned with ClustalW using the MEGA6 program (Thompson, Higgins and Gibson 1994; Tamura et al. 2013).

RESULTS AND DISCUSSION

Species delineation, classification, ecology and genome analysis

Analyses of the sequences of the ITS region and of the gene encoding the D1/D2 domains of the LSU rRNA showed that the four strains isolated represent three new species with affinities to the clade and genus Spathaspora (Fig. 1). Strain UFMG-CM-Y302 forms a sister pair with C. materiae, from which it differs by seven substitutions in D1/D2 domains and six substitutions and three indels in the ITS region. This strain also differs by nine substitutions in D1/D2 domains and 16 substitutions and five indels in ITS region from Sp. brasiliensis. Strain UFMG-CM-Y303 is nearly identical to strain UFMG-CM-Y424 in ITS and D1/D2 sequences; the two differ by two substitutions in the D1/D2 sequences. Their nearest relative is a new species represented by strain UFMG-CM-Y312. We treat them as distinct species because they differ by eight substitutions in D1/D2 sequences and 16 substitutions and five indels in the ITS region. Their closest described relative is C. lyxosophila (van der Walt, Ferreira and Steyn 1987). The sequence of the D1/D2 domains of strain UFMG-CM-Y303 presents a heterogeneity in two adjacent nucleotides. The heterogeneity was confirmed by both sequencing methods (Sanger and Illumina). With inclusion of Sp. allomyrinae in the D1/D2 phylogenetic tree and re-alignment of the sequences, strain UFMG-CM-Y424 occupies an intermediate position between the type strain of Sp. hagerdaliae sp. nov. and Sp. gorwiae sp. nov. However, the combined ITS and D1/D2 sequences, as well as the growth response on D-xylose and other tests suggest that strain UFMG-CM-Y424 can be putatively identified as Sp. hagerdaliae sp. nov. Final confirmation may be obtained from genome sequencing.
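Counts of substitutions and indels such as those reported above can be obtained from a pairwise alignment along the lines of the following sketch. The convention that a run of consecutive gap columns counts as one indel is an assumption, as are the function name and the gap character; ambiguous bases, such as those arising from the heterogeneity noted for strain UFMG-CM-Y303, are not treated specially and would count as substitutions.

```python
def count_differences(seq_a, seq_b, gap="-"):
    """Count substitutions and indel events between two aligned sequences.

    Both sequences must come from the same alignment (equal length).
    A run of consecutive gap columns in one sequence is one indel event.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    substitutions = 0
    indels = 0
    gap_owner = None  # which sequence carries the currently open gap run
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap and b == gap:
            continue  # column is empty in both sequences, ignore it
        if a == gap or b == gap:
            owner = "a" if a == gap else "b"
            if owner != gap_owner:
                indels += 1
                gap_owner = owner
        else:
            gap_owner = None
            if a != b:
                substitutions += 1
    return substitutions, indels
```

For example, `count_differences("ACGT--ACGT", "ACCTTTAC-T")` reports one substitution and two indel events.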

Figure 1.

Neighbor-joining tree phylogram showing the species affinities of the newly described species. Taxon sampling reflects the difficulty of obtaining monophyly for Spathaspora species in a phylogeny based on an alignment of LSU rRNA gene D1/D2 sequences. Lachancea kluyveri, which is otherwise known to be an outgroup with respect to the Debaryomycetaceae and the Metschnikowiaceae, had to be forced into its early-emerging position. The alignment contained 528 positions. Bootstraps of 50% or more are shown.

Ascospore formation was evaluated for the three new species by culturing them on Yeast Carbon Base (YCB) agar supplemented with 0.01% ammonium sulfate, cornmeal, diluted (1:9) V8 agar, Fowell acetate agar, Gorodkowa or 5% malt extract agar at 15°C or 25°C for 21 days. Strain UFMG-CM-Y302T produced unconjugated asci, each with a single elongated ascospore with curved ends, when cultured on YCB agar supplemented with 0.01% ammonium sulfate or diluted (1:9) V8 agar. The name Sp. girioi sp. nov. is proposed to accommodate this species. The other two species (represented by strains UFMG-CM-Y303T and UFMG-CM-Y424, and strain UFMG-CM-Y312T, respectively) were examined individually or mixed in pairs (strains UFMG-CM-Y303T and UFMG-CM-Y424) on the culture media, but asci or signs of conjugation were not seen. The names Sp. hagerdaliae f.a., sp. nov. (strains UFMG-CM-Y303T and UFMG-CM-Y424) and Sp. gorwiae f.a., sp. nov. (strain UFMG-CM-Y312T) are proposed to accommodate these species. The mention forma asexualis (f.a.) is added as a reminder that a sexual state is not known (Lachance 2012). Spathaspora girioi sp. nov. and Sp. gorwiae sp. nov. are described now, in spite of each being known from only a single isolate, because of their ability to convert D-xylose into ethanol, a trait of biotechnological interest.

The phylogeny in Fig. 1 is less than satisfactory because it fails to resolve species that share the characteristic ascospore morphology of the genus Spathaspora into a single clade. Indeed, attempts to root a D1/D2-based tree that contains all described Spathaspora species using moderate relatives, such as C. sake, D. hansenii or C. tenuis, were not successful. The outcome is highly dependent on taxon sampling. One explanation might be that the unique Spathaspora ascospore morphology is not a reliable synapomorphy of the genus. More likely, the D1/D2 phylogeny is particularly faulty in the present case. Figure 1 also falls short of providing clear support for treating Sp. gorwiae sp. nov. and Sp. hagerdaliae sp. nov. as distinct phylogenetic species. Our choice of separating them is warranted by the extent of divergence in barcode sequences as well as phenotypic differences, including D-xylose metabolism and the utilization of L-sorbose, a trait that is not particularly labile.

The inadequacy of ribosomal DNA-based phylogenies of Spathaspora is further corroborated by the fact that these sequences do not resolve the Debaryomycetaceae as a sister family to the Metschnikowiaceae. The tree in Fig. 1 had to be rooted manually in order to relocate L. kluyveri as outgroup, a position that would otherwise fall to the two Metschnikowiaceae representatives. We therefore took a phylogenomic approach to determine the broader phylogenetic placement of the newly discovered species. We examined the genome assemblies of the three newly described species and 15 closely related Debaryomycetaceae or Metschnikowiaceae species and identified 642 conserved, single-copy, orthologous genes that were used for a subsequent ML phylogenetic analysis. To assess the robustness of our phylogeny, we also calculated internode certainty support values for each branch of the consensus tree, an approach that has been shown by Salichos and Rokas (2013) to better quantify phylogenetic support and conflict than standard bootstrap analysis. In contrast to the tree presented in Fig. 1, the resulting phylogeny (Fig. 2) had strong support for the placement of all three newly described species within the genus Spathaspora, free of intrusions by representatives of less related families, such as the Metschnikowiaceae or the Debaryomycetaceae. Specifically, Sp. girioi sp. nov. was placed as the sister species of Sp. arborariae, and the two were linked by their common ancestor to Sp. passalidarum. Although Sp. gorwiae sp. nov. and Sp. hagerdaliae sp. nov. formed a separate subclade, a monophyletic structure was retained among all Spathaspora species. Resolution of other taxonomic groups present in our dataset, such as the Candida/Lodderomyces or Metschnikowia/Clavispora clades, was consistent with their accepted taxonomy.
Deep branches of the phylogeny, however, received poor support values, indicating high levels of conflict in the phylogenetic signal of the different genes used in our analysis. This reaffirmed the difficulty of confidently resolving deep relationships between yeast genera, even when using genome-scale analyses (Salichos and Rokas 2013).

Figure 2.

Phylogenetic placement of Sp. girioi sp. nov., Sp. gorwiae sp. nov. and Sp. hagerdaliae sp. nov. ML tree obtained from a concatenated alignment of 642 conserved, single-copy orthologous genes from the three newly described species, 15 closely related Debaryomycetaceae or Metschnikowiaceae species with publicly available genomic data and L. kluyveri as outgroup. Scale is in amino acid substitutions per site. Values above branches indicate their internode certainty (IC) support, obtained based on 642 individual gene trees. IC values close to 1 indicate the absence of conflict between genes for a given branch among the gene trees; IC values close to zero indicate equal support for a given branch and the second most-frequent alternative; IC values close to –1 indicate lower support for a given branch as compared to its second most-frequent alternative (i.e. a branch on the tree obtained from the concatenated alignment is rarely observed among individual gene trees).

Species of the Spathaspora clade have been isolated from rotting wood or insects associated with this substrate (Nguyen et al. 2006; Cadete et al. 2009, 2013). Spathaspora girioi sp. nov., Sp. hagerdaliae sp. nov. and Sp. gorwiae sp. nov. were isolated from rotting wood samples of unidentified trees collected in the Private Natural Heritage Reserve of Serra Bonita and in the Sanctuary of Caraça. These areas are considered Atlantic Rainforest ecosystems. In view of their low frequency of isolation in otherwise extensive sampling efforts, the species must be regarded as a minor component of the very diverse yeast communities isolated from rotting wood in this ecosystem (Morais et al. 2013). The strains were recovered from four different rotting wood samples. The species Saturnispora silvae, Trichosporon laibachii and Kazachstania unispora were co-isolated with strains UFMG-CM-Y312T, UFMG-CM-Y303T and UFMG-CM-Y424, respectively. Strain UFMG-CM-Y302T (Sp. girioi sp. nov.) was the only yeast isolated from its rotting wood sample (Morais et al. 2013).

The differentiation between Sp. girioi sp. nov. and C. materiae (Spathaspora clade) is based on the assimilation of D-ribose and glycerol, which is positive for the new species and negative for C. materiae, as well as growth on ethanol, sorbitol and hexadecane, which is negative for Sp. girioi sp. nov. and positive for C. materiae. Spathaspora girioi sp. nov. differs from Sp. brasiliensis in growth on citrate and D-gluconate, which is positive for Sp. girioi sp. nov. and negative for Sp. brasiliensis. These species also differ in growth on ethyl acetate and glycerol, which is negative for the new species and positive for Sp. brasiliensis. Spathaspora hagerdaliae sp. nov. and Sp. gorwiae sp. nov. can be distinguished from their closest relative, C. lyxosophila, based on growth on D-gluconate, hexadecane and L-arabinose, which is positive for both new species and negative for C. lyxosophila, as well as growth on glycerol and soluble starch, which is positive for C. lyxosophila and negative for Sp. hagerdaliae sp. nov. and Sp. gorwiae sp. nov. Spathaspora girioi sp. nov. does not grow on L-arabinose or in the presence of cycloheximide (0.01%), whereas the other two new species do. On the other hand, Sp. girioi sp. nov. assimilates erythritol, whereas Sp. hagerdaliae sp. nov. and Sp. gorwiae sp. nov. do not use this carbon source. Spathaspora hagerdaliae sp. nov. differs from the other two novel species by failing to grow on L-sorbose. Spathaspora gorwiae sp. nov. can also be distinguished from the others based on its ability to produce acid from D-glucose, which does not occur in Sp. girioi sp. nov. or Sp. hagerdaliae sp. nov.

To better understand the global metabolism of the newly described species, we performed a comparative analysis of their genome contents with that of other closely related Debaryomycetaceae/Metschnikowiaceae species. Our KEGG-based annotation pipeline assigned function to 2925 out of 7366 predicted ORFs in Sp. hagerdaliae sp. nov., 3001 out of 7026 predicted ORFs in Sp. girioi sp. nov., and 2990 out of 7077 predicted ORFs in Sp. gorwiae sp. nov. Analysis of the annotations allowed us to identify several genes of interest related to the metabolic capabilities of these new species. First, we identified a gene coding for a predicted xylan 1,4-beta-xylosidase (XYL4, K15920, EC 3.2.1.37), which is present in the genus Spathaspora, C. tenuis (another known xylose fermenter) and Mw. fructicola. According to the IUBMB classification, this enzyme performs hydrolysis of (1→4)-beta-D-xylans, to remove successive D-xylose residues from the non-reducing termini. Presence of this enzyme in the newly described species corresponds well with the fact that they were obtained from samples cultured on YNB-xylan medium, since degrading xylan would provide a clear advantage during isolation. Such enzymes are also of interest for potential applications in the deconstruction of lignocellulosic biomass in future biorefineries. Second, we identified a general loss of the urate/allantoin degradation pathway in Sp. gorwiae sp. nov. and Sp. hagerdaliae sp. nov., including the genes encoding urate oxidase (UOX, K00365, EC 1.7.3.3), 5-hydroxyisourate hydrolase (UraH, K07127, EC 3.5.2.17), allantoinase (DAL1, K01466, EC 3.5.2.5), allantoicase (DAL2, K01477, EC 3.5.3.4) and ureidoglycolate lyase (DAL3, K01483, EC 4.3.2.3). A similar event has been previously described in other Saccharomycetaceae (Wong and Wolfe 2005; Gabaldón et al. 2013; Wolfe et al. 2015) and was hypothesized to be an adaptation to anaerobic/fermentative lifestyles because the pathway uses substantial quantities of oxygen.
This loss also suggests that Sp. gorwiae sp. nov. and Sp. hagerdaliae sp. nov. should be unable to grow on urate or allantoin as their sole nitrogen source, instead requiring an alternative, such as ammonium sulfate. Finally, all studied Spathaspora species, except for Sp. hagerdaliae sp. nov., had a second copy of the gene encoding sorbose reductase (SOU1, K17742, EC 1.1.1.289). This enzyme converts L-sorbose into D-sorbitol, and the lack of a second copy might be related to the absence of L-sorbose assimilation in Sp. hagerdaliae sp. nov.

D-Xylose and D-glucose fermentation assays

After saccharification of pretreated lignocellulosic materials, the major sugars usually present in hydrolysates are D-glucose and D-xylose, which result from the hydrolysis of cellulose and hemicellulose, respectively. Thus, in this study, the fermentation of D-xylose and the co-fermentation of D-glucose and D-xylose by the new species were evaluated. The parameters related to the consumption of D-xylose and D-glucose, and the production of biomass, ethanol and xylitol are summarized in Table 1. Product titers, yields and productivities were calculated based on the time of maximum ethanol production or the end of the experiment (72 h). The growth and fermentation kinetics are shown in Figs S4 and S5, Supporting Information.
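The derived parameters reported in Table 1 can be computed from measured concentrations along the lines of the sketch below. The yield follows the definition given in the table footnote (product formed per sugar consumed); the productivity (maximum titer divided by elapsed time), the efficiency (yield as a percentage of a theoretical maximum) and the theoretical maxima of 0.51 g g−1 for ethanol and 0.917 g g−1 for xylitol are assumptions based on common practice, not values taken from the text. The function and key names are hypothetical.

```python
ETHANOL_MAX_YIELD = 0.51   # assumed theoretical maximum, g ethanol per g sugar
XYLITOL_MAX_YIELD = 0.917  # assumed theoretical maximum, g xylitol per g D-xylose


def fermentation_parameters(product_g_l, sugar_initial_g_l, sugar_final_g_l,
                            time_h, max_yield):
    """Derive consumption, yield, productivity and efficiency for one product."""
    consumed = sugar_initial_g_l - sugar_final_g_l
    if consumed <= 0 or time_h <= 0:
        raise ValueError("sugar must be consumed and time must be positive")
    yield_g_g = product_g_l / consumed
    return {
        "consumption_pct": 100.0 * consumed / sugar_initial_g_l,
        "yield_g_g": yield_g_g,                    # Yp/s
        "productivity_g_l_h": product_g_l / time_h,  # Qp (assumed definition)
        "efficiency_pct": 100.0 * yield_g_g / max_yield,
    }
```

For co-fermentation, the sugar terms would be the sum of D-glucose and D-xylose for ethanol, and D-xylose alone for xylitol; this too is an assumption about how the table values were derived.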

Table 1.

Production of ethanol and xylitol from D-xylose fermentation and D-xylose/D-glucose co-fermentation in YP medium by the new Spathaspora species.

| Condition | Yeast species | Yeast strain (UFMG-CM-) | D-glucose consumption (%)a | D-xylose consumption (%)a | Biomass (g L−1) | Maximum ethanol concentration (g L−1) | Yp/set (g g−1)b | Qp (g L−1 h−1)c | ηet (%)d | Maximum xylitol concentration (g L−1) | Yp/sxyl (g g−1)b | ηxyl (%)d | Time (h)e |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| D-xylose fermentation | Sp. girioi sp. nov. | Y302 | – | 100.0 | 5.6 ± 0.43 | 6.7 ± 0.29 | 0.22 | 0.19 | 42.3 | 10.8 ± 0.19 | 0.35 | 37.8 | 36 |
| D-xylose fermentation | Sp. hagerdaliae sp. nov. | Y303 | – | 99.3 | 7.7 ± 0.45 | 8.8 ± 0.80 | 0.28 | 0.24 | 54.9 | 6.4 ± 0.29 | 0.21 | 22.4 | 36 |
| D-xylose fermentation | Sp. hagerdaliae sp. nov. | Y424 | – | 94.3 | 5.5 ± 0.25 | 7.3 ± 0.20 | 0.25 | 0.20 | 48.2 | 7.1 ± 0.94 | 0.24 | 26.4 | 36 |
| D-xylose fermentation | Sp. gorwiae sp. nov. | Y312 | – | 76.9 | 11.7 ± 0.28 | 2.4 ± 0.29 | 0.10 | 0.04 | 20.4 | 0.0 ± 0.00 | 0.00 | 0.0 | 60 |
| D-xylose/D-glucose co-fermentation | Sp. girioi sp. nov. | Y302 | 100.0 | 99.6 | 9.1 ± 0.05 | 10.7 ± 0.02 | 0.18 | 0.15 | 35.1 | 16.5 ± 0.52 | 0.52 | 57.2 | 72 |
| D-xylose/D-glucose co-fermentation | Sp. hagerdaliae sp. nov. | Y303 | 100.0 | 99.2 | 8.2 ± 0.47 | 19.0 ± 1.49 | 0.32 | 0.53 | 63.5 | 11.3 ± 1.29 | 0.37 | 39.8 | 36 |
| D-xylose/D-glucose co-fermentation | Sp. hagerdaliae sp. nov. | Y424 | 100.0 | 98.7 | 7.5 ± 0.46 | 17.7 ± 0.87 | 0.30 | 0.49 | 58.4 | 11.3 ± 1.08 | 0.36 | 39.3 | 36 |
| D-xylose/D-glucose co-fermentation | Sp. gorwiae sp. nov. | Y312 | 86.7 | 20.4 | 11.3 ± 0.72 | 7.2 ± 1.05 | 0.24 | 0.30 | 47.3 | 0.0 ± 0.00 | 0.00 | 0.0 | 24 |

a

Sugar consumption (%)—percentage of initial D-glucose or D-xylose consumed.

b

Yp/set (g g−1) and Yp/sxyl (g g−1)—ethanol or xylitol yield: correlation between ethanol or xylitol (ΔP) produced with sugar (ΔS) consumed.

c

Qpet (g L−1 h−1)—ethanol productivity: ratio between ethanol concentration (g L−1) and time (h).

d

ηet (%) and ηxyl (%)—conversion efficiency: percentage of the maximum theoretical ethanol or xylitol yield (0.511 g ethanol per g D-xylose and 0.917 g xylitol per g D-xylose).

e

Time of maximum ethanol production (g L−1) reached at the end of the experiment.

When cultured in YP medium with 3% D-xylose, the new species consumed between 76.9% of the D-xylose within 60 h for Sp. gorwiae sp. nov. (consumption rate of 0.6 g L−1 h−1) and nearly 100% within 36 h for Sp. hagerdaliae sp. nov. and Sp. girioi sp. nov. (consumption rate of 0.8–0.9 g L−1 h−1). Under the conditions tested, the main products of D-xylose metabolism were ethanol for Sp. hagerdaliae sp. nov., xylitol for Sp. girioi sp. nov. and biomass for Sp. gorwiae sp. nov. The two strains of Sp. hagerdaliae sp. nov. had similar ethanol conversion efficiencies, with approximately half of the available carbon directed to ethanol (7.3–8.8 g L−1) and CO2 and the remainder to biomass and xylitol in approximately equal proportions (20%–30%), with slightly more biomass for UFMG-CM-Y303 and more xylitol for UFMG-CM-Y424. Spathaspora girioi sp. nov. UFMG-CM-Y302 produced more xylitol (10.8 g L−1) than ethanol (6.7 g L−1), but the efficiencies of xylitol and ethanol conversion were similar (∼40%). Spathaspora gorwiae sp. nov. showed the lowest ethanol conversion efficiency (∼20%), with the majority of carbon directed to biomass. Interestingly, xylitol production was not detected for this species. Cadete et al. (2012) previously found that Sp. brasiliensis, Sp. xylofermentans, Sp. roraimanensis and Sp. suhii were able to ferment D-xylose with different efficiencies. The conversion efficiency of D-xylose to ethanol and/or xylitol by these new species could be optimized by fine-tuning fermentation conditions, including sugar concentration, cell inoculum and aeration, since in the first stage (∼2 generations) aerobic conditions promoted biomass formation only.
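The fermentation parameters discussed above follow directly from the definitions given in the Table 1 footnotes. The sketch below restates them in Python as an illustration; the function names are assumptions made here, and the input values are the rounded figures tabulated for Sp. hagerdaliae sp. nov. UFMG-CM-Y303 in the D-xylose assay.

```python
# Theoretical maximum yields stated in the Table 1 footnotes (g product per g D-xylose).
MAX_ETHANOL_YIELD = 0.511
MAX_XYLITOL_YIELD = 0.917


def product_yield(delta_product, delta_substrate):
    """Yp/s (g g-1): product formed divided by sugar consumed."""
    return delta_product / delta_substrate


def productivity(product_conc, hours):
    """Qp (g L-1 h-1): product concentration divided by elapsed time."""
    return product_conc / hours


def conversion_efficiency(yield_gg, theoretical_max):
    """eta (%): observed yield as a percentage of the theoretical maximum."""
    return 100.0 * yield_gg / theoretical_max


# Sp. hagerdaliae UFMG-CM-Y303, D-xylose fermentation (rounded values from Table 1).
ethanol_conc, time_h, ethanol_yield = 8.8, 36, 0.28

qp = productivity(ethanol_conc, time_h)
eta = conversion_efficiency(ethanol_yield, MAX_ETHANOL_YIELD)
print(f"Qp = {qp:.2f} g L-1 h-1, eta_et = {eta:.1f}%")
```

Because the tabulated yield is rounded to two decimals, the efficiency recomputed this way (about 54.8%) differs slightly from the tabulated 54.9%, which was presumably derived from unrounded data.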

The results of the co-fermentation assay revealed that Sp. hagerdaliae sp. nov. and Sp. girioi sp. nov. consumed virtually all of the D-glucose and D-xylose (≥98.7%) in 36 and 72 h, respectively. D-glucose was consumed first, within 12–18 h with Sp. hagerdaliae sp. nov. and within 30 h with Sp. girioi sp. nov., generating, during this period, ∼7 g L−1 of biomass and 10–12 and 7.0–7.5 g L−1 of ethanol, respectively. Co-consumption of D-xylose reached 10%–20% before D-glucose exhaustion. During the D-xylose growth phase, ethanol increased to 18–19 g L−1 with Sp. hagerdaliae sp. nov. and 11 g L−1 with Sp. girioi sp. nov., but xylitol was the main product from D-xylose, reaching 11 g L−1 and 16 g L−1, respectively. Under the conditions tested, Sp. gorwiae sp. nov. consumed 87% of the D-glucose and 20% of the D-xylose in 24 h, the time of maximum ethanol production (7.2 g L−1). After this period, ethanol started to be consumed to produce biomass, while D-glucose and D-xylose were slowly consumed up to a total of 98% and 43%, respectively, after 72 h, with no xylitol production. The ability of Sp. arborariae to co-ferment D-glucose and D-xylose, as well as D-glucose, D-xylose and L-arabinose, has been demonstrated (Cadete et al. 2009; Cunha-Pereira et al. 2011). In that species, the consumption of D-xylose started after D-glucose depletion, and efficiencies of 50.2% for ethanol and 14.2% for xylitol were found. Long et al. (2012) showed that Sp. passalidarum consumed D-glucose and D-xylose simultaneously under aerobic conditions, but no ethanol was produced. Under oxygen limitation, however, Sp. passalidarum consumed D-glucose first. Therefore, since the conditions tested rapidly reached oxygen limitation (after approximately 2 generations), the co-fermentation of D-glucose and D-xylose by the novel Spathaspora species is similar to that previously observed in other species of the genus.

Biochemical characterization of XRs and XDHs

The activity of the enzymes related to the first steps of D-xylose metabolism, XRs and XDHs, was determined in crude extracts of the three new Spathaspora species after 16 h of D-xylose metabolism in YPX cultures (Table 2). The XDH activities were strictly dependent on NAD+. The XR activities were NADPH-dependent for Sp. hagerdaliae sp. nov. and Sp. girioi sp. nov., while Sp. gorwiae sp. nov. showed XR activity with both NADH and NADPH as co-factors. Dependence of the XR activity on NADH was previously observed in Sp. passalidarum (Hou 2012), where the XR enzyme preferred NADH over NADPH, an observation that contrasts with Sc. stipitis, whose XR activity prefers NADPH. Hou (2012) suggested that the higher ethanol yield observed in Sp. passalidarum (compared to Sc. stipitis) could be explained by a more balanced NADH-NAD+ supply-and-demand relation between XR and XDH. In contrast, Sc. stipitis produced more xylitol. In the present study, we observed that Sp. hagerdaliae sp. nov. and Sp. girioi sp. nov. produce both ethanol and xylitol from D-xylose, while Sp. gorwiae sp. nov. produced only ethanol. The latter yeast was the only one with NADH-dependent XR activity. The lower D-xylose consumption rate observed for Sp. gorwiae sp. nov. may be explained by its lower XR and XDH activities. Spathaspora girioi sp. nov. showed the highest NADPH-XR and NAD+-XDH activities, 0.94 and 0.19 U (mg protein)−1, respectively. This result could explain the xylose metabolism of this species, since Sp. girioi sp. nov. produced the highest titers of xylitol in both fermentation experiments, 10.8 and 16.5 g L−1, in D-xylose and D-glucose/D-xylose media, respectively.
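The co-factor preference discussed above is summarized in Table 2 as the ratio of NADH-linked to NADPH-linked XR activity. A minimal Python sketch of that ratio is shown below; the function name and the use of `None` for undetected activity are assumptions made for illustration, and the inputs are the mean activities reported for Sp. gorwiae sp. nov. UFMG-CM-Y312.

```python
def cofactor_ratio(nadh_activity, nadph_activity):
    """Ratio of NADH- to NADPH-linked XR activity.

    Returns None when either activity is missing (None) or when the
    NADPH-linked activity is zero, since no ratio can be formed.
    """
    if nadh_activity is None or not nadph_activity:
        return None
    return nadh_activity / nadph_activity


# Sp. gorwiae UFMG-CM-Y312: mean XR activities in U (mg protein)-1 from Table 2.
ratio = cofactor_ratio(0.40, 0.39)
print(f"NADH/NADPH = {ratio:.2f}")

# Sp. girioi UFMG-CM-Y302: no NADH-linked activity reported, so no ratio.
print(cofactor_ratio(None, 0.94))
```

A ratio close to 1 indicates that the crude extract used both co-factors to a similar extent under the assay conditions.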

Table 2.

Xylose reductase (XR) and xylitol dehydrogenase (XDH) activities expressed in units (U) per mg protein [U (mg protein)−1] by the new Spathaspora species.

| Yeast species | Strain (UFMG-CM-) | XR, NADH | XR, NADPH | XR ratio NADH/NADPH | XDH, NAD+ |
| --- | --- | --- | --- | --- | --- |
| Sp. girioi sp. nov. | Y302 | – | 0.94 ± 0.12 | – | 0.19 ± 0.02 |
| Sp. hagerdaliae sp. nov. | Y303 | – | 0.42 ± 0.02 | – | 0.06 ± 0.01 |
| Sp. hagerdaliae sp. nov. | Y424 | – | 0.67 ± 0.14 | – | 0.18 ± 0.02 |
| Sp. gorwiae sp. nov. | Y312 | 0.40 ± 0.08 | 0.39 ± 0.05 | 1.03 | 0.03 ± 0.01 |

XYL1, XYL2 and XYL3 genes from the new Spathaspora species

BLAST searches yielded one gene encoding an XR (XYL1) in each of the three novel species. A single XYL1 was also detected in Sp. arborariae (Lobo et al. 2014), but Sp. passalidarum has two XR-encoding genes (XYL1.1 and XYL1.2) (Wohlbach et al. 2011). The genome sequence of Sp. girioi sp. nov. contains two genes encoding XDHs (XYL2.1 and XYL2.2), which is similar to Sp. passalidarum and Sp. arborariae (Wohlbach et al. 2011; Lobo et al. 2014), while the genomes of Sp. hagerdaliae sp. nov. and Sp. gorwiae sp. nov. contain only one XYL2. Each of the three novel Spathaspora species has only one copy of XYL3, which encodes a xylulokinase, as do Sp. passalidarum and Sp. arborariae (Wohlbach et al. 2011; Lobo et al. 2014).

To analyze the genes encoding XR, XDH and xylulokinase (XKS) in Sp. hagerdaliae sp. nov., Sp. girioi sp. nov. and Sp. gorwiae sp. nov., their nucleotide (XYL1, XYL2 and XYL3) and predicted amino acid (Xyl1p, Xyl2p and Xyl3p) sequences were compared by multiple sequence alignment with the published gene sequences of Sp. passalidarum, Sp. arborariae and Sc. stipitis.

The XYL1 or XYL1.1 genes, detected in all Spathaspora species studied, have a coding region of 957 bp that encodes an XR of 318 amino acid residues (aa), while the XR gene XYL1.2, identified only in Sp. passalidarum, has a coding region of 954 bp that encodes an XR of 317 aa. The alignment of the predicted amino acid sequences of the XRs and their conserved domains is shown in Fig. S1, Supporting Information. The Xyl1p and Xyl1.1p sequences are highly conserved, with protein identities ranging from 88% to 97%. The 25 residues of the active site of the conserved aldo-keto-reductase domains were found and mapped, indicating that the enzymes likely share the same function in these species. The second XR (Xyl1.2p), which is found exclusively in Sp. passalidarum, has 27 differences when compared to the amino acid sequences of Xyl1.1p (Sp. passalidarum) and Xyl1p from other Spathaspora species (Table S1, Supporting Information). These amino acid differences could be important for enzyme activity and may help to explain the higher efficiency of D-xylose fermentation by Sp. passalidarum.
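Protein identities such as those quoted above are derived from aligned sequences. The sketch below shows one common way to compute pairwise percent identity in Python; it is an illustration only. The exact identity definition used in the study is not stated, the choice to ignore gapped columns is an assumption, and the two fragments are hypothetical toy strings, not real Xyl1p sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned toy fragments (NOT real Xyl1p sequences).
seq_1 = "MTANPSLVLNK-DDISFE"
seq_2 = "MTTNPSLVLNKVDDITFE"

identity = percent_identity(seq_1, seq_2)
print(f"{identity:.1f}% identity")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so values from different tools are not always directly comparable.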

The XYL2, XYL2.1 and XYL2.2 genes have coding regions of different sizes, ranging from 1089 to 1104 bp, which encode XDHs of 362–367 amino acid residues (aa). The alignment of the predicted amino acid sequences of the XDHs and their conserved domains is shown in Fig. S2, Supporting Information. The Xyl2p and Xyl2.1p sequences were highly conserved among the Spathaspora species, with protein identities ranging from 84% to 96%. The second XDH (Xyl2.2p), which was found in Sp. girioi sp. nov., has protein identities of 89% and 92% with Xyl2.2p of Sp. arborariae and Sp. passalidarum, respectively. The 18 residues that compose the NADP binding site in the conserved domains of these proteins were found and mapped, indicating that the enzymes likely share the same function in these species. This second XDH (Xyl2.2p) has 20 differences when compared to the amino acid sequences of Xyl2.1p (Sp. girioi sp. nov., Sp. passalidarum and Sp. arborariae) and Xyl2p from other Spathaspora species (Table S2, Supporting Information).

The XYL3 genes have coding regions of different sizes, ranging from 1830 to 1872 bp, that encode XKSs of 609 to 623 amino acid residues (aa). The alignment of the predicted amino acid sequences of the XKSs and their conserved domains is shown in Fig. S3, Supporting Information. The Xyl3p sequences were highly conserved among the Spathaspora species, with protein identities ranging from 77% to 96%. The 23 residues that compose the conserved nucleotide binding site were found and mapped, indicating that the enzymes likely share the same function in these species.
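The coding-region sizes and protein lengths quoted for XYL1, XYL2 and XYL3 are mutually consistent if each reported coding region includes one terminal stop codon. That reading is an assumption made here, not a statement from the study, and the short Python check below, with the hypothetical helper `protein_length`, illustrates it.

```python
def protein_length(cds_bp):
    """Amino acid count for a coding region assumed to include one stop codon."""
    if cds_bp % 3 != 0:
        raise ValueError("coding region length must be a multiple of 3")
    return cds_bp // 3 - 1  # subtract the stop codon, which encodes no residue


# Coding-region lengths (bp) quoted in the text, with the reported protein lengths (aa).
reported = {
    "XYL1 / XYL1.1": (957, 318),
    "XYL1.2": (954, 317),
    "XYL2 (shortest)": (1089, 362),
    "XYL2 (longest)": (1104, 367),
    "XYL3 (shortest)": (1830, 609),
    "XYL3 (longest)": (1872, 623),
}

for gene, (bp, aa) in reported.items():
    print(f"{gene}: {bp} bp -> {protein_length(bp)} aa (reported {aa} aa)")
```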

Description of Spathaspora girioi sp. nov. Lopes, Morais, Cadete, Fonseca, Kominek, Hittinger, Lachance and Rosa

Spathaspora girioi (gi.ri.o'i NL. gen. sing. m. n., pertaining to Gírio, in honor of Francisco Gírio, in recognition of his contribution to the study of the utilization of hemicelluloses for the production of ethanol and xylitol).

On YM agar, after 3 days at 25°C, the cells are spherical to ovoid (2.2–4.3 × 2.3–4.0 μm). Budding is multilateral. On YM agar, after 2 days at 25°C, colonies are white, mucoid, glistening and flat. Sporulation occurs on YCB agar supplemented with 0.01% ammonium sulfate and dilute (1:9) V8 after 5 days at 25°C. Unconjugated asci are formed from single cells with a single greatly elongated ascospore tapered and curved at the ends (Fig. 3a). Asci are persistent. Fermentation of D-glucose, D-xylose, maltose, trehalose and D-galactose is positive. Assimilation of the following carbon compounds occurs: D-glucose, D-galactose, L-sorbose, maltose, sucrose, cellobiose, trehalose, melezitose, D-xylose, ethanol, erythritol, ribitol, D-mannitol, D-glucitol, salicin, DL-lactate, succinate, citrate, hexadecane, xylitol, D-gluconate and N-acetyl-D-glucosamine. No growth occurs on lactose, melibiose, raffinose, inulin, soluble starch, L-arabinose, D-arabinose, D-ribose, L-rhamnose, glycerol, galactitol, myo-inositol, methanol, acetone, ethylacetate and isopropanol. Assimilation of nitrogen compounds is as follows: positive for lysine and negative for nitrate and nitrite. Growth in amino-acid-free medium is positive. Growth at 37°C is negative. Growth on YM agar with 10% sodium chloride is negative. Growth in 50% glucose/yeast extract (0.5%) is negative. Acid production is negative. Starch-like compounds are not produced. In 100 μg cycloheximide mL–1, growth is negative. The known habitat is rotting wood in Brazil. The type strain UFMG-CM-Y302T was isolated from rotting wood in the Atlantic Rainforest ecosystem, Bahia state, Brazil. It has been deposited in the Collection of Microorganisms and Cells of Federal University of Minas Gerais (Coleção de Micro-organismos e Células da Universidade Federal de Minas Gerais, UFMG), Belo Horizonte, Minas Gerais, Brazil, as strain UFMG-CM-Y302T, and it is permanently preserved in a metabolically inactive state. An ex-type culture has been deposited in the collection of the Yeast Division of the Centraalbureau voor Schimmelcultures (CBS), Utrecht, the Netherlands, as strain CBS 13476. The Mycobank number is MB 815455.

Figure 3.

Budding yeast cells and ascus with an ascospore with a curved end of Sp. girioi sp. nov. on YCB agar supplemented with 0.01% ammonium sulfate, after 5 days at 25°C (a); budding cells of Sp. hagerdaliae sp. nov. on YM agar, after 3 days at 25°C (b) and Sp. gorwiae sp. nov. on YM agar, after 3 days at 25°C (c). Scale bar = 10 μm.

Description of Spathaspora hagerdaliae f.a., sp. nov. Lopes, Morais, Cadete, Fonseca, Kominek, Hittinger, Lachance and Rosa

Spathaspora hagerdaliae (ha.ger.da'li.ae NL. gen. sing. f. n., pertaining to Hahn-Hägerdal, in honor of Bärbel Hahn-Hägerdal, in recognition of her contribution to the study of D-xylose fermentation by yeasts).

On YM agar, after 3 days at 25°C, the cells are spherical to ovoid (2.3–4.3 × 2.3–4.4 μm). Budding is multilateral. On YM agar, after 2 days at 25°C, colonies are grayish-white, mucoid, glistening and flat. On YM agar, after 3 weeks, pseudohyphae are present. Asci or signs of conjugation were not seen on sporulation media (Fig. 3b). Fermentation of D-glucose, D-xylose, sucrose, maltose and D-galactose is positive. Assimilation of the following carbon compounds occurs: D-glucose, D-galactose, maltose, sucrose, cellobiose, trehalose, melezitose, D-xylose, L-arabinose, ethanol, ribitol, D-mannitol, D-glucitol, salicin, DL-lactate, succinate, citrate, hexadecane, xylitol, D-gluconate and N-acetyl-D-glucosamine. No growth occurs on L-sorbose, lactose, melibiose, raffinose, inulin, soluble starch, D-arabinose, D-ribose, L-rhamnose, glycerol, erythritol, galactitol, myo-inositol, methanol, acetone, ethylacetate and isopropanol. Assimilation of nitrogen compounds is as follows: positive for lysine and negative for nitrate and nitrite. Growth in amino-acid-free medium is positive. Growth at 37°C is negative. Growth on YM agar with 10% sodium chloride is negative. Growth in 50% glucose/yeast extract (0.5%) is negative. Acid production is negative. Starch-like compounds are not produced. In 100 μg cycloheximide mL–1, growth is positive. The known habitat is rotting wood in Brazil. The type strain UFMG-CM-Y303T was isolated from rotting wood in the Atlantic Rainforest ecosystem, Bahia state, Brazil. It has been deposited in the Collection of Microorganisms and Cells of Federal University of Minas Gerais (Coleção de Micro-organismos e Células da Universidade Federal de Minas Gerais, UFMG), Belo Horizonte, Minas Gerais, Brazil, as strain UFMG-CM-Y303T, and it is permanently preserved in a metabolically inactive state. An ex-type culture has been deposited in the collection of the Yeast Division of the Centraalbureau voor Schimmelcultures (CBS), Utrecht, the Netherlands, as strain CBS 13475. The Mycobank number is MB 815456.

Description of Spathaspora gorwiae f.a., sp. nov. Lopes, Morais, Cadete, Fonseca, Kominek, Hittinger, Lachance and Rosa

Spathaspora gorwiae (gor.wi'ae NL. gen. sing. f. n., pertaining to Gorwa, in honor of Marie-Francoise Gorwa-Grauslund, in recognition of her contribution to the study of D-xylose fermentation by yeasts).

On YM agar, after 3 days at 25°C, the cells are spherical to ovoid (2.5–4.0 × 3.3–4.3 μm) and occur singly or in pairs. Budding is multilateral. On YM agar, after 2 days at 25°C, colonies are grayish-white, mucoid, glistening and flat. Asci were not seen on sporulation media (Fig. 3c). Fermentation of D-glucose, D-xylose, sucrose and D-galactose is positive. Assimilation of the following carbon compounds occurs: D-glucose, D-galactose, L-sorbose, maltose, sucrose, cellobiose, trehalose, melezitose, D-xylose, L-arabinose, ethanol, ribitol, D-mannitol, D-glucitol, salicin, DL-lactate, succinate, citrate, hexadecane, xylitol, D-gluconate and N-acetyl-D-glucosamine. No growth occurs on lactose, melibiose, raffinose, inulin, soluble starch, D-arabinose, D-ribose, L-rhamnose, glycerol, erythritol, galactitol, myo-inositol, methanol, acetone, ethylacetate and isopropanol. Assimilation of nitrogen compounds is as follows: positive for lysine and negative for nitrate and nitrite. Growth in amino-acid-free medium is positive. Growth at 37°C is very weak, and it is negative at 40°C. Growth on YM agar with 10% sodium chloride is negative. Growth in 50% glucose/yeast extract (0.5%) is negative. Acid production is positive. Starch-like compounds are not produced. In 100 μg cycloheximide mL–1, growth is positive. The known habitat is rotting wood in Brazil. The type strain UFMG-CM-Y312T was isolated from rotting wood in the Atlantic Rainforest ecosystem, Bahia state, Brazil. It has been deposited in the Collection of Microorganisms and Cells of Federal University of Minas Gerais (Coleção de Micro-organismos e Células da Universidade Federal de Minas Gerais, UFMG), Belo Horizonte, Minas Gerais, Brazil, as strain UFMG-CM-Y312T, and it is permanently preserved in a metabolically inactive state. An ex-type culture has been deposited in the collection of the Yeast Division of the Centraalbureau voor Schimmelcultures (CBS), Utrecht, the Netherlands, as strain CBS 13472. The Mycobank number is MB 815458.

SUPPLEMENTARY DATA

Supplementary Data.

ACKNOWLEDGEMENTS

We thank the University of Wisconsin Biotechnology Center DNA Sequencing Facility for providing Illumina facilities and services, and Xiaofan Zhou, David Peris and Antonis Rokas for early access to iWGS and advice on its utilization.

FUNDING

This work was supported by the (CNPq—Brazil, processes numbers , , and ), , the (FINEP, process ), the (MAL), the ), the (US DOE Office of Science ) and the (Hatch project ). CTH is a Pew Scholar in the Biomedical Sciences and an Alfred Toepfer Faculty Fellow, supported by the Pew Charitable Trusts and the Alexander von Humboldt Foundation, respectively.

Conflict of interest. None declared.

REFERENCES

Altschul
SF
Gish
W
Miller
W
et al. 
Basic local alignment search tool
J Mol Biol
 
1990
21
403
10
Amorim
HV
Lopes
ML
de Castro Oliveira
JV
et al. 
Scientific challenges of bioethanol production in Brazil
Appl Microbiol Biot
 
2011
91
1267
75
Bankevich
A
Nurk
S
Antipov
D
et al. 
SPAdes: A new genome assembly algorithm and its applications to single-cell sequencing
J Comput Biol
 
2012
19
455
77
Barbosa
AC
Cadete
RM
Gomes
FCO
et al. 
Candida materiae sp. nov., a yeast species isolated from rotting wood in the Atlantic Rain Forest
Int J Syst Evol Micr
 
2009
59
2104
6
Bolger
AM
Lohse
M
Usadel
B
Trimmomatic: a flexible trimmer for Illumina sequence data
Bioinformatics
 
2014
30
2114
20
Cadete
RM
Fonseca
C
Rosa
CA
Novel yeast strains from Brazilian biodiversity: Biotechnological applications in lignocellulose conversion into biofuels
da Silva
SS
Chandel
AK
Biofuels in Brazil
 
Cham
Springer International Publishing
2014
255
79
Cadete
RM
Melo
MA
Dussán
KJ
et al. 
Diversity and physiological characterization of D-xylose-fermenting yeasts isolated from the Brazilian Amazonian Forest
PLoS One
 
2012
7
e43135
Cadete
RM
Melo
MA
Zilli
JE
et al. 
Spathaspora brasiliensis sp. nov., Spathaspora suhii sp. nov., Spathaspora roraimanensis sp. nov. and Spathaspora xylofermentans sp. nov., four novel D-xylose-fermenting yeast species from Brazilian Amazonian forest
Anton Van Lee
 
2013
103
421
31
Cadete
RM
Santos
RO
Melo
MA
et al. 
Spathaspora arborariae sp. nov., a D-xylose-fermenting yeast species isolated from rotting wood in Brazil
FEMS Yeast Res
 
2009
9
1338
42
Chikhi
R
Medvedev
P
Informed and automated k-mer size selection for genome assembly
Bioinformatics
 
2014
30
31
7
Cunha-Pereira
F
Hickert
LR
Sehnem
NT
et al. 
Conversion of sugars present in rice hull hydrolysates into ethanol by Spathaspora arborariae, Saccharomyces cerevisiae, and their co-fermentations
Bioresource Technol
 
2011
102
4218
25
Daniel
HM
Lachance
MA
Kurtzman
CP
On the reclassification of species assigned to Candida and other anamorphic ascomycetous yeast genera based on phylogenetic circumscription
Anton Van Lee
 
2014
106
67
84
Fonseca
C
Romao
R
Rodrigues de Sousa
H
et al. 
L-Arabinose transport and catabolism in yeast
FEBS J
 
2007
274
3589
600
Gabaldón
T
Martin
T
Marcet-Houben
M
et al. 
Comparative genomics of emerging pathogens in the Candida glabrata clade
BMC Genomics
 
2013
14
623
Gírio
FM
Fonseca
C
Carvalheiro
F
et al. 
Hemicelluloses for fuel ethanol: a review
Bioresource Technol
 
2010
101
4775
800
Gurevich
A
Saveliev
V
Vyahhi
N
et al. 
QUAST: quality assessment tool for genome assemblies
Bioinformatics
 
2013
29
1072
5
Harner
NK
Wen
X
Bajwa
PK
et al. 
Genetic improvement of native xylose-fermenting yeasts for ethanol production
J Ind Microbiol Biot
 
2015
42
1
20
Hickert
LR
Souza-Cruz
PBD
Rosa
CA
et al. 
Simultaneous saccharification and co-fermentation of un-detoxified rice hull hydrolysate by Saccharomyces cerevisiae ICV D254 and Spathaspora arborariae NRRL Y-48658 for the production of ethanol and xylitol
Bioresource Technol
 
2013
143
112
6
Hittinger
CT
Gonçalves
P
Sampaio
JP
et al. 
Remarkably ancient balanced polymorphisms in a multi-locus gene network
Nature
 
2010
464
54
8
Hittinger
CT
Rokas
A
Bai
FY
et al. 
Genomics and the making of yeast biodiversity
Curr Opin Genet Dev
 
2015
35
100
9
Holt
C
Yandell
M
MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects
BMC Bioinformatics
 
2011
12
491
Hou
X
Anaerobic xylose fermentation by Spathaspora passalidarum
Appl Microbiol Biot
 
2012
94
205
14
Hou
X
Yao
S
Improved inhibitor tolerance in xylose-fermenting yeast Spathaspora passalidarum by mutagenesis and protoplast fusion
Appl Microbiol Biot
 
2012
93
2591
601
Katoh
K
Standley
DM
MAFFT multiple sequence alignment software version 7: improvements in performance and usability
Mol Biol Evol
 
2013
30
772
80
Konishi
J
Fukuda
A
Mutaguchi
K
et al. 
Xylose fermentation by Saccharomyces cerevisiae using endogenous xylose-assimilating genes
Biotechnol Lett
 
2015
37
1623
30
Korf
I
Gene finding in novel genomes
BMC Bioinformatics
 
2004
5
59
Kurtzman
CP
Fell
JW
Boekhout
T
et al. 
Methods for isolation phenotypic characterization and maintenance of yeasts
Kurtzman
CP
Fell
JW
Boekhout
T
The Yeasts: A Taxonomic Study
 
Amsterdam
Elsevier Science
2011
87
110
Kurtzman
CP
Robnett
CJ
Identification and phylogeny of ascomycetous yeasts from analysis of nuclear large subunit (26S) ribosomal DNA partial sequences
Anton Van Lee
 
1998
73
331
71
Lachance
MA
In defense of yeast sexual life cycles: the forma asexualis – an informal proposal
Yeast Newsl
 
2012
61
24
5
Lachance
MA
Bowles
JM
Starmer
WT
et al. 
Kodamaea kakaduensis and Candida tolerans, two new ascomycetous yeast species from Australian Hibiscus flowers
Can J Microbiol
 
1999
45
172
7
Latimer
LN
Lee
ME
Medina-Cleghorn
D
et al. 
Employing a combinatorial expression approach to characterize xylose utilization in Saccharomyces cerevisiae
Metab Eng
 
2014
25
20
9
Le
SQ
Gascuel
O
An improved general amino acid replacement matrix
Mol Biol Evol
 
2008
25
1307
20
Lobo
FP
Gonçalves
DL
Alves
SL
et al. 
Draft genome sequence of the D-xylose-fermenting yeast Spathaspora arborariae UFMG-HM19.1AT
Genome Announc
 
2014
2
e01163
13
Long
TM
Su
YK
Headman
J
et al. 
Cofermentation of glucose, xylose, and cellobiose by the beetle-associated yeast Spathaspora passalidarum
Appl Environ Microb
 
2012
78
5492
500
Luo
R
Liu
B
Xie
Y
et al. 
SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler
Gigascience
 
2012
1
18
Martiniano
SE
Philippini
RR
Chandel
AK
et al. 
Evaluation of rice bran extract as a nitrogen source for improved hemicellulosic ethanol production from sugarcane bagasse by new xylose-fermenting yeast strains isolated from Brazilian forests
Sugar Technol
 
2013
16
1
8
Morais
CG
Cadete
RM
Uetanabaro
APT
et al. 
D-xylose-fermenting and xylanase-producing yeast species from rotting wood of two Atlantic rainforest habitats in Brazil
Fungal Genet Biol
 
2013
60
19
28
Moriya
Y
Itoh
M
Okuda
S
et al. 
KAAS: an automatic genome annotation and pathway reconstruction server
Nucleic Acids Res
 
2007
35
W182
5
Nguyen
NH
Suh
SO
Marshall
CJ
et al. 
Morphological and ecological similarities: wood-boring beetles associated with novel xylose-fermenting yeasts, Spathaspora passalidarum gen. sp. nov. and Candida jeffriesii sp. nov
Mycol Res
 
2006
110
1232
41
Nogué
VS
Karhumaa
K
Xylose fermentation as a challenge for commercialization of lignocellulosic fuels and chemicals
Biotechnol Lett
 
2014
37
761
72
Novy
V
Krahulec
S
Wegleiter
M
et al. 
Process intensification through microbial strain evolution: mixed glucose-xylose fermentation in wheat straw hydrolyzates by three generations of recombinant Saccharomyces cerevisiae
Biotechnol Biofuels
 
2014
7
49
O'Donnell
K
Fusarium and its near relatives
Reynolds
DR
Taylor
JW
The Fungal Holomorph: Mitotic, Meiotic and Pleomorphic Speciation in Fungal Systematic
 
Wallingford
CAB International
1993
225
33
Parreiras LS, Breuer RJ, Narasimhan RA et al. Engineering and two-stage evolution of a lignocellulosic hydrolysate-tolerant Saccharomyces cerevisiae strain for anaerobic fermentation of xylose from AFEX pretreated corn stover. PLoS One 2014;9:e107499.
Salichos L, Rokas A. Inferring ancient divergences requires genes with strong phylogenetic signals. Nature 2013;497:327–31.
Simão FA, Waterhouse RM, Ioannidis P et al. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 2015;31:3210–2.
Simpson JT, Durbin R. Efficient de novo assembly of large genomes using compressed data structures. Genome Res 2012;22:549–56.
Simpson JT, Wong K, Jackman SD et al. ABySS: a parallel assembler for short read sequence data. Genome Res 2009;19:1117–23.
Song L, Florea L, Langmead B. Lighter: fast and memory-efficient sequencing error correction without counting. Genome Biol 2014;15:509.
Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 2014;30:1312–3.
Stanke M, Schöffmann O, Morgenstern B et al. Gene prediction in eukaryotes with a generalized hidden Markov model that uses hints from external sources. BMC Bioinformatics 2006;7:62.
Su YK, Willis LB, Jeffries TW. Effects of aeration on growth, ethanol and polyol accumulation by Spathaspora passalidarum NRRL Y-27907 and Scheffersomyces stipitis NRRL Y-7124. Biotechnol Bioeng 2015;112:457–69.
Sudiyani Y, Sembiring KC, Adilina IB. Bioethanol G2: Production process and recent studies. In: Hakeem KR, Jawaid M, Rashid U (eds). Biomass and Bioenergy. Cham: Springer International Publishing, 2014, 345–64.
Tamura K, Stecher G, Peterson D et al. MEGA6: Molecular Evolutionary Genetics Analysis Version 6.0. Mol Biol Evol 2013;30:2725–9.
Ter-Hovhannisyan V, Lomsadze A, Chernoff YO et al. Gene prediction in novel fungal genomes using an ab initio algorithm with unsupervised training. Genome Res 2008;18:1979–90.
Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 1994;22:4673–80.
van der Walt JP, Ferreira NP, Steyn RL. Candida lyxosophila sp. nov., a new D-xylose fermenting yeast from southern Africa. Anton Van Lee 1987;53:93–7.
Wang Y, Ren YC, Zhang ZT et al. Spathaspora allomyrinae sp. nov., a D-xylose-fermenting yeast species isolated from a scarabeid beetle Allomyrina dichotoma. Int J Syst Evol Micr 2016;66:2008–12.
Weisenfeld NI, Yin S, Sharpe T et al. Comprehensive variation discovery in single human genomes. Nat Genet 2014;46:1350–5.
White TJ, Bruns T, Lee S et al. Amplification and direct sequencing of fungal ribosomal RNA genes for phylogenetics. In: Innis MA, Gelfand DH, Sninsky JJ et al. (eds). PCR Protocols: A Guide to Methods and Applications. San Diego: Academic Press, 1990, 315–22.
Wohlbach DJ, Kuo A, Sato TK et al. Comparative genomics of xylose-fermenting fungi for enhanced biofuel production. P Natl Acad Sci USA 2011;108:13212–7.
Wolfe KH, Armisén D, Proux-Wera E et al. Clade- and species-specific features of genome evolution in the Saccharomycetaceae. FEMS Yeast Res 2015;15.
Wong S, Wolfe KH. Birth of a metabolic gene cluster in yeast by adaptive gene relocation. Nat Genet 2005;37:777–82.
Zimin AV, Marçais G, Puiu D et al. The MaSuRCA genome assembler. Bioinformatics 2013;29:2669–77.
Zou X, Peris D, Hittinger CT et al. In silico whole genome sequencer and analyzer (iWGS): a computational pipeline to guide the design and analysis of de novo genome sequencing studies. BioRxiv 2016.

Supplementary data