Abstract

Aspergillus fumigatus is the Aspergillus species most commonly associated with aspergillosis. Among the various presentations of aspergillosis, one of the most frequently observed in pulmonary infections with A. fumigatus is aspergilloma (pulmonary aspergilloma, PA), in which one finds a fungus ball composed of fungal hyphae, inflammatory cells, fibrin, mucus, and tissue debris. Chronic necrotizing pulmonary aspergillosis (CNPA), also known as semi-invasive or slowly progressive aspergillosis, is locally invasive and is predominantly seen in patients with mild immunodeficiency or with a chronic lung disease. In the present study, using a next-generation sequencer, we conducted whole genome sequence (WGS) analyses of 17 strains isolated from patients in Japan with PA and CNPA. A total of 99,088 SNPs were identified by mapping the reads to the A. fumigatus genome reference strain Af293. According to genome-wide phylogenetic analysis, there were no correlations between the whole genome sequence typing results and the pathologic conditions of the patients. This is the first multi-genome WGS study to focus on A. fumigatus strains isolated from patients with PA and CNPA, and it comprehensively characterizes the genetic variations of these strains. The WGS approach will help to better understand the molecular mechanisms of aspergillosis caused by A. fumigatus.

Introduction

Aspergillus fumigatus is a ubiquitous fungus commonly found in soil [1], but it is also a major cause of deep-seated aspergillosis in humans, especially in patients with compromised or suppressed host immunity [2]. Pulmonary infections caused by A. fumigatus have a wide range of clinical presentations [3] but are predominantly seen as the following four types: (1) allergic bronchopulmonary aspergillosis, (2) aspergilloma or pulmonary aspergilloma (PA), (3) chronic necrotizing pulmonary aspergillosis (CNPA), and (4) invasive pulmonary aspergillosis [3,4]. PA involves a fungus ball composed of fungal hyphae, inflammatory cells, fibrin, mucus, and tissue debris, and in most cases changes little over months or years. CNPA, which is also known as semi-invasive or slowly progressive aspergillosis, was first reported by Gefter et al. in 1981 [5] and is mainly seen in patients with mild immunodeficiency or with a chronic lung disease. The progress of the illness is generally slow and chronic, but occasionally aggressive. Some cases of PA progress to CNPA, but the transition mechanism remains unknown.

To date, there have been no comprehensive studies of single nucleotide polymorphisms (SNPs) using whole genome sequencing (WGS) in A. fumigatus strains from patients. To clarify the relationships between genotypes and pathological conditions, and to reinforce the genomic resources of A. fumigatus, we conducted WGS analysis of 17 strains isolated in Japan from patients with PA and CNPA. By mapping the sequencing reads to the genome sequence of the A. fumigatus reference strain Af293, we identified a total of 99,088 positions as SNPs. Although genome-wide phylogenetic analysis separated the 17 strains into three subclades, no associations between the whole genome sequence typing and pathological conditions were found. Through this study, we comprehensively characterized the SNPs of 17 A. fumigatus strains from patients. Although genomic features associated with pathological conditions were not found in the present study, the accumulation of sequence data from pathogenic fungal strains should provide useful information regarding candidate genes associated with aspergillosis and allow a more sophisticated classification of strains.

Materials and methods

Fungal strains

All strains used in this study have been stored and maintained at the Medical Mycology Research Center, Chiba University (IFM strains), in Japan. Of the 17 sequenced strains, eight (IFM 55369, IFM 58026, IFM 58401, IFM 59056, IFM 59359, IFM 59361, IFM 59365, and IFM 59777) were isolated from patients with PA, and the other nine (IFM 58029, IFM 59073, IFM 60514, IFM 61118, IFM 61407, IFM 61578, IFM 61610, IFM 62115, and IFM 62516) were recovered from patients with CNPA (Table 1). The criteria suggested by Denning were used to classify the patients as having PA or CNPA [6]. Patients with PA were characterized by extremely low disease activity without clinical signs or symptoms; CT scans in diagnosed patients showed the presence of fungus balls in an intrathoracic cavity, and no progress of the infection was observed over months or years. Patients with CNPA, on the other hand, were symptomatic, complained of productive cough, and had chest images showing an active inflammatory process. Cavity expansion and paracavitary infiltrates were verified by CT scans in diagnosed patients.

Table 1.

Strain information

Strain ID   Aspergillosis type   Geographical source   Collection date   Medication history profile
IFM 55369   PA                   Ishikawa              2009              NA (a)
IFM 58026   PA                   Tottori               2009              No (b)
IFM 58401   PA                   Chiba                 2009              No
IFM 59056   PA                   Chiba                 2009              No
IFM 59359   PA                   Chiba                 2007              No
IFM 59361   PA                   Chiba                 2009              No
IFM 59365   PA                   Chiba                 2010              No
IFM 59777   PA                   Okayama               2010              No
IFM 58029   CNPA                 Nagasaki              2009              ITCZ (c)
IFM 59073   CNPA                 Chiba                 2009              VRCZ (d)
IFM 60514   CNPA                 Chiba                 2011              No
IFM 61118   CNPA                 Chiba                 2012              VRCZ
IFM 61407   CNPA                 Chiba                 2012              No
IFM 61578   CNPA                 Osaka                 2012              VRCZ
IFM 61610   CNPA                 Tokyo                 2012              ITCZ
IFM 62115   CNPA                 Aichi                 2013              ITCZ
IFM 62516   CNPA                 Tokyo                 2014              ITCZ/MCFG (e)

PA, pulmonary aspergilloma; CNPA, chronic necrotizing pulmonary aspergillosis.

(a) No available information.

(b) No therapy.

(c) Therapy with itraconazole.

(d) Therapy with voriconazole.

(e) Therapy with micafungin.

Whole genome sequencing

Fungi were grown on Potato Dextrose Agar (Difco) at 37°C for 7 days to obtain fully mature conidia, which were then inoculated into Yeast Glucose Broth medium (0.1% yeast extract: Difco, and 1% glucose: Wako) and incubated at 37°C at 200 rpm for 3 days. Mycelia were washed with H2O, and genomic DNA was purified by phenol-chloroform extraction. The Nextera DNA Sample Prep Kit (Illumina) was used to prepare shotgun libraries of the DNA samples for multiplexed, 250-bp paired-end sequencing. The quality of all libraries was assessed with an Agilent 2100 Bioanalyzer (Agilent Technologies) and the Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen). Sequencing was performed on a MiSeq (Illumina) according to the manufacturer's instructions using the MiSeq Reagent Kit v2. Read depths for all strains were 21–50 (Supplementary Table S1). The sequence data (DRA001281) were deposited in DRA (DNA Data Bank of Japan Sequence Read Archive, http://trace.ddbj.nig.ac.jp/dra/index.html).

Sequence analysis

Illumina data sets were trimmed using fastq-mcf in ea-utils (ver. 1.1.2–484); that is, sequencing adapters and sequences with low quality scores (Phred score Q < 30) were removed [7]. The data sets were mapped to the genome sequence of A. fumigatus reference strain Af293 (29,420,142 bp, genome version: s03-m04-r03 [8,9]) using bowtie2 (ver. 2.0.0-beta7) with the “very-sensitive” parameter preset in “end-to-end” mode, which searches for alignments involving all of the read characters [10]. Duplicated reads were removed with Picard (ver. 1.112) (http://picard.sourceforge.net). For SNP calling, SAMtools mpileup (ver. 0.1.19–44428cd) and vcftools view were run for each sample. The mpileup options were set to -q20 to discard reads with low mapping quality and -Q30 to discard bases with low base quality [11]. The vcftools option -c was set to use Bayesian inference. Consensus sites and SNPs were excluded if they did not meet a minimum coverage of 10×, or if the variant was present in less than 90% of the base calls [12,13]. The genotype field in VCF files indicates the probabilities of homozygote and heterozygote as phred-scaled likelihoods. Consensus sites and SNPs were also excluded if their genotype corresponded to a heterozygote. Mapping results were visualized using Integrative Genomics Viewer (ver. 2.3.3) [14,15].
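The three exclusion rules described above (minimum coverage of 10×, variant present in at least 90% of base calls, and no heterozygous genotype) can be sketched as a simple per-site filter. The following is a minimal illustration in Python, not the authors' pipeline: the SiteCall record and its field names are hypothetical, and it assumes the depth, supporting base count, and genotype have already been parsed from the VCF output.

```python
from dataclasses import dataclass

MIN_DEPTH = 10           # minimum coverage (10x), as stated in the text
MIN_VARIANT_FRAC = 0.9   # called base must be in >= 90% of base calls


@dataclass
class SiteCall:
    """One called position (hypothetical record, for illustration only)."""
    position: int
    depth: int           # number of base calls covering the site
    called_count: int    # base calls supporting the called base
    heterozygous: bool   # True if the genotype corresponds to a heterozygote


def passes_filters(call: SiteCall) -> bool:
    """Return True if the site survives all three exclusion rules."""
    if call.depth < MIN_DEPTH:
        return False     # too little coverage (also guards against depth 0)
    if call.called_count / call.depth < MIN_VARIANT_FRAC:
        return False     # called base supported by < 90% of base calls
    if call.heterozygous:
        return False     # heterozygous genotype
    return True


if __name__ == "__main__":
    examples = [
        SiteCall(100, 30, 29, False),  # kept
        SiteCall(101, 9, 9, False),    # dropped: depth below 10
        SiteCall(102, 20, 17, False),  # dropped: 85% support
        SiteCall(103, 20, 20, True),   # dropped: heterozygous
    ]
    for call in examples:
        print(call.position, passes_filters(call))
```

Keeping the thresholds as named constants makes it easy to see how the number of retained sites changes when the coverage or frequency cutoff is varied.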

Genome-wide phylogenetic analysis

The distance matrix of the concatenated sequences at SNP positions was calculated with the DNADIST program in the PHYLIP package (ver. 3.69) using the Kimura 2-parameter model [16]. A phylogenetic tree was constructed using the neighbor-joining method in NJplot [17].
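For readers unfamiliar with the Kimura 2-parameter model, the sketch below computes the textbook closed-form distance d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions between two aligned sequences. This is an illustration only: DNADIST's own computation may differ in detail (for example in how the transition/transversion ratio is handled), and the function name is hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Closed-form Kimura 2-parameter distance between two aligned sequences.

    Assumes equal-length sequences containing only unambiguous bases
    (A, C, G, T). Raises ValueError for empty or unequal-length input,
    and also (from math.log) when the sequences are too divergent for
    the formula to be defined.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")

    transitions = 0
    transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine

    n = len(seq_a)
    p = transitions / n
    q = transversions / n
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    # One transition in ten sites: P = 0.1, Q = 0
    print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

Applying such a function to every pair of concatenated SNP sequences yields the symmetric distance matrix that a neighbor-joining method takes as input.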

RNA-Seq analysis

A. fumigatus strains IFM 59365 (PA) and IFM 62115 (CNPA) were grown in 20 ml of potato dextrose broth at 37°C for 2 days, after which cells were harvested, washed with water, frozen in liquid nitrogen, and stored at −80°C until use. Frozen mycelia were ground using a MagNA Lyser multi-beads shocker (Roche), after which total RNA was extracted (approximately 1 g for each sample) using the RNeasy Mini Kit (Qiagen) and treated with DNaseI (TakaraBio). The TruSeq RNA Sample Prep Kit v2 (Illumina) was used to prepare libraries of the mRNA samples for multiplexed, 50-bp single-end sequencing according to the manufacturer's protocol. The quality of all libraries was assessed with an Agilent 2100 Bioanalyzer. Single-end 50-bp sequencing was performed on a MiSeq; the reads were then mapped to the Af293 genome with tophat2 (v2.0.4) [18], and reads per kilobase of transcript per million mapped reads (RPKM) values were calculated with cufflinks (v2.2.1) [19]. Results were analyzed using the cummeRbund package (2.6.1) [20] in the R programming language [21]. Genes were regarded as having abundant transcripts when their RPKM exceeded the median value, that is, 12.1 for IFM 59365 and 11.4 for IFM 62115.
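As a reminder of what the RPKM unit means, the sketch below applies its basic definition (reads mapped to a gene, divided by the gene length in kilobases and by the total mapped reads in millions) and then selects genes above the median. This is an illustration only: cufflinks estimates abundances with its own statistical model, and the function names and toy values here are hypothetical.

```python
from statistics import median


def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def genes_above_median(rpkm_by_gene: dict) -> tuple:
    """Return (median RPKM, set of genes whose RPKM exceeds that median)."""
    threshold = median(rpkm_by_gene.values())
    above = {gene for gene, value in rpkm_by_gene.items() if value > threshold}
    return threshold, above


if __name__ == "__main__":
    # 500 reads on a 2,000-bp gene in a library of 1,000,000 mapped reads
    print(rpkm(500, 2000, 1_000_000))

    # Toy values with hypothetical gene names
    toy = {"geneA": 1.0, "geneB": 5.0, "geneC": 20.0}
    print(genes_above_median(toy))
```

Using a per-strain median as the cutoff, as in the text, makes the threshold relative to each library rather than an absolute RPKM value.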

Results and discussion

WGS analysis of A. fumigatus strains isolated from patients with PA and CNPA

To elucidate the genomic features of strains isolated from patients with PA and CNPA, 17 pathogenic A. fumigatus strains were selected and sequenced on an Illumina MiSeq next-generation sequencer using a 250-cycle paired-end protocol. All strains were isolated as human pathogens in Japan from 2007 onward (Table 1). Seven patients with PA and two patients with CNPA had received no antifungal therapy, whereas the other seven patients with CNPA had received itraconazole, voriconazole, or micafungin. All raw sequence data were trimmed and mapped to the Af293 genome sequence with bowtie2 (see Materials and methods). The coverage of the Af293 reference sequence ranged from 90.94% (IFM 59359) to 92.90% (IFM 59361) (Supplementary Table S1).

Genome-wide SNPs detection and phylogenetic analysis

To explore genomic differences among the 17 strains, we performed single nucleotide polymorphism (SNP) calling using the genome sequence of strain Af293 as the reference sequence (see Materials and methods). In this study, threshold values for mapping quality, read depth, and variant frequency were set at 20, 10, and 0.9, respectively [12,13]. As a result, the number of SNPs identified per strain ranged from 24,631 to 42,370. Among these, the number of nonsynonymous substitutions ranged from 6,270 to 8,345 (Table 2).

Table 2.

Features of SNPs

Strain ID   No. of SNPs   Intergenic   Intron   UTR     Synonymous   Non-synonymous   Functional RNA
IFM 55369   36,290        15,092       2,533    4,146   6,999        7,499            42
IFM 58026   34,363        14,110       2,439    4,099   6,603        7,101            43
IFM 58401   34,676        14,749       2,496    4,049   6,426        6,944            37
IFM 59056   37,843        16,283       2,717    4,414   6,880        7,514            62
IFM 59359   34,445        13,251       2,572    4,092   6,969        7,538            52
IFM 59361   37,623        16,387       2,583    4,246   6,821        7,558            53
IFM 59365   36,600        15,628       2,622    4,229   6,764        7,344            42
IFM 59777   34,576        14,949       2,397    4,075   6,285        6,838            52
IFM 58029   34,853        14,989       2,428    4,179   6,309        6,922            49
IFM 59073   38,468        17,000       2,615    4,326   6,980        7,512            60
IFM 60514   42,370        18,535       2,998    4,825   7,635        8,345            60
IFM 61118   36,474        15,260       2,613    4,250   6,846        7,493            43
IFM 61407   39,909        16,924       2,832    4,645   7,389        8,099            58
IFM 61578   37,242        15,588       2,666    4,387   7,000        7,577            54
IFM 61610   41,262        18,175       2,888    4,716   7,428        8,031            51
IFM 62115   24,631        7,923        1,916    2,690   5,822        6,270            33
IFM 62516   36,508        15,478       2,531    4,342   6,801        7,335            43

*Some positions have multiple features.
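The ranges quoted in the text, and the note that some positions carry multiple features, can be checked against the counts in Table 2. The following is a minimal Python sketch with values transcribed from the table; the variable and function names are hypothetical, and the feature breakdown is shown for a single strain (IFM 55369) only.

```python
# (strain, total SNPs, non-synonymous SNPs), transcribed from Table 2
TABLE2 = [
    ("IFM 55369", 36290, 7499), ("IFM 58026", 34363, 7101),
    ("IFM 58401", 34676, 6944), ("IFM 59056", 37843, 7514),
    ("IFM 59359", 34445, 7538), ("IFM 59361", 37623, 7558),
    ("IFM 59365", 36600, 7344), ("IFM 59777", 34576, 6838),
    ("IFM 58029", 34853, 6922), ("IFM 59073", 38468, 7512),
    ("IFM 60514", 42370, 8345), ("IFM 61118", 36474, 7493),
    ("IFM 61407", 39909, 8099), ("IFM 61578", 37242, 7577),
    ("IFM 61610", 41262, 8031), ("IFM 62115", 24631, 6270),
    ("IFM 62516", 36508, 7335),
]

# Per-feature counts for IFM 55369, also from Table 2
IFM_55369_TOTAL = 36290
IFM_55369_FEATURES = {
    "intergenic": 15092,
    "intron": 2533,
    "utr": 4146,
    "synonymous": 6999,
    "non_synonymous": 7499,
    "functional_rna": 42,
}


def count_range(rows, column):
    """Return (min, max) of one numeric column across all strains."""
    values = [row[column] for row in rows]
    return min(values), max(values)


total_range = count_range(TABLE2, 1)
nonsyn_range = count_range(TABLE2, 2)

# A feature sum larger than the total means that some positions
# were counted under more than one feature.
excess = sum(IFM_55369_FEATURES.values()) - IFM_55369_TOTAL

if __name__ == "__main__":
    print("total SNPs per strain:", total_range)
    print("non-synonymous SNPs per strain:", nonsyn_range)
    print("IFM 55369 feature sum minus total:", excess)
```

For IFM 55369 the feature counts sum to 21 more than the total number of SNPs, which is consistent with the table note that some positions have multiple features.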

A total of 99,088 positions were identified as SNPs, of which 42,345 positions were called as consensus sites or SNPs in all 17 strains. The other positions were excluded in at least one strain because of low depth, low variant frequency, or a genotype predicted as heterozygote. The concatenated sequences (42,345 positions) were then used for the phylogenetic analysis. Figure 1 shows the phylogenetic tree including the Af293 reference strain. The 17 strains were separated into three subclades, suggesting that the positions we found could be used as markers for distinguishing among A. fumigatus strains. One subclade consisted of IFM 62516 (CNPA), IFM 55369 (PA), IFM 59073 (CNPA), IFM 59777 (PA), and IFM 59361 (PA); a second of IFM 58029 (CNPA), IFM 59056 (PA), IFM 59359 (PA), IFM 58401 (PA), IFM 60514 (CNPA), IFM 61610 (CNPA), IFM 58026 (PA), IFM 59365 (PA), IFM 61407 (CNPA), IFM 61118 (CNPA), and IFM 61578 (CNPA); and the third of IFM 62115 (CNPA). Two subclades (I and II) each contained strains isolated from patients with PA and from patients with CNPA, and no association between the whole genome sequence typing and pathological conditions was found. Likewise, no association between the whole genome sequence typing and medication history profile was found, and we could not find SNPs common to only the PA strains or only the CNPA strains. The phylogenetic analysis using all 42,345 concatenated positions did not differ from that using the 20,891 positions located in coding regions (Supplementary Figure S1). Chazalet et al. showed the absence of a common strain responsible for an invasive aspergillosis outbreak, based on the fingerprinting of more than 700 clinical and environmental A. fumigatus strains [22]. Hadrich et al. indicated that there is no apparent correlation between the genotypes of the isolates and the clinical presentation of the disease in A. flavus [23], which is in agreement with our results.
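The step from the full set of SNP positions to the positions usable in every strain amounts to a set intersection followed by concatenation. The sketch below illustrates this in Python under stated assumptions: each strain is represented as a mapping from (chromosome, position) to the called base at sites that passed filtering, and the strain names, positions, and function names are hypothetical toy values.

```python
def shared_positions(calls_by_strain: dict) -> list:
    """Positions with a passing call in every strain, in sorted order."""
    position_sets = [set(calls) for calls in calls_by_strain.values()]
    if not position_sets:
        return []
    return sorted(set.intersection(*position_sets))


def concatenate(calls_by_strain: dict, positions: list) -> dict:
    """Build one concatenated sequence per strain over the given positions."""
    return {
        strain: "".join(calls[pos] for pos in positions)
        for strain, calls in calls_by_strain.items()
    }


if __name__ == "__main__":
    # Toy data: ("chr1", 20) was filtered out in strain_2,
    # so it is dropped from the concatenated alignment.
    calls = {
        "strain_1": {("chr1", 10): "A", ("chr1", 20): "T", ("chr2", 5): "G"},
        "strain_2": {("chr1", 10): "G", ("chr2", 5): "G"},
        "strain_3": {("chr1", 10): "A", ("chr1", 20): "T", ("chr2", 5): "C"},
    }
    positions = shared_positions(calls)
    print(positions)
    print(concatenate(calls, positions))
```

Because every strain contributes a base at each retained position, the resulting sequences are all the same length and can be compared site by site when computing pairwise distances.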

Figure 1.

Phylogenetic analysis based on the sequences of concatenated 42,345 nucleotides. PA and CNPA correspond to pathological conditions.

SNPs common to all 17 strains

We found that 8,192 SNPs against the Af293 genome sequence were identical in all 17 strains; among these we noted 10 nonsense mutations and 7 substitutions from a stop codon to an amino acid (Table 3). To confirm the expression of these 17 mutated genes, we performed RNA-Seq analyses for IFM 59365 (PA) and IFM 62115 (CNPA), obtaining a total of 3,857,048 and 1,148,545 sequencing reads, respectively. Under our experimental conditions, nine genes were expressed in both strains, whereas seven were not (Table 3).

For the nonsense mutations, transcription of the genes did not terminate around the SNP region and the genes were expressed, although the expression of Afu5g06230 was not confirmed in our data (Table 3A). Morton et al. have reported, using a DNA array, that the expression of Afu5g06230 was down-regulated by co-incubation with human immature dendritic cells [24]. Ser87 is located in the third exon of Afu5g06230, which is encoded by eight exons, suggesting that the third exon might not be a translated region. The expression of Afu8g01260 (549 amino acids [aa]) has also been confirmed by da Silva Ferreira et al. [25], who reported, using microarray analysis, that the expression level of Afu8g01260 was upregulated on treatment with voriconazole. The Arg17* substitution is located at the very beginning of Afu8g01260, indicating that this might not be a true nonsense mutation; rather, the translation start site might be located downstream of residue 17.

Among the genes with substitutions from a stop codon, three were highly expressed, namely Afu5g07650, Afu6g00555, and Afu8g01680, with predicted sizes of 373 aa, 167 aa, and 162 aa, respectively. NCBI blastp searches [26] identified the best-hit proteins of Afu5g07650 and Afu6g00555 as NCBI accession numbers EDP51510 (99% identity) and KEY81336 (98% identity), respectively. Afu5g07650 was transcribed beyond position 1,911,036 on chromosome 5 (Figure 2A). The predicted sequence of Afu5g07650 was similar to that of the F-box domain and ankyrin repeat protein of A. fumigatus A1163 (EDP51510) [27], which has additional residues flanking the C-terminal region of Afu5g07650 (Figure 2B), indicating that the length of Afu5g07650 in the 17 strains would be 373 aa rather than 222 aa. Similarly, the predicted sequence of Afu6g00555 was similar to that of hypothetical protein BA78_8037 in A. fumigatus var. RP-2014 (KEY81336), indicating that the length of Afu6g00555 in the 17 strains would be 167 aa rather than 139 aa. The blastp results showed no candidates with high similarity to the 162-aa sequence of Afu8g01680, suggesting that this gene might be novel in A. fumigatus and that further experimental work is necessary to improve its annotation.

Figure 2.

Afu5g07650 of IFM 59365. (A) The expression profile of Afu5g07650 by RNA-Seq visualized by Integrative Genomics Viewer. The gene was transcribed after 1,911,036 on chromosome 5, where the substitution from stop codon was observed. The reads derived from mRNA covered the region of Afu5g07650. (B) The multiple alignment of amino acid sequences by ClustalW in GenomeNet (http://www.genome.jp/tools/clustalw/). Afu5g07650_IFM59365 was the predicted sequence by substitution of *223Glu. This Figure is reproduced in color in the online version of Medical Mycology.

Table 3.

Genes with nonsense mutation or non-synonymous substitutions from stop codon

(A) Nonsense mutation

Chromosome   Position    Ref/SNP   Gene         Function                                  Amino acid substitution   Protein length   RPKM in IFM 59365 (a)   RPKM in IFM 62115
Chr. 1       407,398     A/T       Afu1g01160   salivary apyrase                          Tyr415*                   586 aa           4.7                     0.9
Chr. 1       467,385     T/G       Afu1g01450   toxin biosynthesis protein                Tyr70*                    421 aa           11.1                    16.2
Chr. 2       3,917,269   A/T       Afu2g14840   glucose oxidase                           Leu5*                     636 aa           17.9                    17.9
Chr. 3       3,885,391   C/T       Afu3g14650   hypothetical protein                      Gln644*                   661 aa           50.0                    22.3
Chr. 4       3,124,978   C/G       Afu4g11830   cell cycle control protein (Cwf19)        Tyr707*                   713 aa           6.5                     4.0
Chr. 5       1,492,516   C/G       Afu5g06230   gaba-specific permease                    Ser87*                    549 aa           2.3                     6.4
Chr. 6       869,122     A/T       Afu6g03930   hypothetical protein                      Leu361*                   361 aa           19.6                    22.5
Chr. 6       2,985,420   C/T       Afu6g11930   hypothetical protein                      Gln80*                    745 aa           0.9                     0.0
Chr. 8       294,936     T/A       Afu8g01260   hypothetical protein                      Arg17*                    549 aa           16.6                    40.1
Chr. 8       457,354     G/A       Afu8g01750   acyl-CoA oxidase                          Gln166*                   630 aa           3.3                     6.3

(B) Substitutions from stop codon

Chromosome   Position    Ref/SNP   Gene         Function                                  Amino acid substitution   RPKM in IFM 59365       RPKM in IFM 62115
Chr. 2       3,555,146   T/C       Afu2g13660   hypothetical protein                      *396Gln                   14.9                    12.1
Chr. 3       2,274,190   A/G       Afu3g08890   hypothetical protein                      *186Arg                   12.4                    18.6
Chr. 5       1,911,036   T/G       Afu5g07650   F-box domain and ankyrin repeat protein   *223Glu                   181.9                   28.4
Chr. 5       3,619,848   C/A       Afu5g13760   hypothetical protein                      *399Tyr                   1.5                     2.7
Chr. 6       138,703     A/G       Afu6g00555                                             *140Arg                   101.0                   109.3
Chr. 7       1,638,016   T/G       Afu7g06730   FAD monooxygenase                         *131Cys                   5.8                     0.0
Chr. 8       444,209     T/A       Afu8g01680   hypothetical protein                      *120Tyr                   60.1                    16.7

(a) RPKM values greater than the median value were indicated in bold text.
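The tally of expressed genes reported in the text can be re-derived from the RPKM values in Table 3. The sketch below is a minimal Python illustration that assumes "expressed" means an RPKM above the per-strain median given in Materials and methods (12.1 for IFM 59365 and 11.4 for IFM 62115); that reading is an assumption, and the names used in the code are hypothetical.

```python
from collections import Counter

# Per-strain median RPKM thresholds stated in Materials and methods
MEDIAN_IFM_59365 = 12.1
MEDIAN_IFM_62115 = 11.4

# gene -> (RPKM in IFM 59365, RPKM in IFM 62115), transcribed from Table 3
TABLE3_RPKM = {
    # (A) nonsense mutations
    "Afu1g01160": (4.7, 0.9),
    "Afu1g01450": (11.1, 16.2),
    "Afu2g14840": (17.9, 17.9),
    "Afu3g14650": (50.0, 22.3),
    "Afu4g11830": (6.5, 4.0),
    "Afu5g06230": (2.3, 6.4),
    "Afu6g03930": (19.6, 22.5),
    "Afu6g11930": (0.9, 0.0),
    "Afu8g01260": (16.6, 40.1),
    "Afu8g01750": (3.3, 6.3),
    # (B) substitutions from stop codon
    "Afu2g13660": (14.9, 12.1),
    "Afu3g08890": (12.4, 18.6),
    "Afu5g07650": (181.9, 28.4),
    "Afu5g13760": (1.5, 2.7),
    "Afu6g00555": (101.0, 109.3),
    "Afu7g06730": (5.8, 0.0),
    "Afu8g01680": (60.1, 16.7),
}


def classify(rpkm_59365: float, rpkm_62115: float) -> str:
    """Label a gene by the number of strains in which it exceeds the median."""
    n_above = int(rpkm_59365 > MEDIAN_IFM_59365) + int(rpkm_62115 > MEDIAN_IFM_62115)
    return ("neither", "one strain", "both")[n_above]


counts = Counter(classify(a, b) for a, b in TABLE3_RPKM.values())

if __name__ == "__main__":
    for label in ("both", "neither", "one strain"):
        print(label, counts[label])
```

Under this assumption the tally gives nine genes above the median in both strains and seven in neither, matching the counts in the text; the remaining gene, Afu1g01450, exceeds the median only in IFM 62115.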

Conclusion

To our knowledge, this is the first attempt to sequence A. fumigatus strains isolated from patients using WGS in order to perform genome-scale comparisons and elucidate SNPs. In the present study, we focused on 17 strains isolated from patients with two forms of pulmonary aspergillosis, namely PA and CNPA. Comprehensive analyses enabled us to identify a total of 99,088 SNPs, and the phylogenetic analysis using the 42,345 positions called as consensus sites or SNPs in all 17 strains clearly separated these strains into three subclades. We could not, however, see an apparent correlation between the whole genome sequence typing and pathological conditions. Moreover, we found 10 nonsense mutations and 7 substitutions from a stop codon to an amino acid that were common to all 17 strains. The expression of nine of these genes was confirmed under our experimental conditions. Further studies might be necessary to elucidate the expression of seven genes under different experimental conditions. These sequence differences might reflect geographical variation of A. fumigatus across countries, because all strains used in this study were isolated in Japan.

The comprehensive SNP data obtained by WGS analysis will help to better understand Aspergillus species. These SNPs would be useful for improving gene annotation when integrated with further experimental work, such as RNA-Seq analysis. Although the molecular mechanisms of PA and CNPA remain to be clarified, and current infection models in mice cannot mimic these pathologies in the lung, these sequence data would be useful for studying the molecular mechanisms and may provide important hints regarding pathological conditions and therapy for aspergillosis. Further accumulation of SNP data will be necessary for unraveling the molecular mechanisms of aspergillosis.

This work has been partly supported by the Ministry of Education, Culture, Sports, Science and Technology (MEXT) Special Budget for Research Projects: The Project on Controlling Aspergillosis and the Related Emerging Mycoses, the Cooperative Research Grant of NEKKEN, and Takeda Science Foundation. It has also been partly supported by the National BioResource Project–Pathogenic Microbes funded by MEXT, Japan (http://www.nbrp.jp/).

Author Contributions

Conceived and designed the experiments: AW, SK, KK, TG, HT. Performed the experiments: ATN, YM, DH, KS. Analyzed the data: ATN, YM, HT. Wrote the paper: ATN, DH, TT, TG, HT. All authors have read and approved the manuscript.

Declaration of interest

The authors report no conflicts of interest. The authors alone are responsible for the content and the writing of the paper.

Supplementary material

Supplementary material is available at Medical Mycology online (http://www.mmy.oxfordjournals.org/).

References

1. Thom C, Raper KB. A Manual of the Aspergilli. Baltimore: Williams & Wilkins; 1945.
2. Bodey GP, Vartivarian S. Aspergillosis. Eur J Clin Microbiol Infect Dis. 1989;8(5):413-437.
3. Kousha M, Tadi R, Soubani AO. Pulmonary aspergillosis: a clinical review. Eur Respir Rev. 2011;20(121):156-174.
4. Greene R. The pulmonary aspergilloses: three distinct entities or a spectrum of disease. Radiology. 1981;140(2):527-530.
5. Gefter WB, Weingrad TR, Epstein DM, et al. “Semi-invasive” pulmonary aspergillosis: a new look at the spectrum of Aspergillus infections of the lung. Radiology. 1981;140(2):313-321.
6. Denning DW. Chronic forms of pulmonary aspergillosis. Clin Microbiol Infect. 2001;7(Suppl 2):25-31.
7. Aronesty E. Comparison of sequencing utility programs. Open Bioinformat J. 2013;7(3):1-8.
8. Arnaud MB, Cerqueira GC, Inglis DO, et al. The Aspergillus Genome Database (AspGD): recent developments in comprehensive multispecies curation, comparative genomics and community resources. Nucleic Acids Res. 2012;40(Database issue):D653-659.
9. Nierman WC, Pain A, Anderson MJ, et al. Genomic sequence of the pathogenic and allergenic filamentous fungus Aspergillus fumigatus. Nature. 2005;438(7071):1151-1156.
10. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9(4):357-359.
11. Li H, Handsaker B, Wysoker A, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25(16):2078-2079.
12. Gillece JD, Schupp JM, Balajee SA, et al. Whole genome sequence analysis of Cryptococcus gattii from the Pacific Northwest reveals unexpected diversity. PLoS One. 2011;6(12):e28550.
13. Holt KE, Teo YY, Li H, et al. Detecting SNPs and estimating allele frequencies in clonal bacterial populations by sequencing pooled DNA. Bioinformatics. 2009;25(16):2074-2075.
14. Thorvaldsdóttir H, Robinson JT, Mesirov JP. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform. 2013;14(2):178-192.
15. Robinson JT, Thorvaldsdóttir H, Winckler W, et al. Integrative genomics viewer. Nat Biotechnol. 2011;29(1):24-26.
16. Felsenstein J. PHYLIP (Phylogeny Inference Package), version 3.6. University of Washington: Department of Genome Sciences; 2005.
17. Perrière G, Gouy M. WWW-query: an on-line retrieval system for biological sequence banks. Biochimie. 1996;78(5):364-369.
18. Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009;25(9):1105-1111.
19. Trapnell C, Williams BA, Pertea G, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol. 2010;28(5):511-515.
20. Goff L, Trapnell C, Kelley D. CummeRbund: analysis, exploration, manipulation, and visualization of Cufflinks high-throughput sequencing data. R package 2.6.1; 2012.
21. R Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2014.
22. Chazalet V, Debeaupuis JP, Sarfati J, et al. Molecular typing of environmental and patient isolates of Aspergillus fumigatus from various hospital settings. J Clin Microbiol. 1998;36(6):1494-1500.
23. Hadrich I, Neji S, Drira I, et al. Microsatellite typing of Aspergillus flavus in patients with various clinical presentations of aspergillosis. Med Mycol. 2013;51(6):586-591.
24. Morton CO, Varga JJ, Hornbach A, et al. The temporal dynamics of differential gene expression in Aspergillus fumigatus interacting with human immature dendritic cells in vitro. PLoS One. 2011;6(1):e16016.
25. da Silva Ferreira ME, Malavazi I, Savoldi M, et al. Transcriptome analysis of Aspergillus fumigatus exposed to voriconazole. Curr Genet. 2006;50(1):32-44.
26. Benson DA, Karsch-Mizrachi I, Clark K, et al. GenBank. Nucleic Acids Res. 2012;40(Database issue):D48-53.
27. Fedorova ND, Khaldi N, Joardar VS, et al. Genomic islands in the pathogenic filamentous fungus Aspergillus fumigatus. PLoS Genet. 2008;4(4):e1000046.