Abstract

As soon as whole-genome sequencing entered the scene in the mid-1990s and demonstrated its use in revealing the entire genetic potential of any given microbial organism, the technique revolutionized the way research on pathogens (and in many other fields) was carried out. The ability to perform whole-genome comparisons further transformed the field and allowed scientists to obtain information linking phenotypic dissimilarities among closely related organisms and their underlying genetic mechanisms. Such comparisons have become commonplace in examining strain-to-strain variability, as well as in comparing pathogens with less pathogenic or nonpathogenic near neighbors. In recent years, a bloom of novel sequencing technologies, along with continuous increases in throughput, has inundated the field with various types of massively parallel sequencing data and further transformed comparative genomics research. Here, we review the evolution of comparative genomics, its impact on understanding pathogen evolution and physiology, and the opportunities and challenges presented by next-generation sequencing as applied to pathogen genome comparisons.

COMPARATIVE GENOMICS AND SEQUENCING

Comparative genomics is a broad field, which has moved beyond the simple comparison of two or even a few genomes. The field encompasses global comparative hybridization studies using nonsequencing technologies such as microarrays [1–3], targeted gene (or genomic region) studies that focus on only specific gene(s) or noncoding region(s) [4–8], studies of specific pathways [9–11] and whole-genome sequence alignments. Whole-genome comparisons have become more sophisticated over time, particularly when dealing with many genomes in a single study. Genomic comparisons have also become more complicated as a result of next-generation sequencing (NGS) technologies that produce an enormous amount of data with varied error models and biases. Despite the aforementioned advancements, the goal of comparative genomics remains to identify all differences among genomes, and then elucidate which sequence differences are responsible for phenotypic shifts in organisms, such as virulence differences among pathogens.

Compared with ‘traditional’ genome studies that focus on a single genome per study, comparative genomics provides an additional layer of detail. A single genome can reveal the functional potential encoded within an organism through ‘tried-and-true’ annotation strategies such as BLASTX or HMM searches of protein families (e.g. COG, Pfam, TIGRfam). Comparisons between different pathogen genomes, however, often lead to faster identification of distinct mechanisms underlying pathogenicity. Many types of differences have been observed between pathogens and non- or less-pathogenic relatives, from very large genomic differences, as in the genomic islands found among pathogenic or benign strains within a single species (Escherichia coli) [12, 13], to smaller genomic differences between closely related species of the same genus (e.g. between Yersinia pestis and Yersinia pseudotuberculosis) [14]. While the set of genome sequence differences alone may not provide a conclusive answer as to which sequence(s) are responsible for a specific phenotype, genome comparisons do generate manageable lists of genomic regions and gene candidates for further study.

Studies comparing multiple strains of the same species have also led to the development of the ‘core’ genome and ‘pan’ genome concepts, motivated in part by efforts to understand the entire gene repertoire of a species, and to understand intra-species genomic diversity [15–19]. The set of genes that are found in all genomes (the core genome) within a species diminishes as more genomes from the same species (or genus, etc.) are compared, while the remaining accessory genes are either unique to individual strains or ‘variable’, that is, present in two or more but not all genomes. As an example, Figure 1 shows the Burkholderia pangenome when comparing 25 complete Burkholderia genomes. There are 56 777 genes in this pangenome. The pangenome reflects all genes, including both the core and accessory genomes, and this concept has been expanded beyond strains of the same species to all levels of taxonomy, including the bacterial pangenome [19–21].

Figure 1:

Example pangenome from the Burkholderia genus. Comparative gene content of 25 genomes of Burkholderia. Beginning clockwise from the top of the circle, the outermost circle represents the 56 777 gene family Burkholderia pangenome, all genes (constant wedge width per gene) originating from the genome whose name is indicated outside the circle. Only the unique genes, absent (i.e. no orthologs) from any of the preceding genomes, are displayed (outer circle) such that each wedge indicates a new gene (protein) family. For all 25 inner circles, all genes are shown that match any gene family represented in the outer circles. The first genome is that of B. cenocepacia J2315 and proceeds inward to Burkholderia rhizoxinica. This order is the same clockwise as it is for outermost to innermost circles, and they are also colored/shaded accordingly. Genomes of different strains of the same species are colored/shaded the same, and genes are ordered with respect to position in the chromosomes and in the chromosome order. The small black lines between the outer circle and the J2315 ring are the core genes shared among all genomes.

While the core genome for any species consists of genes that define that species, including all basic housekeeping functions required for survival, those genes appearing only in the accessory genome are more likely to be linked to strain-specific phenotypes. There remains an open question whether the pangenome for any given species is ‘open’ (i.e. will continue to increase as new strains are sequenced [16, 22]), or whether the pangenome is finite [22–24]. It is likely that for most species found in the environment, the pangenome remains open, given the constant opportunity for lateral gene transfer, whereas for some pathogens with limited lifestyles, their pangenome may indeed be essentially finite.
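The core, variable and unique categories described above can be illustrated with a minimal Python sketch. It assumes that ortholog clustering has already been done, so that each strain is represented simply as a set of gene-family identifiers; the strain names, family identifiers and function names are hypothetical and are not taken from any of the cited studies.

```python
from collections import Counter


def partition_pangenome(genomes):
    """Split gene families into core, variable and unique sets.

    `genomes` maps a strain name to the set of gene-family identifiers
    found in that strain (ortholog clustering is assumed to be done).
    """
    counts = Counter(fam for families in genomes.values() for fam in set(families))
    n = len(genomes)
    core = {fam for fam, c in counts.items() if c == n}      # present in every strain
    unique = {fam for fam, c in counts.items() if c == 1}    # present in exactly one strain
    variable = set(counts) - core - unique                   # two or more, but not all
    return {"pan": set(counts), "core": core, "variable": variable, "unique": unique}


def new_families_per_genome(genomes, order):
    """Count the families each genome adds to the growing pangenome, in the given order."""
    seen = set()
    added = []
    for name in order:
        novel = set(genomes[name]) - seen
        added.append((name, len(novel)))
        seen |= novel
    return added


# Toy input: three hypothetical strains and five hypothetical gene families.
toy = {
    "strainA": {"g1", "g2", "g3"},
    "strainB": {"g1", "g2", "g4"},
    "strainC": {"g1", "g5"},
}
parts = partition_pangenome(toy)
growth = new_families_per_genome(toy, ["strainA", "strainB", "strainC"])
```

In this framing, a pangenome would look ‘open’ if the counts returned by `new_families_per_genome` stay above zero as strains are added, and ‘closed’ if they fall to zero. The counts depend on the order in which genomes are added, so in practice one would likely average over many orderings.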

More recently, comparative genomics has been greatly influenced by the introduction and continuous rapid development of NGS. These technologies have displaced traditional Sanger sequencing due to their tremendous throughput (up to billions of reads and hundreds of billions of bases per run) accompanied by substantially lower per-base costs (reviewed in [25, 26]). Application of NGS to whole-genome sequencing and resequencing has enabled the rapid construction of many nearly complete microbial genomes, drafts of eukaryotic genomes and even surveys of communities (metagenomics). As a result, obtaining genomic sequences is now a routine first step in the characterization of any given bacterium of interest. Indeed, there are currently over 1863 complete genomes that have been published with 6916 more in progress (http://www.genomesonline.org/cgi-bin/GOLD/bin/gold.cgi). In this article, we review some of the methods and discoveries of early pathogen genome comparisons, and focus on how current comparative genomics studies have been influenced by NGS technologies.

EVOLUTION OF COMPARATIVE GENOMIC STUDIES

Before the advent of NGS, whole-genome alignment studies dominated the field of comparative genomics. By aligning completed whole genomes, researchers obtained a global survey of all genetic differences including insertions and deletions (indels), single nucleotide polymorphisms (SNPs), as well as information on genome structure with respect to rearrangements (inversions, translocations, large deletions) [14, 27–30]. Using reference sequences as bait, microarrays have been used for comparative genome hybridization studies in order to explore differences within the known genic content found in sequenced strains, which are present or absent among diverse panels of closely related strains [31–34]. For example, a microarray analysis of 1199 chromosomal genes and 92 721 bp of the large virulence plasmid (pO157) in E. coli O157:H7 (STEC O157) revealed 906 SNPs in 523 chromosomal genes and a high level of DNA polymorphisms among the pO157 plasmids [35].

More recently, NGS techniques have allowed researchers to generate multiple draft genome sequences at once, further expanding pangenome studies. One study of the pangenome of six pathogenic isolates of Streptococcus agalactiae concluded that the core genome shared by all isolates accounts for ~80% of any single genome, and by fitting the data to an exponential decay function, calculated a minimal core genome of 1806 genes [16]. Another study of 17 E. coli genomes inferred a similar core genome size of ~2200 genes with an open pangenome numbering approximately 13 000 genes at the time [36]. It was surprising, however, that the E. coli core genome only accounts for ~17% of its pangenome and <50% of the average E. coli genome, which indicates the extreme genomic content variability of this species. Open pangenomes suggest that the species continue to actively evolve by gene acquisition, loss and diversification. In contrast, a study of Bacillus anthracis genomes concluded that B. anthracis may have a closed pangenome since the number of unique genes added to the pangenome became zero after the addition of only a fourth strain [16]. Unless this observation was simply due to an evolutionarily biased sampling of strains, this closed pangenome would suggest that the species has very limited opportunity for lateral gene acquisition, likely due to a restricted niche and lifestyle [17]. In this context, the definition and boundaries of bacterial species and the evolutionary age of the lineage in question would contribute significantly to the observation of a closed or open pangenome.
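As an illustration of how a minimal core genome might be extrapolated, the sketch below fits core-genome size against the number of genomes compared using the form core(n) = kappa * exp(-n / tau) + omega, where omega is the asymptotic (minimal) core size. This parameterization, the grid scan over tau and the function name are assumptions made for illustration; they are not claimed to reproduce the fitting procedure of the cited studies, and the data in the usage example are synthetic.

```python
import math


def fit_core_decay(n_genomes, core_sizes, taus=None):
    """Fit core(n) = kappa * exp(-n / tau) + omega by scanning tau.

    For a fixed tau the model is linear in kappa and omega, so those two
    are solved by ordinary least squares; tau is picked from a grid as the
    value with the smallest sum of squared errors.
    Returns (kappa, tau, omega, sse).
    """
    if taus is None:
        taus = [0.1 * k for k in range(1, 201)]  # assumed grid: 0.1 to 20.0
    m = len(n_genomes)
    mean_y = sum(core_sizes) / m
    best = None
    for tau in taus:
        x = [math.exp(-n / tau) for n in n_genomes]
        mean_x = sum(x) / m
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        if sxx == 0:
            continue  # degenerate design, cannot estimate a slope
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, core_sizes))
        kappa = sxy / sxx
        omega = mean_y - kappa * mean_x
        sse = sum((kappa * xi + omega - yi) ** 2 for xi, yi in zip(x, core_sizes))
        if best is None or sse < best[3]:
            best = (kappa, tau, omega, sse)
    return best


# Synthetic data generated from known parameters, only to exercise the fit.
ns = list(range(1, 9))
ys = [600 * math.exp(-n / 2.0) + 1800 for n in ns]
kappa, tau, omega, sse = fit_core_decay(ns, ys)
```

The ~17% figure quoted above follows directly from the two reported sizes: 2200 / 13 000 ≈ 0.17.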

Whereas global gene content analysis (as in pangenome studies) provides insight into differences in functional potential and possible phenotypic differences among organisms, analyses of specific core genes have also been used for phylogenetic diversity studies. Because many pathogens are so closely related, SNPs in conserved core genes (ranging from a single gene to the core genome) have been used to discriminate and infer phylogenetic relationships between closely related pathogenic strains [37–41]. Such studies have helped to place pathogenic strains precisely onto phylogenetic trees, and to estimate the amount of time since a clonal pathogenic lineage diverged from its most recent common ancestor [29, 42–44].

The use of NGS for drafting many genomes can now quickly uncover SNPs, insertions and deletions by mapping unassembled reads against a well-annotated reference genome, and thus provide a list of possible gene differences that may be the basis for any functional variation among strains. However, such read-mapping strategies require a reference sequence highly similar to the one(s) being sequenced, such as when studying panels of pathogenic isolates of the same species. While mapping of NGS reads can also be used for studying pathogens present within metagenomic samples, a gene-based functional analysis of ortholog presence/absence is normally carried out for comparisons of more distantly related organisms. This analysis is typically performed after assembly into contigs, followed by gene-finding and annotation. While SNPs and indels have been known to inactivate genes involved in virulence, acquisition of novel functions has also been shown to be involved in bacterial pathogenicity [45].
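To make the notion of a difference list concrete, the following sketch reports SNPs and single-column indels from two sequences that have already been aligned (gaps written as `-`). It is a toy illustration only: it is not a read mapper or variant caller and does not model sequencing error or coverage, which the dedicated tools listed in Table 1 are designed to handle. The function name and the example sequences are hypothetical.

```python
def list_differences(ref_aln, query_aln):
    """Report SNPs and indels from two gapped, equal-length aligned sequences.

    Positions are 1-based reference coordinates. For an insertion relative
    to the reference, the position is that of the preceding reference base.
    Each gap column is reported separately (adjacent gaps are not merged).
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned sequences must have equal length")
    diffs = []
    ref_pos = 0
    for r, q in zip(ref_aln.upper(), query_aln.upper()):
        if r != "-":
            ref_pos += 1  # advance only when the column holds a reference base
        if r == q:
            continue
        if r == "-":
            diffs.append(("insertion", ref_pos, "-", q))
        elif q == "-":
            diffs.append(("deletion", ref_pos, r, "-"))
        else:
            diffs.append(("SNP", ref_pos, r, q))
    return diffs


# Hypothetical aligned pair: one SNP, one inserted base, one deleted base.
example = list_differences("ACGT-ACGT", "ACCTTAC-T")
```

Intersecting such positions with gene coordinates from a well-annotated reference would give the list of candidate gene differences mentioned above.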

VIRULENCE GENES, PATHOGENICITY AND EPIDEMIOLOGY

As mentioned above, bacteriophage and plasmids are common vehicles of lateral gene transfer and in some cases are main contributors to virulence, toxicity and antibiotic resistance. For example, Perna et al. [12] compared the genome of the hemorrhagic E. coli O157:H7 to the nonpathogenic laboratory K-12 strain in order to identify candidate genes responsible for pathogenesis and suggested phage-mediated lateral gene transfer as the main contributor to virulence. However, it is important to point out that the results of any given set of analyses should be interpreted within the context of the comparison, as evidenced 7 years later by the extreme genomic content variability of the species when more genomes are compared [36]. Other examples of the role of mobile elements in virulence include the toxin-carrying plasmids of B. anthracis [46], plasmids required for virulence in E. coli [47] and novel plasmids found in certain strains of Y. pestis that confer resistance to several antibiotics and may be actively transferred in the environment even among different species [48]. Analysis of the toxin cluster genes from Clostridium butyricum [49] showed that many recombination events have occurred, including several events within the Clostridium botulinum nontoxic-nonhemagglutinin (ntnh) gene. One such recombination event resulted in the integration of the C. botulinum neurotoxin serotype A subtype 1 (BoNT/A1) into the serotype toxin B ntnh multigene cluster, resulting in a successful lineage commonly associated with foodborne botulism outbreaks. A separate study has suggested a mechanism of virulence evolution in actinobacteria that is based on the co-option of existing core actinobacterial traits, triggered by key host niche–adaptive lateral gene transfer events [50].

More subtle variations in the genome may also contribute to pathogenicity, including mutations that change protein sequences or affect protein abundance. Protein abundance may be influenced by any number of changes that do not directly alter the protein sequence, including mutations in: (i) transcription factors, (ii) ribosomal gene sequences and (iii) noncoding sequences that regulate the expression of the gene. In addition, Kudla et al. [51] showed that although synonymous mutations do not alter the encoded protein, even these can influence gene expression as much as 250-fold suggesting that, like nonsynonymous mutations, synonymous mutations may also lead to differences in pathogenicity in rare cases. A detailed review of the effects of synonymous mutations can be found in Ref. [52]. It is thus important to catalog all differences in genome comparisons in order to identify the underlying mechanism(s) involved in virulence.
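The distinction between synonymous and nonsynonymous mutations can be made concrete with a short sketch that classifies a single-base substitution in a coding sequence using the standard genetic code. The function name, the 0-based in-frame coordinate convention and the example sequence are assumptions made for illustration. The classification says nothing about expression-level effects of the kind reported by Kudla et al. [51], which cannot be read from the codon table.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_substitution(cds, pos, alt_base):
    """Classify a single-base change in a coding sequence.

    `cds` is assumed to start in frame at index 0; `pos` is a 0-based index.
    Returns (kind, ref_codon, alt_codon, ref_aa, alt_aa), where kind is
    'synonymous', 'nonsense' or 'nonsynonymous'.
    """
    cds = cds.upper()
    start = pos - pos % 3
    ref_codon = cds[start:start + 3]
    if len(ref_codon) < 3:
        raise ValueError("position falls in an incomplete codon")
    offset = pos - start
    alt_codon = ref_codon[:offset] + alt_base.upper() + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        kind = "synonymous"
    elif alt_aa == "*":
        kind = "nonsense"       # introduces a premature stop codon
    else:
        kind = "nonsynonymous"  # changes the encoded residue
    return kind, ref_codon, alt_codon, ref_aa, alt_aa


# Hypothetical 4-codon CDS: ATG GCT TGG TAA
cds = "ATGGCTTGGTAA"
silent = classify_substitution(cds, 5, "C")    # GCT -> GCC
stop = classify_substitution(cds, 8, "A")      # TGG -> TGA
missense = classify_substitution(cds, 3, "A")  # GCT -> ACT
```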

In addition to providing a convenient method to identify genomic changes and lateral gene transfer events related to virulence and pathogenicity, comparative genomics has been successfully applied in vaccine development [53–55]. Compared with traditional protein-based vaccines, which may be effective for only a small subset of a pathogen species (e.g. strains of flu found in one geographic region), highly desirable universal vaccines may be developed based on comparative genomic studies. The design of a universal vaccine requires identification of surface proteins with epitope(s) shared among all strains in the same pathogen species. For example, a universal vaccine for group B Streptococcus (GBS) has been developed recently based on screening the core genome of eight GBS genomes for surface-exposed proteins [51].

Whereas comparative genomics has certainly led to advances in understanding the genetic underpinnings of virulence and aided in the development of universal vaccines for some species, other studies have focused more on determining the prevalence of pathogenic strains within natural reservoirs such as the human gastrointestinal tract (reviewed in [56]). A metagenomics approach coupled with a functional screen for targeting putative virulence, antibiotic resistance or other functions has also been used with complex samples. An example of this kind of study has been performed by Sommer et al. [57] to functionally characterize the antibiotic resistance reservoir in the microbial flora of healthy individuals. They cloned metagenomic DNA into an E. coli expression vector and selected antibiotic-resistant clones to better understand the immense diversity of antibiotic resistance machinery that has not yet been found using traditional methods.

Complementary to studies investigating natural reservoirs of pathogens or of virulence genes, broad-scale epidemiologic surveys of pathogens have also been a focus of comparative genomic research. For instance, the genomes of 6449 human and avian influenza isolates have been completely sequenced and made publicly available (see the National Institutes of Health's Influenza Genome Sequencing Project: http://www.niaid.nih.gov/labsandresources/resources/dmid/gsc/influenza/). While such large-scale comparisons are currently only feasible for viruses, targeted sequencing of fast-evolving markers such as variable number tandem repeats (VNTR) has been used for some time as an epidemiological tool in pathogenic bacteria [44, 58, 59]. Price et al. [60] also used multilocus variable-number tandem repeat analysis (MLVA) to study the short-term evolution of B. pseudomallei isolates cultured from multiple body sites from patients and found that substantial divergence from the putative founder genotype had occurred even in a short period of infection in all patients.

COMPUTATIONAL CHALLENGES IN UTILIZING GENOMIC INFORMATION

The different data types generated by today's NGS technologies have required the implementation of novel algorithms that scale with the millions (to billions) of short read sequences (Table 1). These include recent efforts in fast alignment algorithms that map reads to reference genome(s) (e.g. Refs. [61, 62]), as well as in genome assembly and analysis (reviewed in [63]). Assembly of NGS data generally produces draft genomes with 10s (for simple bacteria) to 1000s (metagenomes and complex large eukaryotes) of contigs that can be compared with reference genomes in much the same way as whole-genome comparative studies [64, 65].

Table 1:

Some selected bioinformatics software and algorithms used with NGS data

Category                               Software/algorithm   References
Indel and SNP detection and analysis   GATK toolkit         [66]
                                       SOAPsnp              http://soap.genomics.org.cn/soapsnp.html
                                       SsahaSNP             http://www.sanger.ac.uk/resources/software/ssahasnp
                                       SAMtools             [67]
                                       SNAP                 http://www.hiv.lanl.gov/content/sequence/SNAP/SNAP.html
Gene detection                         Prodigal             [68]
                                       Glimmer3             [69]
                                       GeneMark             [70]
                                       MetaGene             [71]
                                       tRNAscan-SE          [72]
Structural variation detection         BreakDancer          [73]
                                       VariationHunter      [74]
                                       SVDetect             [75]
                                       AGE                  [76]
                                       Mauve                [77]
Genome annotation                      RAST                 [78]
                                       IMG ER               [79]
                                       Ergatis              [80]
                                       DIYA                 [81]
                                       CloVR                http://clovr.org/
                                       RATT                 [82]
Genome/sequence visualization          IGV                  [83]
                                       SeqMonk              http://www.bioinformatics.bbsrc.ac.uk/projects/seqmonk
                                       GBrowse              http://gmod.org/wiki/GBrowse
                                       Tablet               [84]
                                       ACT                  [85]
RNA-seq assembly                       Cufflinks            [86]
                                       Rnnotator            [87]
                                       Oases                http://www.ebi.ac.uk/~zerbino/oases
                                       Trans-ABySS          [88]
NGS assembly                           RAY                  [89]
                                       Velvet               [90]
                                       SOAPdenovo           [91]
                                       Newbler              [92]
                                       ABySS                [93]
Mapping/alignment                      BWA                  [62]
                                       bowtie               [61]
                                       Novoalign            [94]
                                       SOAP                 [95]
                                       MrsFAST              [96]
                                       CloudBurst           [97]
                                       BFAST                [98]
                                       MUMmer               [99]
                                       MOSAIK               http://bioinformatics.bc.edu/marthlab/Mosaik

For most genomes, NGS-based draft sequencing is relatively inexpensive and easy compared with the expense and difficulty of complete genome closure [100]. As a result, a larger proportion of studies with NGS include many strains of the same species to take advantage of reference genome mapping [101–103]. To provide adequate references for sequencing targets and for better understanding community composition in metagenomes, a number of efforts have begun to sample and sequence references, such as the Genomic Encyclopedia of Bacteria and Archaea [104] and the human microbiome effort ([20], http://www.hmpdacc-resources.org/cgi-bin/img_hmp/main.cgi). Despite these endeavors, characterization of complex samples for biosurveillance remains difficult because the sheer diversity of microbes means that too few reference genomes are available.

Since our last review of software tools for comparative genomics [65], many new software tools and algorithms have been developed, some simply to deal with the large number of comparisons that would take too long using conventional algorithms, and some to cope with the constantly increasing amount of data by implementing parallelization. Other novel algorithms, such as those for NGS assembly, require vast amounts of memory and are accommodated only by a handful of highly specific hardware options. We have listed in Table 1 some of the growing number of software tools and algorithms that attempt to efficiently handle the ever-increasing amounts of data from NGS comparative genomics studies. In addition, the various flavors of comparative genomics each require their own set(s) of bioinformatics tools that are tailored to the specific goal(s) of the project. Standardization of these postprocessing programs has been difficult due to the varied nature and goals of projects, as well as the continued changes in types and amounts of NGS data.

Last, challenges remain with respect to visualization and integration of ‘omics’ data. Development or improvement of tools that enable the visualization of many genomes together with the underlying sequencing data is needed to enable annotation of gene function and track observed differences in those genes among genome sequences. Ideally, these same tools would incorporate multi-omic data such as transcriptome and metabolome data for simultaneous analysis. Although such an integrated multi-omic analysis platform does not yet exist, some of its components have already been developed. For example, information about metabolic pathways can be found in online databases [105–108]. A list of variable genes within a pangenome could thus be mapped onto metabolic pathways to reveal likely phenotypic differences among strains. Even genome-scale metabolic models have been developed and could be contrasted [109, 110].
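The idea of mapping variable genes onto metabolic pathways can be sketched as a simple lookup. The example assumes a gene-to-pathway annotation table is already available (for instance, exported from one of the pathway databases cited above); the gene identifiers, pathway names and function names below are all hypothetical.

```python
def pathways_by_strain(strain_genes, gene_to_pathways):
    """Map each strain's variable genes onto the pathways they are annotated to.

    `strain_genes` maps strain name -> set of variable gene identifiers.
    `gene_to_pathways` maps gene identifier -> iterable of pathway names;
    genes without an annotation are silently skipped.
    """
    result = {}
    for strain, genes in strain_genes.items():
        hit = set()
        for gene in genes:
            hit.update(gene_to_pathways.get(gene, ()))
        result[strain] = hit
    return result


def differing_pathways(strain_pathways):
    """Return pathways touched by variable genes in some, but not all, strains."""
    sets = list(strain_pathways.values())
    if not sets:
        return set()
    return set.union(*sets) - set.intersection(*sets)


# Hypothetical annotation and variable-gene lists, for illustration only.
annotation = {
    "g2": ["pathway_X"],
    "g3": ["pathway_X", "pathway_Y"],
    "g4": ["pathway_Z"],
}
variable_genes = {"strainA": {"g2", "g3"}, "strainB": {"g2", "g4", "g9"}}
per_strain = pathways_by_strain(variable_genes, annotation)
candidates = differing_pathways(per_strain)
```

Pathways returned by `differing_pathways` would be candidates for phenotypic differences among strains; confirming any such difference would still require experimental or transcriptome and metabolome evidence.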

Unfortunately, comparative genomic studies demand from researchers a computer science fluency that is not commonly found in experimental biologists. Most genomic NGS data processing tools work most efficiently via the command line, rather than via graphical user interfaces (GUIs). Although this often fosters interaction and collaboration among scientists with different backgrounds, this basic knowledge and capability gap may exist for some time. Efforts have been undertaken, however, to make comparative genomics more accessible to experimental biologists. Intuitive workflow management environments with web-based GUIs that essentially allow researchers to drag and drop a multitude of bioinformatics tools into an NGS workflow are being developed and actively maintained (e.g. http://usegalaxy.org, [111]). This, in turn, will help standardize data processing and analysis, along with file formats, which will only further increase the utility of such systems. In Figure 2, we provide an example workflow diagram for processing NGS data and list a number of tools that can be used for such purposes (also listed in Table 1).

Figure 2:

Diagram of possible NGS data processing workflows. When sequencing the genome of a phylogenetically distinct organism, one first assembles the data, followed by annotation and then comparative analysis with other genome(s) based on ortholog analysis. When a reference genome is available, more options are available for comparative analysis and often more detailed analyses are possible, including SNP discovery. Some of the tools commonly used for analysis steps are listed and can be found in Table 1.

FUTURE OUTLOOK

With the rapid and continued development of NGS technologies, new comparative genomic methods and algorithms are now being employed to study pathogens. Whole-genome or contig alignments, together with read-mapping strategies, are commonly used to identify minute differences among isolates of a clonal lineage, including SNPs, indels and even rearrangements. Despite methodological differences due to the sequencing technologies, the goals of comparative genomics remain the same: (i) to identify all genomic differences among organisms by alignment of their genome sequences [112]; (ii) to understand the evolutionary history of the genome(s) based on the differences [113]; and (iii) to predict functional and phenotypic differences among organisms based on the genomic content and comparisons.

Realizing the potential of genomic comparisons will likely require the integration of various other omic data and metadata to begin associating gene expression, metabolic function and other phenotypes together with detailed information on the organism and its genome. Though not reviewed here, NGS has also been used for RNA sequencing (RNAseq) in order to assess global gene expression profiles [114–117]. For example, using RNAseq, a number of potential B. cenocepacia genes, including novel small noncoding RNAs, were found as candidates for contributing to the colonization of the lungs of immunocompromised patients with cystic fibrosis [118]. These types of comparative studies are becoming more common, and will require efficient tools for integrating expression data with genomic data and metadata, as well as tools that allow adequate visualization of NGS data together with nongenomic information.

Just as whole-genome sequencing changed the way biologists approach their research, this new NGS genomic era has revolutionized the way basic experiments are conducted. NGS has enabled a more comprehensive examination of human pathogens, their core genome, their evolutionary history, their mechanisms of virulence and their natural reservoirs, as well as potential reservoirs of additional virulence, toxin or antibiotic resistance genes. As the number of genome sequences continues to grow, so does the opportunity to compare chromosomal sequences and bacterial function across a wider taxonomic and phenotypic spectrum. This burgeoning field of comparative metagenomics will certainly impact the field of microbial pathogenesis and host–pathogen interactions profoundly, and promises tremendous insights into the response and interaction of the human microbiome once humans are infected with pathogens.

Key Points

  • NGS throughput has greatly influenced comparative genomics and algorithm development but is beginning to outpace hardware/software improvements.

  • Studies comparing multiple strains of the same species have led to the development of the ‘core’ genome and ‘pan’ genome concepts, which capture the spirit of a ‘species’ genome.

  • Metagenomics holds great future promise to understand pathogen populations, their evolution and their interaction with other members of a given community.

  • Development of new analysis tools that enable visualization and integration of ‘omics’ data is needed.

FUNDING

This study was supported in part by Los Alamos National Laboratory Laboratory-Directed Research and Development grants (numbers 20100034DR and 20110051DR); the US Department of Energy Joint Genome Institute through the Office of Science of the US Department of Energy (under Contract No. DE-AC02-05CH11231); the US Defense Threat Reduction Agency (contract numbers B104153I and B084531I).

References

1
Willenbrock
H
Hallin
PF
Wassenaar
TM
, et al.  . 
Characterization of probiotic Escherichia coli isolates with a novel pan-genome microarray
Genome Biol
 , 
2007
, vol. 
8
 pg. 
R267
 
2
You
YH
Wang
P
Wang
YH
, et al.  . 
Assessment of comparative genomic hybridization experiment by an in situ synthesized CombiMatrix microarray with Yersinia pestis vaccine strain EV76 DNA
Biomed Environ Sci
 , 
2010
, vol. 
23
 (pg. 
384
-
90
)
3
Gresham
D
Dunham
MJ
Botstein
D
Comparing whole genomes using DNA microarrays
Nat Rev Genet
 , 
2008
, vol. 
9
 (pg. 
291
-
302
)
4
Iyer
LM
Anantharaman
V
Wolf
MY
, et al.  . 
Comparative genomics of transcription factors and chromatin proteins in parasitic protists and other eukaryotes
Int J Parasitol
 , 
2008
, vol. 
38
 (pg. 
1
-
31
)
5
Sone
T
Kasahara
K
Kimura
H
, et al.  . 
Comparative analysis of epidermal growth factor receptor mutations and gene amplification as predictors of gefitinib efficacy in Japanese patients with nonsmall cell lung cancer
Cancer
 , 
2007
, vol. 
109
 (pg. 
1836
-
44
)
6
Hay
CW
Docherty
K
Comparative analysis of insulin gene promoters: implications for diabetes research
Diabetes
 , 
2006
, vol. 
55
 (pg. 
3201
-
13
)
7
Rauscher
S
Flamm
C
Mandl
CW
, et al.  . 
Secondary structure of the 3′-noncoding region of flavivirus genomes: comparative analysis of base pairing probabilities
RNA
 , 
1997
, vol. 
3
 (pg. 
779
-
91
)
8
Thomas
JW
Touchman
JW
Blakesley
RW
, et al.  . 
Comparative analyses of multi-species sequences from targeted genomic regions
Nature
 , 
2003
, vol. 
424
 (pg. 
788
-
93
)
9
Burroughs
AM
Iyer
LM
Aravind
L
Comparative genomics and evolutionary trajectories of viral ATP dependent DNA-packaging systems
Genome Dyn
 , 
2007
, vol. 
3
 (pg. 
48
-
65
)
10
Wang
X
Gowik
U
Tang
H
, et al.  . 
Comparative genomic analysis of C4 photosynthetic pathway evolution in grasses
Genome Biol
 , 
2009
, vol. 
10
 pg. 
R68
 
11. Yang C, Rodionov DA, Li X, et al. Comparative genomics and experimental characterization of N-acetylglucosamine utilization pathway of Shewanella oneidensis. J Biol Chem, 2006, vol. 281 (pg. 29872-85)
12. Perna NT, Plunkett G 3rd, Burland V, et al. Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature, 2001, vol. 409 (pg. 529-33)
13. Dobrindt U, Hochhut B, Hentschel U, et al. Genomic islands in pathogenic and environmental microorganisms. Nat Rev Microbiol, 2004, vol. 2 (pg. 414-24)
14. Chain PS, Carniel E, Larimer FW, et al. Insights into the evolution of Yersinia pestis through whole-genome comparison with Yersinia pseudotuberculosis. Proc Natl Acad Sci USA, 2004, vol. 101 (pg. 13826-31)
15. Medini D, Donati C, Tettelin H, et al. The microbial pan-genome. Curr Opin Genet Dev, 2005, vol. 15 (pg. 589-94)
16. Tettelin H, Masignani V, Cieslewicz MJ, et al. Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: implications for the microbial “pan-genome”. Proc Natl Acad Sci USA, 2005, vol. 102 (pg. 13950-5)
17. Bentley S. Sequencing the species pan-genome. Nat Rev Microbiol, 2009, vol. 7 (pg. 258-9)
18. Li R, Li Y, Zheng H, et al. Building the sequence map of the human pan-genome. Nat Biotechnol, 2010, vol. 28 (pg. 57-63)
19. Lapierre P, Gogarten JP. Estimating the size of the bacterial pan-genome. Trends Genet, 2009, vol. 25 (pg. 107-110)
20. Nelson KE, Weinstock GM, Highlander SK, et al. A catalog of reference genomes from the human microbiome. Science, 2010, vol. 328 (pg. 994-9)
21. Tettelin H, Riley D, Cattuto C, et al. Comparative genomics: the bacterial pan-genome. Curr Opin Microbiol, 2008, vol. 11 (pg. 472-7)
22. Donati C, Hiller NL, Tettelin H, et al. Structure and dynamics of the pan-genome of Streptococcus pneumoniae and closely related species. Genome Biol, 2010, vol. 11 pg. R107
23. Lefebure T, Bitar PD, Suzuki H, et al. Evolutionary dynamics of complete Campylobacter pan-genomes and the bacterial species concept. Genome Biol Evol, 2010, vol. 2 (pg. 646-55)
24. Phillippy AM, Deng X, Zhang W, et al. Efficient oligonucleotide probe selection for pan-genomic tiling arrays. BMC Bioinformatics, 2009, vol. 10 pg. 293
25. Mardis ER. Next-generation DNA sequencing methods. Annu Rev Genomics Hum Genet, 2008, vol. 9 (pg. 387-402)
26. Shendure JA, Porreca GJ, Church GM. Overview of DNA sequencing strategies. Curr Protoc Mol Biol, 2008, Chapter 7:Unit 7.1
27. Deng W, Burland V, Plunkett G 3rd, et al. Genome sequence of Yersinia pestis KIM. J Bacteriol, 2002, vol. 184 (pg. 4601-11)
28. Chain PS, Comerci DJ, Tolmasky ME, et al. Whole-genome analyses of speciation events in pathogenic Brucellae. Infect Immun, 2005, vol. 73 (pg. 8353-61)
29. Chain PS, Hu P, Malfatti SA, et al. Complete genome sequence of Yersinia pestis strains Antiqua and Nepal516: evidence of gene reduction in an emerging pathogen. J Bacteriol, 2006, vol. 188 (pg. 4453-63)
30. Eisen JA, Heidelberg JF, White O, et al. Evidence for symmetric chromosomal inversions around the replication origin in bacteria. Genome Biol, 2000, vol. 1 pg. RESEARCH0011
31. Read TD, Peterson SN, Tourasse N, et al. The genome sequence of Bacillus anthracis Ames and comparison to closely related bacteria. Nature, 2003, vol. 423 (pg. 81-6)
32. Cassat JE, Dunman PM, McAleese F, et al. Comparative genomics of Staphylococcus aureus musculoskeletal isolates. J Bacteriol, 2005, vol. 187 (pg. 576-92)
33. Yao Y, Sturdevant DE, Villaruz A, et al. Factors characterizing Staphylococcus epidermidis invasiveness determined by comparative genomics. Infect Immun, 2005, vol. 73 (pg. 1856-60)
34. Dziejman M, Balon E, Boyd D, et al. Comparative genomic analysis of Vibrio cholerae: genes that correlate with cholera endemic and pandemic disease. Proc Natl Acad Sci USA, 2002, vol. 99 (pg. 1556-61)
35. Zhang W, Qi W, Albert TJ, et al. Probing genomic diversity and evolution of Escherichia coli O157 by single nucleotide polymorphisms. Genome Res, 2006, vol. 16 (pg. 757-67)
36. Rasko DA, Rosovitz MJ, Myers GS, et al. The pangenome structure of Escherichia coli: comparative genomic analysis of E. coli commensal and pathogenic isolates. J Bacteriol, 2008, vol. 190 (pg. 6881-93)
37. Achtman M, Zurth K, Morelli G, et al. Yersinia pestis, the cause of plague, is a recently emerged clone of Yersinia pseudotuberculosis. Proc Natl Acad Sci USA, 1999, vol. 96 (pg. 14043-8)
38. Wu M, Eisen JA. A simple, fast, and accurate method of phylogenomic inference. Genome Biol, 2008, vol. 9 pg. R151
39. Ciccarelli FD, Doerks T, von Mering C, et al. Toward automatic reconstruction of a highly resolved tree of life. Science, 2006, vol. 311 (pg. 1283-7)
40. Morelli G, Song Y, Mazzoni CJ, et al. Yersinia pestis genome sequencing identifies patterns of global phylogenetic diversity. Nat Genet, 2010, vol. 42 (pg. 1140-3)
41. Pirone L, Bragonzi A, Farcomeni A, et al. Burkholderia cenocepacia strains isolated from cystic fibrosis patients are apparently more invasive and more virulent than rhizosphere strains. Environ Microbiol, 2008, vol. 10 (pg. 2773-84)
42. Larsson P, Elfsmark D, Svensson K, et al. Molecular evolutionary consequences of niche restriction in Francisella tularensis, a facultative intracellular pathogen. PLoS Pathog, 2009, vol. 5 pg. e1000472
43. Achtman M, Morelli G, Zhu P, et al. Microevolution and history of the plague bacillus, Yersinia pestis. Proc Natl Acad Sci USA, 2004, vol. 101 (pg. 17837-42)
44. Keim P, Price LB, Klevytska AM, et al. Multiple-locus variable-number tandem repeat analysis reveals genetic relationships within Bacillus anthracis. J Bacteriol, 2000, vol. 182 (pg. 2928-36)
45. Parkhill J, Wren BW, Thomson NR, et al. Genome sequence of Yersinia pestis, the causative agent of plague. Nature, 2001, vol. 413 (pg. 523-7)
46. Read TD, Salzberg SL, Pop M, et al. Comparative genome sequencing for discovery of novel polymorphisms in Bacillus anthracis. Science, 2002, vol. 296 (pg. 2028-33)
47. Johnson TJ, Nolan LK. Pathogenomics of the virulence plasmids of Escherichia coli. Microbiol Mol Biol Rev, 2009, vol. 73 (pg. 750-74)
48. Welch TJ, Fricke WF, McDermott PF, et al. Multiple antimicrobial resistance in plague: an emerging public health risk. PLoS One, 2007, vol. 2 pg. e309
49. Hill KK, Xie G, Foley BT, et al. Recombination and insertion events involving the botulinum neurotoxin complex genes in Clostridium botulinum types A, B, E and F and Clostridium butyricum type E strains. BMC Biol, 2009, vol. 7 pg. 66
50. Letek M, Gonzalez P, Macarthur I, et al. The genome of a pathogenic rhodococcus: cooptive virulence underpinned by key gene acquisitions. PLoS Genet, 2010, vol. 6 pg. e1001145
51. Kudla G, Murray AW, Tollervey D, et al. Coding-sequence determinants of gene expression in Escherichia coli. Science, 2009, vol. 324 (pg. 255-8)
52. Plotkin JB, Kudla G. Synonymous but not the same: the causes and consequences of codon bias. Nat Rev Genet, 2011, vol. 12 (pg. 32-42)
53. Maione D, Margarit I, Rinaudo CD, et al. Identification of a universal Group B streptococcus vaccine by multiple genome screen. Science, 2005, vol. 309 (pg. 148-50)
54. Gaschen B, Taylor J, Yusim K, et al. Diversity considerations in HIV-1 vaccine selection. Science, 2002, vol. 296 (pg. 2354-60)
55. Pizza M, Scarlato V, Masignani V, et al. Identification of vaccine candidates against serogroup B meningococcus by whole-genome sequencing. Science, 2000, vol. 287 (pg. 1816-20)
56. Kaper JB, Nataro JP, Mobley HL. Pathogenic Escherichia coli. Nat Rev Microbiol, 2004, vol. 2 (pg. 123-40)
57. Sommer MO, Dantas G, Church GM. Functional characterization of the antibiotic resistance reservoir in the human microflora. Science, 2009, vol. 325 (pg. 1128-31)
58. Keim P, Pearson T, Okinaka R. Microbial forensics: DNA fingerprinting of Bacillus anthracis (anthrax). Anal Chem, 2008, vol. 80 (pg. 4791-9)
59. Keim PS, Wagner DM. Humans and evolutionary and ecological forces shaped the phylogeography of recently emerged diseases. Nat Rev Microbiol, 2009, vol. 7 (pg. 813-21)
60. Price EP, Hornstra HM, Limmathurotsakul D, et al. Within-host evolution of Burkholderia pseudomallei in four cases of acute melioidosis. PLoS Pathog, 2010, vol. 6 pg. e1000725
61. Langmead B, Trapnell C, Pop M, et al. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol, 2009, vol. 10 pg. R25
62. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 2009, vol. 25 (pg. 1754-60)
63. Bao S, Jiang R, Kwan W, et al. Evaluation of next-generation sequencing software in mapping and assembly. J Hum Genet, 2011, vol. 56 (pg. 406-14)
64. Delcher AL, Phillippy A, Carlton J, et al. Fast algorithms for large-scale genome alignment and comparison. Nucleic Acids Res, 2002, vol. 30 (pg. 2478-83)
65. Chain P, Kurtz S, Ohlebusch E, et al. An applications-focused review of comparative genomics tools: capabilities, limitations and future challenges. Brief Bioinform, 2003, vol. 4 (pg. 105-23)
66. Gysel C. Pediatrics and stomatology, from Hippocrates to our time. Chir Dent Fr, 1990, vol. 60 (pg. 42-56)
67. Li H. Improving SNP discovery by base alignment quality. Bioinformatics, 2011, vol. 27 (pg. 1157-8)
68. Hyatt D, Chen GL, Locascio PF, et al. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics, 2010, vol. 11 pg. 119
69. Delcher AL, Bratke KA, Powers EC, et al. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics, 2007, vol. 23 (pg. 673-9)
70. Besemer J, Lomsadze A, Borodovsky M. GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions. Nucleic Acids Res, 2001, vol. 29 (pg. 2607-18)
71. Noguchi H, Park J, Takagi T. MetaGene: prokaryotic gene finding from environmental genome shotgun sequences. Nucleic Acids Res, 2006, vol. 34 (pg. 5623-30)
72. Schattner P, Brooks AN, Lowe TM. The tRNAscan-SE, snoscan and snoGPS web servers for the detection of tRNAs and snoRNAs. Nucleic Acids Res, 2005, vol. 33 (pg. W686-9)
73. Chen K, Wallis JW, McLellan MD, et al. BreakDancer: an algorithm for high-resolution mapping of genomic structural variation. Nat Methods, 2009, vol. 6 (pg. 677-81)
74. Hormozdiari F, Hajirasouliha I, Dao P, et al. Next-generation VariationHunter: combinatorial algorithms for transposon insertion discovery. Bioinformatics, 2010, vol. 26 (pg. i350-7)
75. Zeitouni B, Boeva V, Janoueix-Lerosey I, et al. SVDetect: a tool to identify genomic structural variations from paired-end and mate-pair sequencing data. Bioinformatics, 2010, vol. 26 (pg. 1895-6)
76. Abyzov A, Gerstein M. AGE: defining breakpoints of genomic structural variants at single-nucleotide resolution, through optimal alignments with gap excision. Bioinformatics, 2011, vol. 27 (pg. 595-603)
77. Darling AC, Mau B, Blattner FR, et al. Mauve: multiple alignment of conserved genomic sequence with rearrangements. Genome Res, 2004, vol. 14 (pg. 1394-1403)
78. Aziz RK, Bartels D, Best AA, et al. The RAST Server: rapid annotations using subsystems technology. BMC Genomics, 2008, vol. 9 pg. 75
79. Markowitz VM, Mavromatis K, Ivanova NN, et al. IMG ER: a system for microbial genome annotation expert review and curation. Bioinformatics, 2009, vol. 25 (pg. 2271-8)
80. Orvis J, Crabtree J, Galens K, et al. Ergatis: a web interface and scalable software system for bioinformatics workflows. Bioinformatics, 2010, vol. 26 (pg. 1488-92)
81. Stewart AC, Osborne B, Read TD. DIYA: a bacterial annotation pipeline for any genomics lab. Bioinformatics, 2009, vol. 25 (pg. 962-3)
82. Otto TD, Dillon GP, Degrave WS, et al. RATT: Rapid Annotation Transfer Tool. Nucleic Acids Res, 2011, vol. 39 pg. e57
83. Robinson JT, Thorvaldsdottir H, Winckler W, et al. Integrative genomics viewer. Nat Biotechnol, 2011, vol. 29 (pg. 24-26)
84. Milne I, Bayer M, Cardle L, et al. Tablet–next generation sequence assembly visualization. Bioinformatics, 2010, vol. 26 (pg. 401-2)
85. Carver TJ, Rutherford KM, Berriman M, et al. ACT: the Artemis Comparison Tool. Bioinformatics, 2005, vol. 21 (pg. 3422-3)
86. Trapnell C, Williams BA, Pertea G, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol, 2010, vol. 28 (pg. 511-5)
87. Martin J, Bruno VM, Fang Z, et al. Rnnotator: an automated de novo transcriptome assembly pipeline from stranded RNA-Seq reads. BMC Genomics, 2010, vol. 11 pg. 663
88. Robertson G, Schein J, Chiu R, et al. De novo assembly and analysis of RNA-seq data. Nat Methods, 2010, vol. 7 (pg. 909-12)
89. Boisvert S, Laviolette F, Corbeil J. Ray: simultaneous assembly of reads from a mix of high-throughput sequencing technologies. J Comput Biol, 2010, vol. 17 (pg. 1519-33)
90. Zerbino DR, Birney E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res, 2008, vol. 18 (pg. 821-9)
91. Li R, Zhu H, Ruan J, et al. De novo assembly of human genomes with massively parallel short read sequencing. Genome Res, 2010, vol. 20 (pg. 265-72)
92. Chaisson MJ, Pevzner PA. Short read fragment assembly of bacterial genomes. Genome Res, 2008, vol. 18 (pg. 324-30)
93. Simpson JT, Wong K, Jackman SD, et al. ABySS: a parallel assembler for short read sequence data. Genome Res, 2009, vol. 19 (pg. 1117-23)
94. Krawitz P, Rodelsperger C, Jager M, et al. Microindel detection in short-read sequence data. Bioinformatics, 2010, vol. 26 (pg. 722-9)
95. Li R, Yu C, Li Y, et al. SOAP2: an improved ultrafast tool for short read alignment. Bioinformatics, 2009, vol. 25 (pg. 1966-7)
96. Hach F, Hormozdiari F, Alkan C, et al. mrsFAST: a cache-oblivious algorithm for short-read mapping. Nat Methods, 2010, vol. 7 (pg. 576-7)
97. Schatz MC. CloudBurst: highly sensitive read mapping with MapReduce. Bioinformatics, 2009, vol. 25 (pg. 1363-9)
98. Homer N, Merriman B, Nelson SF. BFAST: an alignment tool for large scale genome resequencing. PLoS One, 2009, vol. 4 pg. e7767
99. Khan Z, Bloom JS, Kruglyak L, et al. A practical algorithm for finding maximal exact matches in large sequence datasets using sparse suffix arrays. Bioinformatics, 2009, vol. 25 (pg. 1609-16)
100. Chain PS, Grafham DV, Fulton RS, et al. Genomics. Genome project standards in a new era of sequencing. Science, 2009, vol. 326 (pg. 236-7)
101. Yoon S, Xuan Z, Makarov V, et al. Sensitive and accurate detection of copy number variants using read depth of coverage. Genome Res, 2009, vol. 19 (pg. 1586-92)
102. Dutilh BE, Huynen MA, Strous M. Increasing the coverage of a metapopulation consensus genome by iterative read mapping and assembly. Bioinformatics, 2009, vol. 25 (pg. 2878-81)
103. Kozarewa I, Ning Z, Quail MA, et al. Amplification-free Illumina sequencing-library preparation facilitates improved mapping and assembly of (G+C)-biased genomes. Nat Methods, 2009, vol. 6 (pg. 291-5)
104. Wu D, Hugenholtz P, Mavromatis K, et al. A phylogeny-driven genomic encyclopaedia of Bacteria and Archaea. Nature, 2009, vol. 462 (pg. 1056-60)
105. Croft D, O'Kelly G, Wu G, et al. Reactome: a database of reactions, pathways and biological processes. Nucleic Acids Res, 2011, vol. 39 (pg. D691-7)
106. Ferrer M. The microbial reactome. Microb Biotechnol, 2009, vol. 2 (pg. 133-5)
107. Vastrik I, D'Eustachio P, Schmidt E, et al. Reactome: a knowledge base of biologic pathways and processes. Genome Biol, 2007, vol. 8 pg. R39
108. D'Eustachio P. Reactome knowledgebase of human biological pathways and processes. Methods Mol Biol, 2011, vol. 694 (pg. 49-61)
109. Feist AM, Henry CS, Reed JL, et al. A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Mol Syst Biol, 2007, vol. 3 pg. 121
110. Henry CS, DeJongh M, Best AA, et al. High-throughput generation, optimization and analysis of genome-scale metabolic models. Nat Biotechnol, 2010, vol. 28 (pg. 977-82)
111. Goecks J, Nekrutenko A, Taylor J. Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol, 2010, vol. 11 pg. R86
112. Bek-Thomsen M, Tettelin H, Hance I, et al. Population diversity and dynamics of Streptococcus mitis, Streptococcus oralis, and Streptococcus infantis in the upper respiratory tracts of adults, determined by a nonculture strategy. Infect Immun, 2008, vol. 76 (pg. 1889-96)
113. Kilian M, Poulsen K, Blomqvist T, et al. Evolution of Streptococcus pneumoniae and its close commensal relatives. PLoS One, 2008, vol. 3 pg. e2683
114. Han Y, Qiu J, Guo Z, et al. Comparative transcriptomics in Yersinia pestis: a global view of environmental modulation of gene expression. BMC Microbiol, 2007, vol. 7 pg. 96
115. Galindo CL, Sha J, Moen ST, et al. Comparative global gene expression profiles of wild-type Yersinia pestis CO92 and its Braun lipoprotein mutant at flea and human body temperatures. Comp Funct Genomics, 2010 pg. 342168
116. Galindo CL, Moen ST, Kozlova EV, et al. Comparative analyses of transcriptional profiles in mouse organs using a pneumonic plague model after infection with wild-type Yersinia pestis CO92 and its Braun lipoprotein mutant. Comp Funct Genomics, 2009, vol. 2009 pg. 914762
117. Chao Y, Vogel J. The role of Hfq in bacterial pathogens. Curr Opin Microbiol, 2010, vol. 13 (pg. 24-33)
118. Yoder-Himes DR, Chain PS, Zhu Y, et al. Mapping the Burkholderia cenocepacia niche response via high-throughput sequencing. Proc Natl Acad Sci USA, 2009, vol. 106 (pg. 3976-81)