Steffen Priebe, Christian Kreisel, Fabian Horn, Reinhard Guthke, Jörg Linde, FungiFun2: a comprehensive online resource for systematic analysis of gene lists from fungal species, Bioinformatics, Volume 31, Issue 3, February 2015, Pages 445–446, https://doi.org/10.1093/bioinformatics/btu627
Abstract
Summary: Systematically extracting biological meaning from omics data is a major challenge in systems biology. Enrichment analysis is often used to identify characteristic patterns in candidate lists. FungiFun is a user-friendly Web tool for functional enrichment analysis of fungal genes and proteins. The novel tool FungiFun2 uses a completely revised data management system and thus allows enrichment analysis for 298 currently available fungal strains published in standard databases. FungiFun2 offers a modern Web interface and creates interactive tables, charts and figures, which users can directly manipulate to their needs.
Availability and implementation: FungiFun2, examples and tutorials are publicly available at https://elbe.hki-jena.de/fungifun/.
Contact: [email protected] or [email protected]
1 INTRODUCTION
Fungi form an extremely diverse kingdom of organisms with different lifestyles and interesting human applications (Blackwell, 2011). Fungi are not only important for food production but also produce bioactive compounds known as secondary metabolites (Brakhage, 2013), which are important for the pharmaceutical and chemical industries. On the other hand, there are many pathogenic fungi that destroy crops and infect humans. The growing amount of omics data from the fungal community will help to identify virulence factors as well as interesting bioactive compounds. Enrichment analysis is often applied along with omics data analysis. Here, candidate genes/proteins are assigned to categories from structured vocabularies (ontologies). Afterward, statistical tools help to identify those categories that are significantly enriched with the given candidates. These enriched categories may represent molecular functions, pathways or cellular locations most affected by the experiment. A number of easy-to-use online tools exist, e.g. YeastMine (Balakrishnan et al., 2012), and are reviewed in Huang et al. (2009). However, no user-friendly online tool for the systematic analysis of long candidate lists existed for most fungal species.
Our group implemented the tool FungiFun (Priebe et al., 2010) supporting enrichment analysis for 28 species with a focus on fungal pathogens. In this article, we present the novel tool FungiFun2, which allows the systematic analysis of candidate lists from all the currently available fungal strains published in standard databases (Fig. 1). Users can choose from 298 strains of 240 species. For data collection, FungiFun2 uses a semi-automatic procedure, which downloads gene to category associations and annotations (names and functions) from online databases. This procedure allows the database to be kept up-to-date and simplifies the addition of further species. In comparison to the previous version, which worked with flat files for annotations, FungiFun2 parses annotations into a standardized database allowing higher data connectivity and flexibility, e.g. alternative input identifiers (IDs), gene annotation and complex search queries. Finally, FungiFun2 offers a modern and user-friendly interface.

Fig. 1. Overview of FungiFun2 functionality. (A) With the help of a semi-automatic procedure, gene to category associations for three ontologies as well as gene names and functions are downloaded. The numbers in the Venn diagram indicate the number of available strains. (B) The user selects a strain and ontology. (C) On the Web server, gene to category association and significance tests are performed. (D) Schematic visualization of the output (dynamic figures, charts and tables) is shown.
2 METHODS AND IMPLEMENTATION
Figure 1A illustrates the three functional ontologies that are integrated into FungiFun2, i.e. Gene Ontology (GO; Ashburner et al., 2000), Kyoto Encyclopedia of Genes and Genomes (KEGG; Kanehisa and Goto, 2000) and Functional Catalogue (FunCat; Rüpp et al., 2004). FunCat gene to category associations were downloaded from MIPS through the PEDANT database (Walter et al., 2009). For GO, several data sources were used: Candida Genome Database (CGD; Inglis et al., 2012), Aspergillus Genome Database (AspGD; Cerqueira et al., 2014), Saccharomyces Genome Database (SGD; Cherry et al., 2012), the UniProt-GOA project at the European Bioinformatics Institute (EBI) and Ensembl Fungi (Kersey et al., 2010). Additionally, we included GO gene to category associations by applying Blast2GO (Conesa et al., 2005). To do so, proteomes were obtained from BROAD, NCBI or in-house data (Schwartze et al., 2014; Linde et al., 2014). Finally, KEGG gene to pathway associations were obtained from the KEGG FTP server.
With the help of a semi-automatic procedure, all available strains in the used databases are listed. Each strain may have different data sources, where the preferred version needs to be manually selected. Afterward, flat files are automatically downloaded and parsed into a MySQL database using Python and R scripts. These scripts allow the database to be kept up-to-date with little effort. Currently, ontologies formed by FunCat, KEGG and GO were downloaded from nine different sources. GO gene/protein to category associations, obtained primarily from EBI, are available for 258 strains. FunCat gene/protein to category associations are available for 180 strains. Finally, KEGG pathway associations are available for 71 strains.
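The parsing step described above can be illustrated with a minimal Python sketch. It is not the FungiFun2 code. It assumes a hypothetical two-column, tab-separated flat file (gene ID, category ID) and uses an in-memory SQLite database from the standard library as a stand-in for MySQL. The table name `association` and the functions `load_associations` and `genes_per_category` are invented for illustration.

```python
import csv
import io
import sqlite3


def load_associations(conn, strain, ontology, handle):
    """Parse 'gene_id<TAB>category_id' lines (assumed format) into the database.

    Returns the number of rows parsed from the file.
    """
    conn.execute(
        """CREATE TABLE IF NOT EXISTS association (
               strain TEXT, ontology TEXT, gene_id TEXT, category_id TEXT,
               UNIQUE (strain, ontology, gene_id, category_id))"""
    )
    rows = []
    for record in csv.reader(handle, delimiter="\t"):
        # Skip blank lines, comment lines and rows without two columns.
        if len(record) < 2 or record[0].startswith("#"):
            continue
        rows.append((strain, ontology, record[0].strip(), record[1].strip()))
    # INSERT OR IGNORE makes reloading the same file idempotent.
    conn.executemany(
        "INSERT OR IGNORE INTO association VALUES (?, ?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)


def genes_per_category(conn, strain, ontology):
    """Count annotated genes per category for one strain and ontology."""
    cur = conn.execute(
        "SELECT category_id, COUNT(*) FROM association "
        "WHERE strain = ? AND ontology = ? GROUP BY category_id",
        (strain, ontology),
    )
    return dict(cur.fetchall())


# Usage with made-up identifiers:
conn = sqlite3.connect(":memory:")
flat_file = io.StringIO("# gene\tcategory\ngeneA\tcat1\ngeneB\tcat1\ngeneA\tcat2\n")
load_associations(conn, "strain1", "GO", flat_file)
```

Storing one row per gene/category pair, keyed by strain and ontology, is one simple way to support the kind of cross-ontology queries that flat files make awkward.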
Figure 1B illustrates the main features of the user interface. To run FungiFun2, users need to choose a strain, select an ontology, supply the tool with a list of candidate IDs and choose a P-value cutoff. After strain selection, the user may check for available (alternative) IDs. Only those ontologies can be used for which annotation is currently available. Advanced options allow for alternative P-value calculations and multiple test corrections, for upload of a background list, for in/exclusion of categories and for the selection of GO evidence codes.
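How a candidate list and an optional background list translate into the counts needed for an enrichment test can be sketched as follows. This is an illustrative sketch, not the tool's implementation. The function `category_counts` and its input layout (a dict from gene ID to a set of category IDs) are hypothetical. It also assumes that only annotated genes belong to the background, which may differ from how FungiFun2 treats unannotated genes.

```python
from collections import defaultdict


def category_counts(candidates, associations, background=None):
    """Return (n, N, {category: (k, K)}) for an enrichment test.

    associations: dict gene_id -> set of category ids (hypothetical layout).
    background:   optional iterable of gene ids; defaults to all annotated genes.
    n = candidates in the background, N = background size,
    k = candidates in the category, K = background genes in the category.
    """
    annotated = set(associations)
    # Assumption: genes without any annotation are dropped from the background.
    universe = annotated if background is None else set(background) & annotated
    selected = set(candidates) & universe
    in_universe = defaultdict(int)
    in_selected = defaultdict(int)
    for gene in universe:
        for category in associations[gene]:
            in_universe[category] += 1
            if gene in selected:
                in_selected[category] += 1
    counts = {cat: (in_selected[cat], big_k) for cat, big_k in in_universe.items()}
    return len(selected), len(universe), counts


# Usage with made-up identifiers:
assoc = {"g1": {"c1"}, "g2": {"c1", "c2"}, "g3": {"c2"}, "g4": {"c3"}}
n, N, counts = category_counts(["g1", "g2", "unknown"], assoc)
```

Candidate IDs that are missing from the background are silently ignored here. A real tool would more likely report them to the user.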
Figure 1C illustrates the main aspects of the calculation of enriched categories as well as results, graphs and tables. On the server side, a PHP script parses user input, controls calculations of statistics, graphs and tables and finally creates data for the result page. P-values indicating the significance of the enrichment are calculated with Fisher’s exact test or the hypergeometric test. Multiple test correction may be performed, e.g. via FDR (Benjamini and Hochberg, 1995). The R package RamiGO (Schröder et al., 2013) is used to visualize significantly enriched GO categories within the GO hierarchy. Bar, pie and column charts are created with the help of the JavaScript library Highcharts, whereas customizable result tables are created with the JavaScript library DataTables.
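The statistical step can be illustrated with a short, self-contained Python sketch. It is not the server-side code. It computes the upper tail of the hypergeometric distribution, which equals the one-sided Fisher's exact test for over-representation, and applies the Benjamini-Hochberg adjustment. The function names and the symbols k, n, K and N are chosen here for illustration.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K are in the category.

    k: candidate genes in the category, n: candidate list size,
    K: background genes in the category, N: background size.
    """
    total = comb(N, n)
    # math.comb returns 0 when the lower argument exceeds the upper one,
    # so impossible combinations contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Usage: 2 of 3 candidates fall into a category holding 4 of 10 background genes.
p = hypergeom_enrichment_p(k=2, n=3, K=4, N=10)  # (6*6 + 4*1) / 120 = 1/3
```

Exact integer arithmetic via `math.comb` avoids floating-point underflow for small tails, at the cost of speed for very large backgrounds.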
Figure 1 D illustrates parts of the results of a FungiFun2 run. Each output can be customized directly in the Web interface as well as downloaded in commonly used formats. The number of enriched categories as well as the number of genes within enriched and non-enriched categories give an overview of the results. Specific pie and bar charts allow users to visualize the number of genes in the significant categories compared with the number of genes in the input list. Finally, graphs highlighting enriched categories within the hierarchies of the ontologies are available. Results are displayed in tables focusing on categories or genes, which can be interactively rearranged and filtered.
Funding: J.L. and S.P. were supported by the Deutsche Forschungsgemeinschaft (DFG) CRC/Transregio 124 ‘Pathogenic fungi and their human host: Networks of interaction’, subproject INF.
Conflict of interest: none declared.
REFERENCES
Author notes
Associate Editor: Janet Kelso