BacMet: antibacterial biocide and metal resistance genes database

Antibiotic resistance has become a major human health concern due to widespread use, misuse and overuse of antibiotics. In addition to antibiotics, antibacterial biocides and metals can contribute to the development and maintenance of antibiotic resistance in bacterial communities through co-selection. Information on metal and biocide resistance genes, including their sequences and molecular functions, is, however, scattered. Here, we introduce BacMet (http://bacmet.biomedicine.gu.se)—a manually curated database of antibacterial biocide- and metal-resistance genes based on an in-depth review of the scientific literature. The BacMet database contains 470 experimentally verified resistance genes. In addition, the database also contains 25 477 potential resistance genes collected from public sequence repositories. All resistance genes in the BacMet database have been organized according to their molecular function and induced resistance phenotype.


INTRODUCTION
Over the past decades, widespread use, misuse and overuse of antibiotics have accelerated the development of antibiotic resistance (1). The increasing ability of pathogenic bacteria to survive antibiotic therapy has thus been recognized as one of the largest threats to public health (2). A particular concern is that bacteria are becoming resistant to more than one antibiotic compound-so-called multidrug resistance. This is either due to co-selection of multiple resistance genes within a single bacterial cell (e.g. multiple plasmids or arrays of genes on a single plasmid) and/or due to the presence of resistance genes (e.g. efflux pumps) with a broad substrate range (3).
Biocides and metals have been used as antibacterial agents for centuries. For instance, copper (Cu) and silver (Ag) vessels have been used for maintaining quality in stored potable water (4). Various metal salts such as copper, magnesium (Mg), mercury (Hg), tellurium (Te), arsenic (As) and gold (Au) have been used for treating infectious diseases such as leprosy, tuberculosis, gonorrhoea and syphilis (5)(6)(7)(8)(9). More recently metals such as copper and silver have become widely used in household products, and disinfectants are regularly used in, for example, hospitals, industries and elsewhere to sanitize equipment (10)(11). Copper, zinc and, to a lesser extent, cadmium (Cd) and arsenic are also used as animal growth promoters in certain regions, possibly mediated by their effects on the animal gut flora (12)(13)(14)(15). In aquaculture and agriculture, practices of metal-and biocide-containing products in feed additives, organic and inorganic fertilizers, pesticides and antifouling products are also well known worldwide, resulting in enrichment of aquaculture sediments and river water with metals such as zinc, copper, cadmium and lead (Pb) (16)(17)(18). All such uses potentially create venues for biocide and metal resistance development, but also for increased antibiotic resistance as a side effect of bactericide usage. The EU Scientific Committee for Emerging and Newly Identified Health Risks recently concluded that exposure of bacteria to biocides and metals may boost the spread of antibiotic resistance (19)(20). This is a parallel situation to the development of multidrug resistance, resulting from the presence of resistance genes for both antibiotics and biocides/metals located in close genetic vicinity of each other on mobile genetic elements, or by the presence of common resistance mechanisms with broader specificity (21)(22)(23)(24)(25)(26).
There is, however, still a substantial knowledge gap and a lack of easily accessible organized data resources on biocide and metal resistance for exploring these sources of antibiotic resistance development and understanding the co-and cross-resistance mechanisms between antibiotics, biocides and metals. Currently, databases of antibiotic resistance genes exist, such as the Antibiotic Resistance Genes Database (27) and the Comprehensive Antibiotic Resistance Database (28). The Arthropod Pesticide Resistance Database (http://www. pesticideresistance.com/) and the Insecticide Resistance database (29) are examples of biocide resistance databases mainly focusing on genes in animals and plants. However, there are no specialized databases available for bacterial resistance genes to biocides or metals.
To facilitate research on co-selection, there is a need for a database on bacterial biocide-and metal-resistance genes with highly reliable content. With the growing use of biocides and metals as food preservatives, disinfectants and within antibacterial coatings, it is becoming increasingly important to understand mechanisms behind bacterial tolerance to such substances, regardless of any potential co-selection for antibiotic resistance. We have therefore created a manually curated collection of antibacterial biocide and metal resistance genes with experimentally confirmed function described in the scientific literature. Additionally, we have predicted antibacterial biocide and metal resistance genes based on sequence data available in different public repositories, such as NCBI GenBank (30), NCBI nonredundant protein database (31) and UniprotKB (32). The creation of BacMet-the Antibacterial Biocide and Metal Resistance Genes Database-is anticipated to facilitate research on single, co-and cross-resistance by enabling rapid screening of genomes and metagenomes. BacMet is freely available at http://bacmet.biomedicine.gu.se.

Data collection
Only bacterial genes experimentally confirmed to be involved in resistance/tolerance were collected through an extensive literature search in PubMed (31) using a variety of terms relating to biocide and metal resistance. This was followed by searching the NCBI nonredundant protein database, UniprotKB and the Gene Ontology (33) database to identify the corresponding annotated biocideand metal-resistance genes within these databanks, including references. Special attention was given to the listed references in each collected paper and the review papers in the biocide and metal resistance area to find primary articles describing yet other resistance genes. Each article was manually reviewed to identify the presence of targeted genes, any related genes, crossresistance and experimental procedures and compounds used to verify the resistance phenotype. In a few cases, we found experimental evidence of resistance genes from the literature but could not include these genes in the database owing to lack of data submission to public databases.
A gene was considered to have experimentally confirmed resistance function only if (i) the removal/mutation or insertion/overexpression of a single gene of interest into the genome of a bacterial host, or into an inserted plasmid, resulted in an increased or decreased susceptibility to the biocide/metal, respectively; or (ii) insertion of a plasmid lacking the gene of interest showed increased susceptibility compared with insertion of the same plasmid carrying the gene. We have also included genes in operons, where the operon is experimentally confirmed to be involved in resistance, but where evidence for a resistance function of each individual component gene is lacking. The approach of identifying functional resistance genes (such as efflux pumps, chaperones and enzymes), by insertion, deletion or mutation of a particular target gene, has the potential to also identify regulators of such resistance genes. In many cases, such regulators are part of the same operon or resistance determinant gene cluster as the functional resistance gene, and are essential for resistance. In BacMet, to provide an overall picture of these resistance systems, we have included both types of genes with a note on whether they are regulators or functional resistance genes. Generally, the resistance phenotype conferred by regulators is highly dependent on expression level and context, thus a resistance phenotype should be not inferred only by their presence in a genome. Subsequently, metadata, such as source organism, location (plasmid/chromosomal) and gene description, were collected from NCBI GenBank and nonredundant protein database, UniprotKB and the Transporter Classification Database (34) and included into the database covering genes with experimentally confirmed function.
The scope of the database was to include resistance genes to metals as well as to antibacterial biocides. We have included genes providing resistance/tolerance to both essential metals (such as Fe, Mn, Cu, Co and Mg) and those that are only toxic (such as As, Cd and Hg), as all metals become toxic at elevated concentrations and thus can exert a selection pressure. Biocides were identified from a list of active substances for use in biocidal products on the EU market, i.e. European Commission Regulation (EC) No 1451/2007 (http://eur-lex.europa.eu/LexUriServ/ site/en/oj/2007/l_325/l_32520071211en00030065.pdf). As this list is not complete, any compound for which we could find evidence for an intentional use to kill or prevent growth of bacteria (biocidal use) were classified as 'biocides'. Classical antibiotic compounds were excluded, as resistance genes to antibiotics are covered in other databases such as the Antibiotic Resistance Genes Database and the Comprehensive Antibiotic Resistance Database; however, genes providing resistance to both biocides and antibiotics were included. The search for antibacterial biocide resistance genes also resulted in the identification of resistance genes to other chemical compounds that are toxic to bacteria, but are not generally used as biocides, such as dyes, organic solvents, etc. In principle, such compounds do also have the potential to co-select for antibiotic resistance, and we have therefore included genes conferring resistance to those compounds in the database, but under the heading 'other compounds'.

Curation of derived data
Initially, a core data set of 470 experimentally confirmed genes was created (referred to as 'BacMet Experimentally Confirmed database') based on 421 articles from PubMed.
The resistance genes in the experimentally confirmed database are biased toward model bacteria, since those are highly overrepresented in microbiology laboratory experiments (e.g. Escherichia coli and Pseudomonas aeruginosa). Thus, a large portion of resistance gene variants is almost certainly missing, as the resistance genes may differ between species and/or occur in different forms that are not (yet) experimentally investigated. Therefore, the experimentally confirmed data set of 470 resistance genes was used as a reference data set to obtain bacterial gene/protein sequences predicted to be conserved and have similar functions based on high sequence similarity. The predicted genes were selected from the NCBI nonredundant protein database by similarity searches using BLAST (version 2.2.25) (35), forming the larger 'BacMet Predicted database'. Here, a multistage filtering was performed to collect homologous sequences using the reference data set. Initially, a fixed E-value cutoff of 0.01 (to obtain 'significant' hits) was set, followed by a fixed coverage cutoff of 80% (to avoid inclusion of homologous sequences with good E-values solely due to virtually identical, but short, alignments between queries and target sequences) for all sequence matches. Thereafter, a 90% fixed identity cutoff was used for protein sequences originating from plasmids, while manual cutoffs (Supplementary  Table S1) were determined for each individual chromosomal-borne sequence (as the chromosomal gene sequences vary substantially between species and occur in different forms). During the manual identity cutoff step, for every experimentally confirmed gene we accepted similar sequences down to an identity cutoff value where the gene/protein annotation was still relevant to the source experimental gene, including conserved hypothetical proteins. It should be noted that the derivation of appropriate identity cutoffs is a partially subjective decision, based on previous knowledge of sequence similarity within groups of proteins sharing the same function. Redundancy was removed by clustering all extracted homologous protein sequences with 100% identity cutoff using USEARCH (version 5.2.32) (36) to form the large database (referred to as 'BacMet Predicted database').

Database contents
From 421 scientific articles from PubMed, we identified 470 bacterial genes experimentally confirmed to be involved in resistance/tolerance to biocides, metals and other similar chemical compounds. From these genes, 133 211 predicted sequences were extracted, generating a set of 25 477 nonredundant protein sequences included in the 'BacMet Predicted database' ( Table 1). The 470 experimentally confirmed resistance genes listed in BacMet are shown to confer resistance to 84 different compounds. These include 20 metals, 41 biocides, i.e. chemicals used with the purpose of killing bacteria or preventing bacterial growth, and 23 'other chemical compounds', verified as toxic to bacteria. The compounds have been listed by their names and classes in the BacMet database based on reviewed articles from PubMed, in which researchers used them to verify phenotypic changes governed by the presence of specific genes in laboratory experiments.
Out of these 470 genes, 29 genes had a verified resistance phenotype for both biocides and metals. For metals, the highest number of resistance genes was found for copper (60 genes), followed by zinc (58 genes) and nickel (51 genes), whereas the biocide associated with most genes was acriflavine (46 genes) followed by benzylkonium chloride (BAC) (41 genes) and sodium dodecyl sulfate (39 genes). Summaries of the resistance gene findings are presented in Figures 1 and 2.

Database schema
The BacMet workflow and architecture is described in Figure 3. The BacMet databases are built on Apache server 2.2.3 (http://www.apache.org). The database tables are stored in MySQL server 5.0.77 relational databases (http://www.mysql.com) and operated on RedHat Linux systems. The web interface was created in HTML using PHP, Java and Cascading style sheets. The Perl programming language (http://www.perl.org) along with the CGI, DBI and DBD modules were used for interconnection between the two databases, to retrieve the BacMet data from the MySQL databases, and to present result outputs in a web interface.

Browsing and searching database
A tutorial has been included on the BacMet webpage, which describes database access, data extraction, browsing and searching the database, as well as how to submit data. The BacMet databases can be explored in the web interface through the 'browse' option, where one can browse resistance genes and chemical compounds. Resistance genes can be browsed by individual compounds, by the chemical classes they provide resistance to or by their name. Users can also browse all the chemical compounds in BacMet to obtain more information such as general uses and chemical classes along with an external link to a chemical database of biological interest such as ChEBI (37), ChEMBL (38) or PubChem (39). Alternatively, one may use the 'search' function to perform a quick search for any term in all fields in both databases separately, including, for example, gene names, names of biocides, chemical classes and names of metals. It is also possible to search specifically for e.g. plasmid or chromosomal genes or genes resistance to certain chemical classes using the 'advanced search' function.
The output for browsing and search results from the two databases is slightly different. For the predicted database output, most of the information is externally linked to the NCBI protein database, to allow the user to acquire more information of the genes depending on his or her interests. As the two databases are interconnected, users have the opportunity to obtain information on the source of every gene in the BacMet predicted database from the experimentally confirmed database. Alternatively, users also can obtain all information on predicted genes with similar names by a single click on experimental confirmed genes.

BLAST search engine
A BLAST search web interface has been implemented to query the BacMet databases. A modified stand-alone version of the BLAST program (NCBI wwwblast version 2.2.2) (31) has been implemented on the BacMet web server for similarity searches against the BacMet sequence databases. The BLAST query can be performed against both BacMet databases using nucleotide and protein sequences as input. BLAST hit records can be used to obtain more information about the matched sequences in the BacMet database.

Data download
All the data included in the BacMet databases are available to download for offline analysis. All the data files, including protein sequences in FASTA format, can be downloaded in compressed form (.zip format), or the user can choose to download individual files as required. Along with the data from the BacMet database, users can also download in-house developed software to analyze genomes and metagenomes for screening biocide and metal resistance genes locally.

Data submission
A web-interface has been included for users to submit their data on resistance genes that are not listed in the BacMet database, along with the information of experimental support in the scientific literature. The submitted data will be added to the BacMet database after manual curation.

DISCUSSION AND CONCLUSION
We here present a collection of antibacterial biocide and metal resistance genes that have experimental evidence, along with a larger set of genes with computationally predicted resistance function. To our knowledge, this is the first comprehensive resource of antibacterial biocide and metal resistance genes. Along with this, we provide in-house-developed software to analyse genomes and metagenomes locally, to enable screening for biocideand metal-resistance genes in large-scale sequence data sets.
Although we have performed an extensive search to extract as many experimentally confirmed resistance genes as possible, we believe that we may still have missed some genes. Therefore, we encourage researchers to submit their data on experimentally confirmed resistance genes function, along with all related information. We also believe, due to the cross-resistance characteristic, some resistance genes in BacMet have the potential to confer resistance to various other biocide and metal compounds that have not been experimentally verified yet. Therefore, the list of compounds in BacMet, and every biocides and other compounds in the experimentally confirmed database. Some of the included genes are represented in more than one category. The figure reflects the most well-studied compounds, although the actual substrate range is likely to be much broader for many genes.  individual gene resistant toward these compounds, will be expanded when new experimental data are available. Importantly, phenotypic resistance to antibacterial biocides is described for a much larger range of compounds than those included in the database (40)(41)(42). This suggests that there are many more resistance genes to be discovered. We plan to update the BacMet database continuously with newly available experimental work and novel resistance gene information from the scientific literature and data submitted by users.
We believe that BacMet will facilitate research to understand co-and cross-resistance of biocide and metals to antibiotics within bacterial genomes, as well as in complex microbial communities (metagenomes) from different environments. The database will also facilitate gene annotation of new genomes and metagenomes. All in all, this may help us to better understand the bacterial resistome. Because biocides and metals are nowadays widely used as food preservatives, disinfectants and on antibacterial coatings for medical devices and health care facilities, as well as in residential and industrial settings, tolerance to these compounds by bacteria is a growing and upcoming concern as well. Hence, this database will not only allow microbiologists and toxicologists to address antibiotic resistance development in a more comprehensive way in the future, but it also may constitute an important resource for manufacturers of metal surfaces and coatings, biocide manufacturing companies and producers of food preservatives to understand the development of tolerance mechanisms to their products and possibly manage the use of metals and biocides in a more sustainable way.