Ralf J. M. Weber, Eva Li, Jonathan Bruty, Shan He, Mark R. Viant, MaConDa: a publicly accessible mass spectrometry contaminants database, Bioinformatics, Volume 28, Issue 21, November 2012, Pages 2856–2857, https://doi.org/10.1093/bioinformatics/bts527
Abstract
Mass spectrometry is widely used in bioanalysis, including the fields of metabolomics and proteomics, to simultaneously measure large numbers of molecules in complex biological samples. Contaminants routinely occur within these samples, for example, originating from the solvents or plasticware. Identification of these contaminants is crucial to enable their removal before data analysis, in particular to maintain the validity of conclusions drawn from uni- and multivariate statistical analyses. Although efforts have been made to report contaminants within mass spectra, this information is fragmented and not readily accessible. In response to the needs of the bioanalytical community, here we report the creation of an extensive, manually annotated database of currently known small molecule contaminants.
Availability: The Mass spectrometry Contaminants Database (MaConDa) is freely available and accessible through all major browsers or by using the MaConDa web service http://www.maconda.bham.ac.uk.
Contact: m.viant@bham.ac.uk
Supplementary information: Supplementary data are available at Bioinformatics online.
1 INTRODUCTION
Our understanding of biological systems has considerably improved through recent developments in mass spectrometry (MS)-based metabolomics (Dettmer et al., 2007; Patti et al., 2012). Continuous efforts have been made to improve the quality of metabolome measurements, including in sample preparation (Villas-Boas et al., 2005), data collection (Dunn et al., 2011; Weber et al., 2011) and data analysis (Dunn et al., 2012; Weber et al., 2011). Nonetheless, sample preparation methods and MS analyses have the potential to introduce contaminants, such as plasticizers, additives and solvents (Keller et al., 2008). Such contaminants of laboratory origin can obscure or even falsify biological interpretation of the data. For example, when using univariate or multivariate statistical analyses for biomarker discovery, the conclusions of that study can be fundamentally flawed if signals remain unidentified and are later discovered to be exogenous chemicals. Several analytical methods have been reported to minimize the interference caused by MS contaminants. Despite these improvements, contaminants are still a major problem in MS experiments. Improved methods to identify and then treat contaminants appropriately are required urgently. Although in most cases identified contaminants should be eliminated from datasets, occasionally they can be beneficial, for example, when used for internal mass calibration of spectra (Scheltema et al., 2008).
Here we present a manually annotated database of currently known MS contaminants to assist both the metabolomics and bioanalytical chemistry communities in their data processing.
2 METHODS AND IMPLEMENTATION
The information contained in the Mass spectrometry Contaminants Database (MaConDa) is based on published literature (e.g. scientific papers, notes and handbooks; see Supplementary Material) and data provided by several colleagues and instrument manufacturers, including the extensive resource by Keller et al. (2008). The raw data were manually searched and curated, then parsed into a relational MySQL 5.0 database using Python 2.7.2 scripts. Theoretical and/or experimental data were stored for each MS contaminant, such as name, type of contaminant (e.g. plasticizer, detergent or buffer; see Table 1), empirical formula and details regarding the MS platform used (e.g. ion trap, triple quadrupole or time-of-flight MS). Theoretical mass values calculated to six decimal places and observed mass values for related ion forms (e.g. [M + H]+, [M + Na]+ and [M − H]−) were also recorded, along with the original reference. Finally, cross references (e.g. PubChem Compound Identifier and Standard InChI code) were added for all MS contaminants when available. A user-friendly web interface to search the database has been implemented in PHP 5.3.3 and JavaScript and deployed on an Apache 2.2 web server. The NuSOAP PHP library (http://sourceforge.net/projects/nusoap/) has been used to create a Web service, simplifying the integration of MaConDa into existing MS data analysis workflows. The Web service has been tested using a Simple Object Access Protocol (SOAP) Python client (https://fedorahosted.org/suds/) and Taverna (Hull et al., 2006).
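The theoretical masses and ion forms described above can be illustrated with a minimal Python sketch. It is not the curation code used for MaConDa: the function names `monoisotopic_mass` and `ion_mz` are hypothetical, the isotope masses are standard reference values rather than values taken from the database, and only simple formulas without brackets are handled.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes. These are standard
# reference values (an assumption of this sketch), not MaConDa data.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "Na": 22.98976928,
}
ELECTRON = 0.00054858
PROTON = MONOISOTOPIC["H"] - ELECTRON  # hydrogen atom minus one electron


def monoisotopic_mass(formula):
    """Sum isotope masses for a simple formula such as 'C16H22O4'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def ion_mz(mass, ion_form):
    """Return m/z for a singly charged ion form of a neutral molecule."""
    if ion_form == "[M+H]+":
        return mass + PROTON
    if ion_form == "[M+Na]+":
        return mass + MONOISOTOPIC["Na"] - ELECTRON
    if ion_form == "[M-H]-":
        return mass - PROTON
    raise ValueError("unsupported ion form: " + ion_form)


tris = monoisotopic_mass("C4H11NO3")  # TRIS, formula from Table 1
for form in ("[M+H]+", "[M+Na]+", "[M-H]-"):
    print(form, round(ion_mz(tris, form), 6))
```

Values computed this way may differ from the tabulated exact masses in the last decimal place, depending on which set of isotope masses is used.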
Table 1. Examples of contaminants within MaConDa that are commonly observed in mass spectra
| ID | Exact mass | Name | Formula | Type |
|---|---|---|---|---|
| CON00019 | 278.15183 | Dibutyl phthalate | C16H22O4 | Plasticizer |
| CON00053 | 281.27185 | Oleamide | C18H35NO | Slip agent |
| CON00103 | 82.00308 | Sodium acetate | C2H3O2Na | Solvent |
| CON00121 | 121.07389 | TRIS | C4H11NO3 | Buffer |
| CON00298 | 189.04259 | 4-HCCA | C10H7NO3 | Matrix compound |
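As an illustration of how records of this kind might sit in a relational table, the following sketch loads the five example rows into an in-memory SQLite database. This is an assumption-laden stand-in: MaConDa itself uses MySQL, and the table name `contaminant`, its column names and the helper `by_type` are hypothetical rather than the actual MaConDa schema.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL back end.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE contaminant (
           id TEXT PRIMARY KEY,
           exact_mass REAL NOT NULL,
           name TEXT NOT NULL,
           formula TEXT NOT NULL,
           type TEXT NOT NULL
       )"""
)

# The five example records listed in the table above.
rows = [
    ("CON00019", 278.15183, "Dibutyl phthalate", "C16H22O4", "Plasticizer"),
    ("CON00053", 281.27185, "Oleamide", "C18H35NO", "Slip agent"),
    ("CON00103", 82.00308, "Sodium acetate", "C2H3O2Na", "Solvent"),
    ("CON00121", 121.07389, "TRIS", "C4H11NO3", "Buffer"),
    ("CON00298", 189.04259, "4-HCCA", "C10H7NO3", "Matrix compound"),
]
conn.executemany("INSERT INTO contaminant VALUES (?, ?, ?, ?, ?)", rows)


def by_type(connection, contaminant_type):
    """Return (id, name) pairs for one contaminant type, ordered by id."""
    cursor = connection.execute(
        "SELECT id, name FROM contaminant WHERE type = ? ORDER BY id",
        (contaminant_type,),
    )
    return cursor.fetchall()


print(by_type(conn, "Buffer"))
```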
3 RESULTS
MaConDa contains more than 200 contaminant records detected across several MS platforms. The majority of records include theoretical as well as experimental MS data. In a few cases, experimental data were included without rigorous identification (Sumner et al., 2007). The majority of experimental data reported in the literature have been collected in positive ion mode, which is reflected in the database. The amount of MS/MS data for contaminants is currently rather limited, but the database can store this type of data as more are recorded by the community. To the best of our knowledge, this is the first database of mass spectral contaminants that is publicly accessible, readily searchable, readily expandable and readily integrated into an automated computational pipeline.
A summary of the MaConDa features:
Database access via SOAP web service;
Database access via a user-friendly browser Web interface;
Batch processing of peak lists;
Searching of contaminants using additional ion forms;
Exporting results into different formats (e.g. tab-delimited and CSV);
Multiple database identifiers (e.g. PubChem Compound Identifier and Standard InChI code) for each contaminant to allow cross-referencing with other resources or databases;
The total database is freely available in several formats (e.g. tab-delimited, CSV, XML and SQL format).
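Batch processing of a peak list against contaminant ion forms, followed by CSV export, might look like the sketch below. It is not the MaConDa implementation: the matching rule (a symmetric ppm window), the default tolerance of 5 ppm, the function names `match_peaks` and `to_csv` and the example peak list are all assumptions made for illustration. The neutral masses are three of the example records from Table 1.

```python
import csv
import io

PROTON = 1.00727647        # proton mass (u)
SODIUM_ION = 22.98922070   # Na atom mass minus one electron (u)

# Neutral exact masses copied from Table 1.
CONTAMINANTS = [
    ("CON00019", "Dibutyl phthalate", 278.15183),
    ("CON00053", "Oleamide", 281.27185),
    ("CON00121", "TRIS", 121.07389),
]

# Mass added to the neutral molecule for each positive ion form.
ION_OFFSETS = {"[M+H]+": PROTON, "[M+Na]+": SODIUM_ION}


def match_peaks(peaks, tolerance_ppm=5.0):
    """Return one hit per peak/contaminant/ion form within the ppm window."""
    hits = []
    for mz in peaks:
        for con_id, name, mass in CONTAMINANTS:
            for ion, offset in ION_OFFSETS.items():
                expected = mass + offset
                error_ppm = (mz - expected) / expected * 1e6
                if abs(error_ppm) <= tolerance_ppm:
                    hits.append((mz, con_id, name, ion, round(error_ppm, 2)))
    return hits


def to_csv(hits):
    """Serialize hits as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["mz", "id", "name", "ion_form", "error_ppm"])
    writer.writerows(hits)
    return buffer.getvalue()


# Hypothetical peak list; the last peak matches no contaminant.
peak_list = [122.0812, 279.1591, 304.2611, 500.0000]
print(to_csv(match_peaks(peak_list)))
```

A relative (ppm) window is used here because mass error on most instruments scales with m/z; a fixed absolute window in Da would be an equally simple alternative.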
4 CONCLUSIONS
MaConDa is an extensive, manually annotated database that provides a useful and unique resource for the MS community. Analytical techniques used in metabolomics and proteomics are continually enhanced to improve their sensitivity. As a result, new contaminants are introduced into the experimental pipeline. Continued input of these new contaminants from the MS community and our own laboratory will enhance MaConDa as a valuable resource.
Acknowledgements
We gratefully thank our colleagues (David Watson, University of Strathclyde; John Draper, Aberystwyth University; John Langley, University of Southampton; John Newman, University of California Davis; Warwick Dunn, University of Manchester; William Griffiths, Swansea University) and instrument manufacturers (Thermo Fisher Scientific and Bruker Daltonics) who provided us with MS contaminant data. We thank Cheng Cao for his contribution to the website.
Funding: We thank both the British Heart Foundation (PG/10/036/28341) and UK Engineering and Physical Sciences Research Council (EP/J501414/1) for support, as well as the University of Birmingham’s Systems Science for Health initiative.
Conflict of Interest: none declared.
References
Author notes
Associate Editor: Jonathan Wren