Michael H. W. Weber, Ingo Fricke, Niclas Doll, Mohamed A. Marahiel, CSDBase: an interactive database for cold shock domain-containing proteins and the bacterial cold shock response, Nucleic Acids Research, Volume 30, Issue 1, 1 January 2002, Pages 375–378, https://doi.org/10.1093/nar/30.1.375
Abstract
CSDBase (http://www.chemie.uni-marburg.de/~csdbase/) is an interactive Internet-embedded research platform providing detailed information on proteins containing the cold shock domain (CSD). It consists of two separate database cores, one dedicated to CSD protein information and one providing a powerful resource for the relevant literature, with emphasis on the bacterial cold shock response. In addition to detailed protein information and useful cross links to other web sites, CSDBase contains computer-generated CSD structure models for most CSD-containing protein sequences available in the NCBI non-redundant protein database at the time CSDBase was established. These models were calculated on the basis of known crystal and/or NMR structures using SWISS-MODEL and can be downloaded as PDB structure coordinate files for viewing and for manipulation with other software tools. CSDBase will be regularly updated and is organized in a compact form, providing user-friendly interfaces to both database cores which allow for easy data retrieval.
Received September 20, 2001; Accepted October 10, 2001.
INTRODUCTION
Proteins interacting with the biological information molecules DNA and RNA play important cellular roles in all organisms. One widespread superfamily of proteins implicated in such functions contains the cold shock domain (CSD), which consists of ∼70 amino acids and harbors the nucleic acid binding motifs RNP1 and RNP2. This domain is highly conserved from bacteria to man (1) and the corresponding protein superfamily has been classified into five distinct subgroups, which comprise (i) the bacterial cold shock proteins (CSPs), (ii) eukaryotic Y-box proteins (YBPs), (iii) a fraction of the plant glycine-rich protein family (GRPs), (iv) LIN-28 proteins from the Caenorhabditis nematode family, and (v) the mammalian Unr protein (2). Out of this large variety of proteins, so far only structures of members of the bacterial CSP subfamily have been determined (3–8). With a length of ∼70 amino acids, CSPs represent the prototype of the CSD and share a highly similar overall fold consisting of five antiparallel β-strands forming a β-barrel structure with surface-exposed aromatic and basic residues (RNP1 and RNP2 motifs) that have been shown to be responsible for nucleic acid binding with variable affinities and sequence selectivities (9–12).
In addition to at least one CSD located near the N-terminus, the eukaryotic counterparts of this protein superfamily have incorporated a broad variety of other motifs that are thought to confer either a more specific template recognition or an ancillary enzymatic function (2). It has been demonstrated that these proteins are involved in a variety of important functions such as mRNA masking, coupling of transcription to translation, and developmental timing and regulation (13,14). However, although a large body of information has been accumulated, most features of the CSD protein family members are yet to be explored. Strikingly, recent data indicate that, in spite of a very similar overall fold, even the simplest members of the CSD superfamily, the bacterial CSPs, have surprisingly diverse functions. Apart from its properties as a transcriptional activator (15), CspA from Escherichia coli has also been shown to destabilize RNA secondary structures (16) and possesses transcriptional antiterminator functions like its homologs CspC and CspE (17). Moreover, CspE was found to be implicated in promoting or protecting chromosome folding and to act as a high-copy suppressor of mutations in the chromosomal partition gene mukB (18), while CspD, which appears to exist exclusively as a homodimer, is specifically expressed in the stationary phase and has been shown to inhibit replication (19,20). At least for Bacillus subtilis it is evident that the presence of one of the three members of this hierarchically organized protein class is essential even under optimal growth conditions (21). Detailed investigations have further revealed that removal of two of the three csp genes present in B.subtilis results in abnormal nucleoid structure, growth defects, deregulated protein synthesis and the inability to differentiate into endospores (21,22).
Although it has been shown that B.subtilis CSPs colocalize with ribosomes in vivo (23), and can be functionally replaced in part by translation initiation factor IF1 from E.coli (24), the nature of CSP function is far from being understood.
In the light of rapidly accumulating sequence data, collections of smaller data subsets stored in specialized databases allow much more convenient access to molecular biological information. Therefore, in order to facilitate research activities focused on the interesting and rapidly growing CSD protein superfamily briefly outlined above, we have created CSDBase, an interactive Internet-embedded research platform providing detailed information on CSD-containing proteins and the bacterial cold shock response (http://www.chemie.uni-marburg.de/~csdbase/).
DATABASE CONTENT, ACCESS AND USAGE
CSD protein sequences were acquired by performing a BLASTP search against the NCBI non-redundant database using cold shock protein CspB from B.subtilis as the query sequence (http://www.ncbi.nlm.nih.gov/blast/; 25,26). Except for the number of descriptions and the number of displayed alignments, which were both set to a non-limiting value, BLAST default settings were used. On the basis of experimentally determined CSD protein structures (5–8), each of the protein sequences retrieved, and thereby detected to have homology to the CSD-containing CspB, was subjected to comparative protein modeling using SWISS-MODEL first approach mode (http://www.expasy.ch/swissmod/SWISS-MODEL.html; 27,28). The constructed homology models were automatically analyzed by the WHAT IF verification program (29,30) and were stored in CSDBase as multi-layered atom coordinate files (PDB format). Each layer holds the atom coordinates of a separate model: the first layer represents the model structure generated for the protein of interest, while the other layers contain the coordinates of the experimental structure(s) used as template(s) for the calculations as taken from PDB (http://www.rcsb.org/pdb/). To demonstrate the quality of the generated protein models, Figure 1 shows stereo images of a typical raw model structure generated for CspB from Bacillus caldolyticus in comparison to its crystal structure, which was experimentally determined later (r.m.s.d. = 0.87 Å). Although the theoretical structure model shown in Figure 1 agrees well with the experimentally determined structure, the reader should be reminded that such homology models should be regarded as raw models which often require additional inspection and further refinement.
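As an illustration of how an r.m.s.d. value such as the one quoted for Figure 1 is obtained, the following minimal Python sketch reads Cα positions from PDB-format text and computes the root mean square deviation between two equally long, already superimposed Cα traces. It is not the procedure used by Swiss-PdbViewer: the function names `ca_coordinates` and `rmsd` are hypothetical, and the sketch assumes that the superposition has been carried out beforehand and that residues are paired one-to-one in file order.

```python
import math


def ca_coordinates(pdb_text):
    """Return (x, y, z) tuples for every C-alpha atom found in PDB-format text."""
    coords = []
    for line in pdb_text.splitlines():
        # Fixed-column ATOM record: atom name in columns 13-16,
        # x, y, z in columns 31-38, 39-46 and 47-54 (1-based).
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            coords.append(
                (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            )
    return coords


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two paired coordinate lists.

    Assumes both structures are already superimposed and that the
    i-th atom of coords_a corresponds to the i-th atom of coords_b.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

For two traces that differ by a uniform shift of 1 Å along one axis, `rmsd` returns 1.0; a real comparison would first require an optimal superposition, which this sketch deliberately leaves out.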
For those sequences which allowed for generation of a protein model, all relevant information was extracted via NCBI from the original database entries and was stored together with the downloadable protein model atomic coordinate file in CSDBase. In addition to the protein sequences, accession numbers, protein names, gene names and a brief description of the proteins, the modeled regions of the protein sequences were indicated, the number of amino acids for each protein was determined, and its pI and mass were calculated using the proteomic tools available at the ExPASy server (http://ca.expasy.org/tools/pi_tool.html; 31). One of the interesting results obtained during this data acquisition was that, in contrast to what is sometimes believed, bacterial CSPs are not uniformly acidic but rather stretch over a broad pI range and can have a basic pI as high as 10.28 (CspH from E.coli). Moreover, apart from an N-terminal CSD, the genus Mycobacterium harbors CSP sequences which contain an additional C-terminal domain of approximately 70 amino acids of unknown structure and function. Since mycobacteria have some features in common with eukaryotes and utilize these as hosts, it is possible that this CSD extension reflects further development of simple CSD proteins in the direction of more highly organized Y-box proteins.
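To make the pI and mass calculation more concrete, the sketch below estimates both quantities from a one-letter amino acid sequence. It is only an approximation of what the ExPASy tool does: the average residue masses are commonly tabulated values, but the pKa values are illustrative textbook numbers chosen for this example and are not the parameter set of reference 31, so the resulting pI will generally differ from the values stored in CSDBase. All names in the block are hypothetical.

```python
# Commonly tabulated average residue masses in daltons.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water per chain (free N- and C-terminus)

# Assumption: illustrative textbook pKa values, NOT the ExPASy parameter set.
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERM_PKA = 9.0
C_TERM_PKA = 3.6


def average_mass(sequence):
    """Average molecular mass of an unmodified polypeptide chain."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


def net_charge(sequence, ph):
    """Net charge at a given pH from the Henderson-Hasselbalch equation."""
    positive = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))
    negative = 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))
    for aa in sequence:
        if aa in POSITIVE_PKA:
            positive += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[aa]))
        elif aa in NEGATIVE_PKA:
            negative += 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[aa] - ph))
    return positive - negative


def isoelectric_point(sequence, tolerance=0.001):
    """Find the pH of zero net charge by bisection between pH 0 and 14."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        # Net charge falls monotonically with pH, so a positive charge
        # means the isoelectric point lies above the midpoint.
        if net_charge(sequence, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0
```

Bisection is used because the net charge decreases monotonically with pH, so a single zero crossing exists in the searched interval. A lysine-rich sequence yields a basic estimate and an aspartate-rich sequence an acidic one, which mirrors the broad pI range noted above.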
In addition to the CSD protein data and model structures described above, CSDBase provides a powerful, easy-to-use retrieval system for the relevant literature which was compiled using the PubMed interface at NCBI (http://www.ncbi.nlm.nih.gov/PubMed/; 25). CSDBase literature entries (as well as those for the protein data) can be searched by choosing from a range of pre-defined topics such as author, year of publication, journal name, etc., followed by entering a user-defined phrase (an asterisk will result in display of all entries currently available). In all cases, data output can be ordered according to different aspects. It should be noted that at this stage the content of the literature database core is restricted to bacterial CSPs and the bacterial cold shock response. We are currently working on adding the eukaryotic publications.
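The retrieval behaviour described above (choose a topic, enter a phrase, use an asterisk to list everything, then order the output) can be sketched in a few lines of Python. This is not CSDBase's actual implementation, which is not described in the text; the function `search_entries`, the field names, and the placeholder records are all hypothetical and serve only to illustrate the logic.

```python
# Placeholder records for illustration only; not real CSDBase entries.
LITERATURE = [
    {"author": "Author B", "year": 2000, "journal": "Journal Y"},
    {"author": "Author A", "year": 1998, "journal": "Journal X"},
    {"author": "Author C", "year": 1999, "journal": "Journal X"},
]


def search_entries(entries, field, phrase, order_by=None):
    """Return entries whose `field` contains `phrase` (case-insensitive).

    A phrase consisting of a single asterisk returns every entry.
    If `order_by` is given, results are sorted by that field, which
    is assumed to be present in every entry.
    """
    if phrase.strip() == "*":
        hits = list(entries)
    else:
        needle = phrase.lower()
        hits = [e for e in entries if needle in str(e.get(field, "")).lower()]
    if order_by is not None:
        hits.sort(key=lambda e: e[order_by])
    return hits
```

For example, `search_entries(LITERATURE, "journal", "journal x", order_by="year")` returns the two "Journal X" placeholder records in ascending order of year, while a phrase of `"*"` returns all three.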
CSDBase can be accessed via the Internet at http://www.chemie.uni-marburg.de/~csdbase/ using Microsoft Internet Explorer version 5 or higher and is organized in a compact self-explanatory form. Note that with most other browsers, database functions remain non-operational. Further details concerning usage of CSDBase can be viewed directly by selecting the ‘Info’ submenu displayed in the title bar of every CSDBase page.
FUTURE PERSPECTIVES
Since the CSD-containing cold shock proteins are thought to play a major role during the bacterial cold shock response (32,33), a third database core is currently under construction in an additional pilot project to incorporate detailed proteomic data, including clickable 2D gels of cold shock stressed bacteria. Moreover, two novel CSD protein structures await release in the near future (PDB accession nos 1G6P and 1H95), which will allow us to further refine the model structures presently available in CSDBase. In this manner, it is our intention to extend the capabilities of this initial version of CSDBase step by step to provide a powerful starting platform for researchers working on CSD-containing proteins and the bacterial cold shock response.
ACKNOWLEDGEMENTS
We would like to thank Rolf Henrich for a quick introduction into Internet-embedded database organization. This work was supported by Sonderforschungsbereich 395 and Fonds der Chemischen Industrie.
To whom correspondence should be addressed. Tel: +49 6421 282 5722; Fax: +49 6421 282 2191; Email: [email protected]
Figure 1. Comparison of the crystal structure of cold shock protein CspB from B.caldolyticus [(A) PDB accession no. 1C9O; A-chain (4)] and its computer-generated model structure counterpart as stored in CSDBase [(B) first layer of CSDBase file name CspB_baccl (this work)]. Ribbon model stereo pairs were generated with Swiss-PdbViewer version 3.7b2 (34) and images were rendered using POV-Ray version 3.1g. Superimposition (C) of the CspB crystal structure (yellow) on the CspB computer model (cyan) was performed using the ‘iterative magic fit’ function of Swiss-PdbViewer (r.m.s.d. = 0.87 Å for the Cα trace). Note that the presented computer model was generated with SWISS-MODEL (27,28) on the basis of known structures for CspB and CspA from B.subtilis and E.coli, respectively, at a time when the crystal structure of CspB from B.caldolyticus was not yet available.
References
1 Wolffe,A.P., Tafuri,S., Ranjan,M. and Familari,M. (
2 Graumann,P.L. and Marahiel,M.A. (
3 Kremer,W., Schuler,B., Harrieder,S., Geyer,M., Gronwald,W., Welker,C., Jaenicke,R. and Kalbitzer,H.R. (
4 Mueller,U., Perl,D., Schmid,F.X. and Heinemann,U. (
5 Newkirk,K., Feng,W., Jiang,W., Tejero,R., Emerson,S.D., Inouye,M. and Montelione,G.T. (
6 Schindelin,H., Marahiel,M.A. and Heinemann,U. (
7 Schindelin,H., Jiang,W., Inouye,M. and Heinemann,U. (
8 Schnuchel,A., Wiltscheck,R., Czisch,M., Herrler,M., Willimsky,G., Graumann,P., Marahiel,M.A. and Holak,T.A. (
9 Lopez,M.M. and Makhatadze,G.I. (
10 Lopez,M.M., Yutani,K. and Makhatadze,G.I. (
11 Phadtare,S. and Inouye,M. (
12 Schröder,K., Graumann,P., Schnuchel,A., Holak,T.A. and Marahiel,M.A. (
13 Sachetto-Martins,G., Franco,L.O. and de Oliveira,D.E. (
14 Sommerville,J. (
15 Brandi,A., Pon,C.L. and Gualerzi,C.O. (
16 Jiang,W., Hou,Y. and Inouye,M. (
17 Bae,W., Xia,B., Inouye,M. and Severinov,K. (
18 Hu,K.H., Liu,E., Dean,K., Gingras,M., DeGraff,W. and Trun,N.J. (
19 Yamanaka,K. and Inouye,M. (
20 Yamanaka,K., Zheng,W., Crooke,E., Wang,Y.-H. and Inouye,M. (
21 Graumann,P., Wendrich,T.M., Weber,M.H.W., Schröder,K. and Marahiel,M.A. (
22 Weber,M.H.W., Volkov,A.V., Fricke,I., Marahiel,M.A. and Graumann,P.L. (
23 Mascarenhas,J., Weber,M.H.W. and Graumann,P.L. (
24 Weber,M.H.W., Beckering,C.L. and Marahiel,M.A. (
25 Wheeler,D.L., Church,D.M., Lash,A.E., Leipe,D.D., Madden,T.L., Pontius,J.U., Schuler,G.D., Schriml,L.M., Tatusova,T.A., Wagner,L. and Rapp,B.A. (
26 Altschul,S.F., Madden,T.L., Schäffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (
28 Peitsch,M.C. (
29 Hooft,R.W.W., Vriend,G., Sander,C. and Abola,E.E. (
30 Vriend,G. (
31 Bjellqvist,B., Hughes,G.J., Pasquali,C., Paquet,N., Ravier,F., Sanchez,J.-C., Frutiger,S. and Hochstrasser,D.F. (
32 Graumann,P.L. and Marahiel,M.A. (
33 Yamanaka,K. (