The Protein Data Bank in Europe (PDBe; pdbe.org) is a partner in the Worldwide PDB organization (wwPDB; wwpdb.org) and, as such, is actively involved in managing the single global archive of biomacromolecular structure data, the PDB. In addition, PDBe develops tools, services and resources to make structure-related data more accessible to the biomedical community. Here we describe recently developed, extended or improved services, including an animated structure-presentation widget (PDBportfolio), a widget to graphically display the coverage of any UniProt sequence in the PDB (UniPDB), chemistry- and taxonomy-based PDB-archive browsers (PDBeXplore), and a tool for interactive visualization of NMR structures, corresponding experimental data as well as validation and analysis results (Vivaldi).
Since the early 1970s, the Protein Data Bank (PDB) has been the single global archive in which 3D structure information about biomacromolecules (including complexes) is stored (1,2). Since 2003, the PDB archive has been managed by an international organization called the Worldwide PDB (wwPDB; wwpdb.org) (3,4). It consists of the Research Collaboratory for Structural Bioinformatics (RCSB) (5) and the BioMagResBank (BMRB) (6) in the USA, the Protein Data Bank Japan (PDBj) (7) and the Protein Data Bank in Europe (PDBe; pdbe.org) (8,9). The four wwPDB partners accept and process depositions of new structures and supporting experimental data and jointly curate, remediate and distribute the PDB archive. They also work together (often in consultation with the community) to define deposition and annotation policies and procedures, file formats, descriptions of chemical compounds and polymer components, and validation standards for structural data. In addition, each of the partners offers independent services to users of structural information. PDBe aims to develop tools, services and resources that help make the wealth of data about biomacromolecular structure and function more easily accessible to the wider biomedical community (10). Many of these tools have been described recently (8,9). In this article, we briefly describe several recently developed or enhanced services provided by PDBe.
PDBportfolio: HIGHLIGHTING SALIENT FEATURES OF A PDB ENTRY
In order to convey salient features and annotation in the context of 3D structure, PDBe has developed an animated widget called PDBportfolio (pdbe.org/portfolio), Figure 1. It presents a slide show of images that convey important information and value-added annotation about a selected PDB entry or entries. The legend of each image contains more details as well as links to relevant web pages at PDBe or external resources. The slide show covers:
Quaternary structure—the largest assembly identified by the depositors or PISA (11).
Deposited model—a cartoon and a surface representation are shown separately. The cartoon is coloured by polymer chain and shown with non-polymeric entities as space-filling (CPK) models. The surface is coloured by atomic properties using some simple rules as defined in PyMol (pymol.org) (12). In the case of protein–DNA/RNA complexes, the image shows only the protein surface for clarity.
Domain structure—separate images show SCOP (13), CATH (14) and Pfam (15) domains as annotated by the SIFTS resource (16). Each domain is highlighted using a coloured cartoon and its boundaries are further highlighted by a semi-transparent surface of the same colour. Different surface styles are used to distinguish multiple occurrences of the same domain type.
Ligands—the binding environment of at most three bound chemical compounds is shown. Compounds that are most likely crystallization agents (such as glycerol) are ignored.
Experiment-dependent information—for X-ray crystal structures, temperature-factor information is shown on the structure and red surface patches indicate where crystal contacts occur. For NMR entries, the entire ensemble of models is shown. For 3DEM entries, the EMDB (17,18) map is shown if available, with the PDB entry fitted into it.
PDBportfolio is used to display information about every PDB entry on its PDBe Atlas page (e.g. pdbe.org/1cbs). The widget can also be used freely in external web pages to convey key information about one or more PDB entries. The control buttons on the interface allow users to manipulate the slide show. They may also download an archive with all the PDBportfolio images of an entry (as well as the PyMol scripts used to generate them), or view all images and legends in one web page.
UniPDB: UniProt-PDB SEQUENCE COVERAGE
UniPDB is a widget that provides a graphical display of the sequence coverage in the PDB of any UniProt (19) entry (pdbe.org/unipdb), Figure 2. Proteins encountered in PDB entries may contain partial sequences (e.g. one or more stably folded domains), chimeric sequences, fusions with other proteins and all manner of modifications to the wild-type sequence. Some modifications occur naturally, whereas others are introduced by experimenters to facilitate purification or crystallization, or to allow investigation of the effect of a mutation on the behaviour of the protein (such as catalytic activity or ligand-binding specificity). In addition, the structure of a protein may have been determined many times, e.g. in different laboratories, using different techniques, under different conditions, or in complex with different ligands or other biomacromolecules. UniPDB provides an intuitive, graphical overview of the structural information available for a particular UniProt entry, based on mapping data provided by the SIFTS resource (16). It also provides annotation of the Pfam (15) domains occurring in the sequence. For every PDB entry that contains (a part of) the selected UniProt sequence, PDBlogos (9) instantly reveal if it is an X-ray, NMR or EM structure, if it contains DNA or ligands, etc. From the UniPDB widget, a FASTA search of the PDB for related sequences can be launched; the results of this search are presented in the PDBeXplore browser (9).
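The notion of sequence coverage used above can be made concrete with a small calculation. The following Python sketch is illustrative only and is not the UniPDB implementation: it assumes that the residue-level mappings have already been reduced to inclusive (start, end) ranges in UniProt numbering, and the names `merge_segments` and `coverage_fraction`, as well as the example ranges, are hypothetical.

```python
from typing import List, Tuple

# Inclusive (start, end) residue range in UniProt numbering (assumed input format).
Segment = Tuple[int, int]


def merge_segments(segments: List[Segment]) -> List[Segment]:
    """Merge overlapping or directly adjacent residue ranges."""
    merged: List[Segment] = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1] + 1:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def coverage_fraction(segments: List[Segment], sequence_length: int) -> float:
    """Fraction of the sequence covered by at least one mapped range."""
    covered = sum(end - start + 1 for start, end in merge_segments(segments))
    return covered / sequence_length


# Hypothetical ranges observed in three PDB entries for a 200-residue sequence.
example_segments = [(10, 60), (50, 90), (150, 180)]
example_merged = merge_segments(example_segments)
example_coverage = coverage_fraction(example_segments, 200)
```

Merging the ranges before counting ensures that residues observed in several PDB entries are counted only once.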
PDBeXpress: PDB ANALYSIS TOOLS
PDBeXpress (pdbe.org/express) is an umbrella name for a collection of easy-to-use and powerful PDB analysis tools. Most of these use the PDBeMotif (20) web service as the underlying tool, while others access the PDBe search database directly. At present, there are two PDBeXpress modules in production and several others are under development. The first two modules can be used to answer the following common questions:
What residues are found in the binding sites of a given compound? Using PDBeMotif, PDBeXpress retrieves the residues with which a ligand interacts as observed in current PDB entries. A ligand can be selected by providing its name or three-character PDB identifier. The results are presented as a graph that shows the relative occurrence of the amino acids in the binding sites of the compound (extracted from the PDBeMotif database), Figure 3. There are options to view the PDB entries in which these interactions occur, or to perform further analyses using PDBeMotif. The graphs and the data can be downloaded.
What compounds are known to bind a given set of residues? PDBeMotif is used to retrieve all ligands observed in the PDB to interact with a given set of amino acids and the results are again shown in an interactive graph. This tool can be used to generate hypotheses about the type of compounds that could conceivably bind in a pocket or cavity, given the nature of the residues that line it.
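Both questions amount to simple tallies over a table of observed ligand-residue contacts. The Python sketch below illustrates the idea under stated assumptions: it does not call PDBeMotif or any PDBe service, the input format (a ligand identifier paired with the residue names in contact with it) is assumed, and the function names and example data are hypothetical and invented for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple

# One observed binding site: (ligand identifier, residue names in contact with it).
BindingSite = Tuple[str, Tuple[str, ...]]


def residue_profile(sites: Iterable[BindingSite], ligand: str) -> Dict[str, float]:
    """Relative occurrence of each residue type in the binding sites of one ligand."""
    counts = Counter(res for lig, residues in sites if lig == ligand for res in residues)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: n / total for name, n in counts.most_common()}


def ligands_for_residues(sites: Iterable[BindingSite], query: Set[str]) -> List[Tuple[str, int]]:
    """Ligands whose binding sites contain every queried residue type, with site counts."""
    hits = Counter(lig for lig, residues in sites if query <= set(residues))
    return hits.most_common()


# Hypothetical contact data; not taken from the PDB.
observed_sites: List[BindingSite] = [
    ("ATP", ("LYS", "GLY", "GLY", "THR")),
    ("ATP", ("LYS", "GLY", "ARG", "SER")),
    ("HEM", ("HIS", "CYS", "PHE")),
    ("ADP", ("LYS", "GLY", "THR")),
]
atp_profile = residue_profile(observed_sites, "ATP")
lys_gly_ligands = ligands_for_residues(observed_sites, {"LYS", "GLY"})
```

The first function corresponds to the per-compound residue graph and the second to the reverse lookup from a set of residues to candidate ligands.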
The Electron Microscopy Data Bank (EMDB) was established at the EBI in 2002 (17) and is now managed and developed in collaboration with the RCSB and Baylor College of Medicine (18). In addition to the joint EMDB portal (EMDataBank.org), there are some EM-related resources at PDBe as well, which have recently been reorganized, restyled and expanded (pdbe.org/emdb). The data held in EMDB constitute a treasure trove of information on the state of, and trends in, the 3DEM field. Examples of interesting information that can be mined from the archive include trends in the resolution of EM studies and the size of the structures that have been deposited. Specialist users may also be interested in comparisons of the relative popularity of microscopes and software packages. EMstats (pdbe.org/emstats) is a new service that mines the database for such information and presents the results as interactive charts that are generated dynamically and represent the current state of the information in the database. The graphical elements of the charts (pie diagrams, histograms, etc.) are active, which means that clicking on them results in a query to the database, the results of which are shown below the chart, Figure 4.
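A trend such as the resolution of EM studies over time can be summarized by grouping entries by year. The following Python sketch shows one way to do this; it is not the EMstats implementation, the choice of the median as summary statistic is an assumption, and the records and the function name `median_resolution_by_year` are hypothetical.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, List, Tuple


def median_resolution_by_year(entries: Iterable[Tuple[int, float]]) -> Dict[int, float]:
    """Group (year, resolution in angstrom) records by year and take the median."""
    by_year: Dict[int, List[float]] = defaultdict(list)
    for year, resolution in entries:
        by_year[year].append(resolution)
    return {year: median(values) for year, values in sorted(by_year.items())}


# Hypothetical (release year, resolution) records; not actual EMDB data.
records = [(2008, 25.0), (2008, 12.0), (2008, 18.0), (2009, 9.5), (2009, 14.5), (2010, 8.0)]
trend = median_resolution_by_year(records)
```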
PDBe provides a variety of NMR-related data to the scientific community (pdbe.org/nmr) (8,9). A statistics page listing the number of NMR entries in the PDB for which additional information is held at PDBe or elsewhere is now available and updated weekly. In the past year, we have added access to the logRECOORD database (22) that contains recalculated structures (using a log-normal potential for interpreting NOEs) for more than 300 NMR entries in the PDB.
Vivaldi (Hendrickx et al., manuscript in preparation) is an interactive graphical web tool aimed at both expert and non-expert users of NMR structural data (pdbe.org/vivaldi), Figure 5. It allows visualization of NMR ensembles and individual structures together with associated experimental data (such as chemical shifts, distance restraints and residual dipolar couplings, RDCs) and derived validation-related information. The latter is partly generated using the PDBe services OLDERADO (23) and VASCO (24) and partly extracted from the external NRG-CING database (nmr.cmbi.ru.nl/NRG-CING). Vivaldi uses the OpenAstexViewer (25) to present 3D displays of one or more models from the ensemble, Figure 5A. By default, the most representative model as identified by OLDERADO is shown. A separate interactive 1D graph displays any of a variety of validation scores or counts, such as the number of distance-restraint violations per residue, deviations of the chemical shift values from statistical averages as reported by VASCO, Figure 5B, or the fit between calculated and deposited RDCs. The 1D and 3D displays are coupled, which means that analysis and validation results can be inspected simultaneously as a function of residue number and in the context of the 3D model. Both the 3D structure views and the 1D graphs can be saved as high-resolution images for use in publications or presentations. In addition, an information panel offers explanations of the different views in plain English text as well as detailed residue-specific information. Vivaldi also has a user-friendly ‘wizard’ option to help users obtain a particular view, subject to data availability. Some of these views are also accessible directly from the PDBe Atlas pages of NMR entries as well as from OLDERADO and VASCO report pages.
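A per-residue score such as the number of distance-restraint violations can be derived from a list of restraints with a simple count. The Python sketch below is illustrative and does not reproduce how Vivaldi or NRG-CING compute their values: the restraint format, the 0.5 Å cutoff and the function name `violations_per_residue` are all assumptions made for this example.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# (residue number of atom 1, residue number of atom 2, violation in angstrom)
Restraint = Tuple[int, int, float]


def violations_per_residue(restraints: Iterable[Restraint], cutoff: float = 0.5) -> Dict[int, int]:
    """Count violated restraints per residue; the cutoff is an assumed threshold."""
    counts: Counter = Counter()
    for res_a, res_b, violation in restraints:
        if violation > cutoff:
            # A violated restraint counts once for each distinct residue it involves.
            for residue in {res_a, res_b}:
                counts[residue] += 1
    return dict(sorted(counts.items()))


# Hypothetical restraints; not taken from any PDB entry.
example_restraints = [(5, 12, 0.8), (5, 6, 0.1), (12, 30, 0.6), (7, 7, 0.9)]
per_residue = violations_per_residue(example_restraints)
```

A table of this form, keyed by residue number, is the kind of data that a 1D graph can plot and that a coupled 3D view can map onto the model.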
OTHER NEW OR IMPROVED SERVICES
Chemistry and taxonomy-based structure browsers
PDBeXplore (pdbe.org/explore) is a browsing interface for retrieving and analysing information on subsets of structures in the PDB using various biological and chemical classifications (9). Previously released PDBeXplore modules enable browsing of the contents of the PDB based on Enzyme Class (26) (pdbe.org/ec), CATH domains (14) (pdbe.org/cath), Pfam families (15) (pdbe.org/pfam) or FASTA-based (27) sequence-similarity searches (pdbe.org/fasta). These browser modules retrieve results much faster than before and they have all been updated to include clickable pie charts that allow further refinement of the queries. In addition, two new browser modules have been released. A chemistry-based module (pdbe.org/compounds) enables analysis of all PDB entries that contain a chemical compound, while a taxonomy-based module (pdbe.org/taxonomy) allows users to retrieve and analyse all protein structures in the PDB for any taxonomy level. Taxonomy information is taken from the well-established NCBI taxonomy database (28,29). The browser module also provides easy access to the top 15 species present in the largest number of PDB entries.
Atlas entry pages
PDBe Atlas pages provide a summary of a PDB entry in a user-friendly layout and serve as a starting point for further exploration of sequence, structure, chemistry and function information related to that entry. The summary Atlas pages have been improved with several ‘action buttons’ that allow one-click access to commonly used functionality [e.g. downloading the PDB file, viewing the structure in 3D, launching PDBeFold (30) to find similar structures in the PDB, or accessing the PISA (11) results]. The summary Atlas pages now also contain a table that lists all UniProt entries contained in the entry and action buttons to launch either a sequence search of the entire PDB or the UniPDB widget (see above) for each of these UniProt entries. The ligand Atlas pages now provide links to the ChEMBL (bioactivity data; https://www.ebi.ac.uk/chembldb/) (31) and ChEBI (chemical annotation; www.ebi.ac.uk/chebi/) (32) resources at the EBI. Finally, the experiment-related Atlas pages for NMR entries now contain links to any NMR-related resources at PDBe and BMRB as well as to the Vivaldi viewer for interactive analysis of the structure in the light of experimental and validation-related data (see above).
Quips (‘Quite Interesting Pdb Structures’; pdbe.org/quips) are short stories about one or more interesting or topical structures, coupled with an interactive viewer and, often, a tutorial that allows the reader to carry out more detailed exploration of a structure using PDBe resources, Figure 6. The interactive structure displays comprise a number of predefined (often animated) views to highlight concepts explained in the text. The tutorials assume that the reader has a background in biology, chemistry or medicine and an interest in proteins, nucleic acids and ligand interactions. New Quips articles are added about once a month.
PDB highlights pages (pdbe.org/highlights) reveal PDB entries that are extreme in one sense or another, such as their age, their resolution, the number or length of the macromolecules contained in them, etc. Extreme entries can be listed separately for X-ray, NMR and EM structures or for the entire PDB archive.
Weekly updates of the PDB and EMDB archives can be monitored conveniently at pdbe.org/latest. This service provides lists of new, modified and removed PDB entries, of new and modified chemical compounds in the PDB, and of newly released and modified EMDB maps or summary files (‘headers’). Every entry is shown in a panel with core information and an image; the panels can be expanded to reveal more information as well as action buttons to access commonly used services or files. Result lists can be downloaded as easily parsable text files, and new RSS feeds provide similar information (see pdbe.org/rss for a list of available feeds).
Funding: European Molecular Biology Laboratory (EMBL); Wellcome Trust (grant number 088944); European Union (226073); UK Biotechnology and Biological Sciences Research Council (BB/G022577/1 and BB/E007511/1); National Institutes of Health (R01GM079429-01A1). Funding for open access charge: The Wellcome Trust.
Conflict of interest statement. None declared.
The authors wish to thank all collaborators and partners in the EBI, EMBL, wwPDB, EMDB, BMRB, CCPN, CCP4, CCDC and other collaborative efforts, as well as the structural biology community for depositing its structures and experimental data in the PDB, BMRB and EMDB.