DNASU plasmid and PSI:Biology-Materials repositories: resources to accelerate biological research

The mission of the DNASU Plasmid Repository is to accelerate research by providing high-quality, annotated plasmid samples and online plasmid resources to the research community through the curated DNASU database, website and repository (http://dnasu.asu.edu or http://dnasu.org). The collection includes plasmids from grant-funded, high-throughput cloning projects performed in our laboratory, plasmids from external researchers, and large collections from consortia such as the ORFeome Collaboration and the NIGMS-funded Protein Structure Initiative: Biology (PSI:Biology). Through DNASU, researchers can search for and access detailed information about each plasmid such as the full length gene insert sequence, vector information, associated publications, and links to external resources that provide additional protein annotations and experimental protocols. Plasmids can be requested directly through the DNASU website. DNASU and the PSI:Biology-Materials Repositories were previously described in the 2010 NAR Database Issue (Cormier, C.Y., Mohr, S.E., Zuo, D., Hu, Y., Rolfs, A., Kramer, J., Taycher, E., Kelley, F., Fiacco, M., Turnbull, G. et al. (2010) Protein Structure Initiative Material Repository: an open shared public resource of structural genomics plasmids for the biological community. Nucleic Acids Res., 38, D743–D749.). In this update we will describe the plasmid collection and highlight the new features in the website redesign, including new browse/search options, plasmid annotations and a dynamic vector mapping feature that was developed in collaboration with LabGenius. Overall, these plasmid resources continue to enable research with the goal of elucidating the role of proteins in both normal biological processes and disease.


INTRODUCTION
Plasmids form a cornerstone in modern molecular and structural biology research laboratories. Considerable time, expertise and resources are expended cloning genes into the appropriate vector to express recombinant proteins for both in vivo functional studies or to produce purified proteins for biochemical studies. Many laboratories that work with limited budgets or have limited experience with the molecular biology techniques required for gene cloning create significant demand for pre-made plasmids. Furthermore, after laboratories have created and used plasmids for experiments and referenced them in publications, they are frequently lost or forgotten in the back of freezers, often known only to the students or postdocs who have since left the laboratory, thus losing a physical resource that would be valuable to the broader scientific community.
Commercial, nonprofit and laboratory groups have recognized the need to archive plasmids and make them available to the research community [selected references (1)(2)(3)(4)(5)(6)(7)]. These plasmid collections often focus on a particular theme or specialized collection. For example, the NIAID-funded BEI resources focuses on genes from organisms that cause infectious disease, the Drosophila Genomics Resource Center provides cell lines and plasmids from one model organism, Drosophila melanogaster, the nonprofit Addgene primarily distributes plasmids submitted by individual researchers after publication, and the for-profit company Invitrogen focuses on large collections based on a specific technology (such as Gateway Õ or shRNA expression) or an organism (such as human open reading frames or yeast BACS). All have the goal of providing plasmids to researchers, though they differ widely in the plasmids that they distribute as well as the costs, annotations, website accessibility and customer support.
The DNASU plasmid repository was established at the Biodesign Institute at Arizona State University as a nonprofit service core with the goal of providing sequence-verified, highly-annotated plasmids to researchers though an easy-to-use and accessible website (http://dnasu.org). In addition to functioning as a repository for plasmids that are created within our center, DNASU also stores and distributes plasmids created by individual external researchers and by collaborative groups of investigators like the ORFeome Collaboration (8,9) and the Protein Structure Initiative:Biology (PSI:Biology) (1), a multicenter biologically focused structural biology effort. DNASU focuses both on storing valuable plasmids that researchers and consortiums have created and on distributing these highly-annotated plasmids and plasmid collections to researchers worldwide. The goal is for researchers to avoid the time and effort needed to reclone genes so that they can focus immediately on their experiments, thus accelerating the pace of discovery.

Plasmid collection overview
Since the first report in 2010 (5,10), the number of plasmids stored and distributed through DNASU has nearly doubled to 198 000 plasmids containing gene inserts (e.g. cDNA, genomic DNA, intergenic regulatory element, etc.) from over 700 species. Over half of these plasmids contain genes from eukaryotic organisms, 45% from prokaryotes, with the remainder from archaea, viruses, and metagenome cloning projects. Over 74 000 of these plasmids contain human genes, representing 13 000 unique human genes.
As the PSI:Biology-Materials Repository (PSI:Biology-MR), 81 000 protein expression plasmids, 40% of the collection, were created and deposited by PSI:Biology centers. Of these, 3300 were used to solve 3D protein structures that have been deposited in the Protein Data Bank (PDB). Each of these plasmids has been fully sequence-verified at the PSI:Biology-MR and links to additional annotations, experimental protocols and contact information for the depositing laboratories to facilitate in the expression and purification of individual proteins.
To enhance functional annotation of the plasmids, we recently scanned protein sequences of the entire DNASU collection for 1067 protein domains by HMMER2 (http:// hmmer.janelia.org/) and the SMART HMM (hidden Markov model) motifs (http://smart.embl-heidelberg.de/) (11) with an E value threshold of 0.05. Based on this analysis, 978 domains were found in the collection. As an example, across the human proteome, the Gateway Õ donor collection covers 45% to 90% of the 34 most abundant domains that are found in more than 100 proteins, including RHO GTPases (90% coverage), protein kinases (88%), the protein-protein interaction BTB domain (82%), the phosphotyrosine-binding SH2 domain (80%) and the multifunctional WD40 domain (74%) ( Table 1). The domain information has been implemented into DNASU as a part of the new browse feature (described in more detail below).
To expedite systems biology and other high-throughput studies, DNASU also provides preorganized plasmid collections to researchers at a reduced recharge cost. DNASU distributes whole-genome collections from yeast and various pathogens as well as specialized collections of human genes, such as kinases or genes involved in breast cancer (listed here http://dnasu.org/DNASU/ Browse.do?tab=0).

Vectors
Genes in DNASU are cloned into vectors for numerous downstream applications including bacterial, yeast, mammalian, baculovirus or cell-free expression. In addition to plasmids containing gene inserts, the repository distributes 138 cloning or expression empty vectors ( Table 2). These vectors, especially those created by the PSI centers, have been optimized for high-level protein expression and purification in bacteria, wheat germ extract (12,13), mycoplasma (14) or yeast (15) and use different tag configurations and protease cleavage sites to accommodate a variety of experimental designs. Many of these vectors also take advantage of fast ligationindependent technologies such as Gateway Õ (16), Ligation Independent Cloning (LIC) [representative publications (17,18)] and Polymerase Incomplete Primer Extension (PIPE) (19), allowing for rapid and easy cloning of a gene insert into the vector of interest. Vector annotations on the DNASU website often include links to the PSI:Biology Structural Biology Knowledgebase (SBKB) Technology Portal (http:// technology.lbl.gov/) (20), which provides detailed descriptions of the cloning or expression technology. Details and publications about the most common cloning methods used for DNASU plasmids can also be found in the new Resources section (http://dnasu.org/DNASU/Resources/ index.jsp).
Downloading DNASU data Several options to download plasmid data are available to researchers who wish to analyze the entire collection using their own informatics tools, to sort search results or to track all of the plasmids they have ordered. On the BLAST search page (http://dnasu.org/DNASU/Search Options.do?tab=4), data for DNASU plasmids are available as FASTA-formatted files containing either all nucleotide or amino acid sequences with their associated plasmid IDs and vector names. These files are regularly updated when new plasmids are added to the repository. Individual search results can also be downloaded as an Excel file that includes the gene name, vector, growth conditions, gene insert sequence or collection information. Researchers may also download their ordering history and the list of plasmids in each order to track and find information on the plasmids they have requested from DNASU.

Search options
DNASU aims to facilitate research, and to accomplish this goal, we strive to help researchers find what they are looking for on the website quickly and easily. To this end, DNASU's five search options have been updated and six new browse features were created to provide flexibility in finding plasmids within the database. A key contribution has been to put search capabilities where researchers need them most. Thus, in the website redesign, a search box was added at the top of each page that will query the collection by any gene name or plasmid identifier. Since implementation, over one third of all searches are initiated through this universally-available search box.
The five standard searches (http://dnasu.org/DNASU/ SearchOptions.do) [described in detail in (1,10)] are now presented in individual tabs on a single page allowing researchers to see and access all available search options quickly without switching web pages. The Advanced Search, PSI:Biology Search and Vector Search provide searches by gene or vector name, author name, species, PDB ID, Target Track ID (for PSI plasmids), protein expression data (if available) or vector characteristics such as appended polypeptide tags, selectable markers or protein expression systems. Both the Plasmid ID and BLAST Searches allow a researcher to search the database with multiple queries simultaneously. The Plasmid ID Search accepts up to 50 mixed identifiers, including DNASU plasmid IDs, gene names and gene accession numbers, Only the plasmids in Gateway donor vectors were counted. Originally these search options were designed to perform optimally when querying a small database with several thousand entries. As the DNASU database has grown, the searches were optimized by introducing temporary lookup tables for every query. These lookup tables reduced the multiple database calls that fetched each detail for each clone, which significantly reduces the time it takes to perform large searches. After this optimization, searches with up to 20 different queries now average 2-3 s from clicking 'search' to receiving results.

Browse options
As DNASU has accumulated a large number of plasmids, researchers may want to see all plasmids with a specific feature of interest or available predefined collections. To help researchers more efficiently explore the collection, find what they are looking for, or find something new, we have created categorized lists of plasmids presented in dynamic, easy-to-navigate browse features (http:// dnasu.org/DNASU/Browse.do?tab=0). These six new browse options display preconfigured lists of annotations or information pulled directly from the database that summarize the materials available in DNASU, which provide a starting point to find plasmids based on particular criteria, such as species, protein domain or vector tag. The six browse features described below were chosen based on what we have observed researchers commonly search for, and designed in a visual format for easier use compared with a standard search feature.
(1) Collections. Browse all available preorganized collections. These collections include a set of human kinases, the ORFeome V8.1 collection both in a Gateway Õ entry vector and lentiviral expression vector (9), and whole genome collections from yeast and a variety of pathogens. In addition, we generated several collections that are based on widely studied protein families, such as kinases and phosphatases. These collections are stored on premade 96-well plates and can be requested at a discounted price. (2) Biological function. Browse for sets of plasmids with a particular function or feature, for example, all glycoenzymes, membrane proteins or G proteincoupled receptors. In order for researchers to obtain plasmid sets with specific biological functions, all clones were annotated for protein domains, and we implemented a plasmid browse function based on these domain names. This provides a powerful tool, allowing researchers to search the collection for plasmids representing a family of proteins that share specific domains or specific cellular functions (e.g. metabolic enzymes, kinases, DNA-binding proteins and membrane receptors) in a particular species (Table 1)

Plasmid information
For a researcher to use a plasmid from DNASU effectively and easily, the clone details were overhauled to provide detailed, well-annotated information about each plasmid on a single easy-to-navigate page that can be accessed directly from the search results. Plasmid information is now organized into three tabs containing information about the gene insert, the vector and annotations.

Gene details tab
The gene detail tab contains all information about the gene insert. This includes three main sections: (1) Synopsis. Information including gene name, institutions and researchers who deposited the plasmid, and publication information provided by the depositing scientist or researchers who previously requested the plasmid from DNASU. (2) Sequence data. Nucleotide sequence information, including the entire insert sequence (open reading frame highlighted in gray) and the flanking sequences (when available). This section also provides on-thefly protein translation and information about any mutations in the nucleotide or protein sequence. (3) Plasmid handling. The antibiotic selection required to grow the plasmid. For the majority of PSI plasmids, protein expression information, including the recommended expression host, the expression and purification results (if already expressed by the PSI center), and links to Target Track, which contains the experimental protocols for expression and purification are also listed.
Also on this tab is a box with information about price, availability, Material Transfer Agreement (MTA) details and a button to request the plasmid.

Vector details tab
The plasmid's vector information, found on the second tab, lists institutions and researchers who created the vector, the full sequence and the map that was provided by the depositing scientist. If available, a link to the empty vector, which can be used as a control in an experiment or for cloning, is displayed.
The new Dynamic (DyNA) Vector Map widget was created in collaboration with LabGenius ( Figure 2A). The map displays vector features, and scrolling over a Figure 1. Screen shot of the dynamic vector browse feature. Clicking on any of the headings (tag, protease cleavage site or promoter) will sort the list to cluster the vectors by that feature dynamically. Clicking on the vector name will link to vector details pages and ordering information. feature reveals a pop-up containing the start and stop positions of the feature and its description. These were manually annotated by the repository to ensure accurate display of features. Clicking 'Show restriction sites' maps selected restriction sites onto the vector, with unique sites indicated by an asterisk. The DyNA Map can be downloaded with or without the restriction sites and includes a table of all features and restriction sites along with their positions. The Advanced Viewer links out to the LabGenius website, which provides additional free tools to analyze the vector sequence further, design primers for cloning or sequencing and perform virtual restriction digests.

Annotations tab
The goal of the ever-expanding annotations tab is to provide links to additional information about the gene insert within the plasmid. These annotations can be roughly categorized into nucleotide, protein and experimental annotations [summarized in Table 3 (20)(21)(22)(23)(24)]. For plasmids that were used to determine the structure of a protein, we created a new structure viewer that presents a 2D image of the structure and a Jmol-enabled applet to explore the 3D structure ( Figure 2B).

Requesting plasmids
Researchers can request plasmids and plasmid collections directly through the DNASU website by registering for a free account, finding the plasmid(s) of interest and adding the plasmid(s) to the cart. All plasmids are stored as bacterial glycerol stocks in 2D-barcoded tubes in a Brooks BioStore freezer storage and retrieval system, allowing fast robotic cherry picking of samples as soon as an order is placed. In order to offset the costs of preparing the plasmid for shipment and to ensure the viability of the repository in the future, we have implemented a nominal recharge fee with discounts provided for full 96-well plates of plasmids and special predetermined collections to facilitate high-throughput studies. Over 4200 orders containing 200 000 plasmids have been shipped worldwide from DNASU to researchers in 47 states and 40 countries.
When researchers request plasmids, they typically require them for an experiment urgently. With the goal of eliminating the delays caused by waiting for the MTA to be reviewed and signed, we created an Expedited MTA Network [described in detail in (10)]. With the MTA terms agreed to in advance and institutional signatures already in place, plasmids can be sent immediately after an order is electronically placed. With over 260 Member institutions, half of all registered users are members of Expedited MTA Network institutions and >60% of all orders from DNASU benefit from being part of this growing Network. In the past 4 years, this system has been improved by providing Technology Transfer Officials (TTO) at member institutions two new reporting options. First, each TTO automatically receives an email, which includes information about who placed the order and what was ordered, at the time of shipping. Second, the TTOs have 24/7 secure access to a reporting feature on DNASU listing all orders and order details from their institution. These features, requested by Member institutions, allow TTOs to keep track of materials entering their institutions to ensure that their internal regulations are followed.
DNASU is dedicated to assisting researchers. To fulfill that goal, all questions about DNASU can be directed to a PhD-level scientist at a dedicated service email address dnasuhelp@asu.edu, which will be answered within 24 h.
Depositing plasmids DNASU openly encourages researchers to deposit plasmids in the repository. Depositors will be credited for sharing their materials and will benefit from having a secure archive site by being alleviated from the burden of storage, maintenance and distribution of their plasmids. The general research community benefits from having plasmids rapidly available from a central source. There is a standardized three-step process to deposit plasmids. First, we work with the depositing institution to establish the Depositor Agreement, the legal agreement that sets forth the terms by which DNASU can distribute plasmids deposited by that institution. This step is often fast if there is already a Depositor Agreement from that institution on record. Second, the researcher will provide information about each plasmid in an Excel spreadsheet. This is the information that will be presented on the Plasmid Details pages on the DNASU website. Finally, they will send the plasmid sample(s). We will transform the DNA and sequence the gene insert with an end-read to confirm its identity before releasing it publicly in DNASU. Researchers interested in submitting plasmids to DNASU can find details on the website http://dnasu.org/DNASU/ Submission.jsp and the PhD-level Scientific Liaison will facilitate each step of the submission process.

FUTURE DIRECTIONS
As DNASU continues to grow, we maintain the mission of storing and distributing plasmids while expanding the information for each plasmid available through DNASU. Toward this end, we plan to expand the collection by partnering with researchers to deposit plasmid collections with DNASU. To ensure accurate and up-to-date plasmid information, we are implementing an automated gene annotation framework to compare gene insert sequences with the most recent reference sequence in NCBI and Ensembl. In addition, annotations will be expanded by providing pathway and protein-protein interaction information for genes and visually displaying these in interactive browse features, which will allow researchers to search and build customized clone sets based on gene functions of interest. Finally, by collaborating with laboratory and research management websites, such as LabGuru and linking plasmids to their publications and additional experimental resources, we hope to integrate the plasmids in DNASU into researchers' experimental designs seamlessly. With the ever-increasing time and monetary constraints within laboratories, we expect that