aglgenes, a curated and searchable database of archaeal N-glycosylation pathway components

Whereas N-glycosylation is a posttranslational modification performed across evolution, the archaeal version of this protein-processing event presents a degree of diversity not seen in either bacteria or eukarya. Accordingly, archaeal N-glycosylation relies on a large number of enzymes that are often species-specific or restricted to a select group of species. As such, there is a need for an organized platform on which information about archaeal glycosylation (agl) genes can be amassed. To this end, the aglgenes database provides detailed descriptions of experimentally characterized archaeal N-glycosylation pathway components. For each agl gene, genomic information, supporting literature and relevant external links are provided through a functional, intuitive web interface designed for data browsing. Routine updates ensure that novel experimental information on genes and proteins contributing to archaeal N-glycosylation is incorporated into aglgenes in a timely manner. As such, aglgenes represents a specialized resource for sharing validated experimental information online, providing support for workers in the field of archaeal protein glycosylation.
Database URL: www.bgu.ac.il/aglgenes

Introduction
N-glycosylation, the covalent attachment of oligosaccharides to selected Asn residues of target proteins, was once thought to be a posttranslational modification unique to eukarya. It has since become clear that bacteria and archaea are also capable of this protein-processing event. However, while bacterial N-glycosylation is restricted to certain delta-epsilonproteobacteria strains (1), it appears that this protein modification is a common event in archaea (2). At the same time, archaeal N-glycosylation presents variety not seen in the parallel eukaryal or bacterial processes. This diversity is manifested in terms of glycan composition and architecture, the lipid carrier on which the N-linked glycan is assembled and the identity of the sugar that links the glycan to the lipid carrier or the target protein Asn residue (3).
In the past decade, considerable strides have been made in identifying components that catalyze steps of the archaeal N-glycosylation process, using genetic, biochemical and mass spectrometry approaches. In particular, detailed N-glycosylation pathways based on archaeal glycosylation (Agl) proteins have been delineated in the halophile Haloferax volcanii and in the methanogens Methanococcus maripaludis and Methanococcus voltae (3-6). Insight into the process of N-glycosylation in the thermoacidophile Sulfolobus acidocaldarius has also been provided (7). For the most part, these pathways rely on species-specific pathway components or components restricted to a group of species. Examination of the limited number of archaeal N-linked glycans for which structural information has been reported points to the existence of numerous different N-glycosylation pathways (3). Given that a recent analysis of 168 available archaeal genomes identified genes encoding AglB, the oligosaccharyltransferase central to N-glycosylation, in 166 cases (2), it would appear that the list of known archaeal N-glycosylation pathway components is only the tip of the iceberg.

© The Author(s) 2014. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

With the aim of collecting all of the experimentally obtained information on archaeal N-glycosylation pathway components into a single location that will allow comparative analysis, we have assembled the aglgenes database and created a Web site that allows users access to the information contained in the database. The database allows users to search according to species, protein class or gene name to obtain all currently available genomic, biochemical, structural and functional information on archaeal N-glycosylation pathway components. The routinely updated aglgenes database can be found at www.bgu.ac.il/aglgenes.

Aims of the database
The primary aims of the aglgenes database are to list all available experimental information on agl (archaeal glycosylation) genes and proteins that are involved in archaeal N-glycosylation pathways and to provide this information in a user-friendly web interface.

Methods
All available experimental data pertaining to the components of archaeal N-glycosylation pathways or individual enzymes shown to be involved in archaeal N-glycosylation were collected from the literature and manually curated. The data are stored and maintained in a relational database using the MySQL database management system (http://www.mysql.com/). A specific naming convention was established to uniquely identify each agl gene entry. Each gene ID starts with the letters 'agEL' followed by two digits (e.g. agEL05). The Web site interface uses embedded MySQL queries, PHP5 code to execute the queries and HTML5-CSS3 code to format the query results. New information will be added to the Web site on a monthly basis or whenever such information becomes available. A complete layout of the architecture of the aglgenes database is presented in Figure 1.
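The naming convention described above can be illustrated with a short Python sketch. It is not taken from the aglgenes code base, which the text describes as PHP5 with embedded MySQL queries; the function names `is_valid_agl_id` and `make_agl_id` are hypothetical, and the only assumption carried over from the text is that an ID consists of the prefix 'agEL' followed by exactly two digits.

```python
import re

# Assumed pattern, based only on the description in the text:
# the literal prefix 'agEL' followed by exactly two digits (e.g. agEL05).
AGL_ID_PATTERN = re.compile(r"agEL\d{2}")


def is_valid_agl_id(candidate: str) -> bool:
    """Return True if the whole string matches the described ID convention."""
    return AGL_ID_PATTERN.fullmatch(candidate) is not None


def make_agl_id(serial: int) -> str:
    """Build an ID from a serial number (hypothetical helper).

    A two-digit suffix can only represent serial numbers 0 through 99.
    """
    if not 0 <= serial <= 99:
        raise ValueError("a two-digit suffix only covers serial numbers 0-99")
    return f"agEL{serial:02d}"
```

Under this reading, `make_agl_id(5)` yields the example ID given in the text, `agEL05`, whereas strings such as `agEL5` or `agEL005` would be rejected by `is_valid_agl_id`.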

Usage
aglgenes provides the user with an intuitive web interface that does not require any particular training. The latest version of any web browser is recommended for the usage of the aglgenes database. The database includes a single search field that allows the user to search through all published experimental data on genes and proteins involved in archaeal N-glycosylation, according to gene name or protein function. Alternatively, users can retrieve all known N-glycosylation pathway components from a given archaeal species. The query results are displayed in a tabular format with the gene's aglgenes ID, name, function and source organism. Clicking the aglgenes ID link opens a new web page containing all information for the selected agl gene. The results of a sample query are featured in Figure 2.
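The two retrieval modes described above, a free-text search over gene name or protein function and a listing by species, could be sketched as follows. This is an illustration only: it uses Python's built-in sqlite3 module as a stand-in for the MySQL back end, and the table name `agl_gene`, its column names and the two demonstration rows (including their IDs) are assumptions made for the example rather than the actual aglgenes schema or content.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with placeholder rows (not real aglgenes entries)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE agl_gene ("
        "agl_id TEXT PRIMARY KEY, name TEXT, function TEXT, organism TEXT)"
    )
    # The IDs below are placeholders chosen for illustration.
    rows = [
        ("agEL01", "aglB", "oligosaccharyltransferase", "Haloferax volcanii"),
        ("agEL02", "aglB", "oligosaccharyltransferase", "Methanococcus voltae"),
    ]
    conn.executemany("INSERT INTO agl_gene VALUES (?, ?, ?, ?)", rows)
    return conn


def search_genes(conn: sqlite3.Connection, term: str) -> list:
    """Single-field search: match the term against gene name or function."""
    pattern = f"%{term}%"
    cur = conn.execute(
        "SELECT agl_id, name, function, organism FROM agl_gene "
        "WHERE name LIKE ? OR function LIKE ? ORDER BY agl_id",
        (pattern, pattern),  # parameters are bound, never interpolated into SQL
    )
    return cur.fetchall()


def genes_by_organism(conn: sqlite3.Connection, organism: str) -> list:
    """Return all pathway components recorded for one species."""
    cur = conn.execute(
        "SELECT agl_id, name, function, organism FROM agl_gene "
        "WHERE organism = ? ORDER BY agl_id",
        (organism,),
    )
    return cur.fetchall()
```

Each returned row carries the same four columns that the results table is described as displaying: aglgenes ID, name, function and source organism. Binding the search term as a query parameter, rather than pasting it into the SQL string, is a general precaution for any web-facing search field.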
Each retrieved aglgenes gene entry includes gene information, such as gene name and locus tag, the organism(s) containing the gene and available functional information on the protein product. In addition, each entry indicates whether the gene of interest is found as part of an aglB-based N-glycosylation gene cluster. AglB, the archaeal oligosaccharyltransferase responsible for transfer of a glycan from the lipid carrier on which it is assembled to target protein Asn residues (4,8,9), is a central component of the archaeal N-glycosylation process. Other details, such as genomic context and relevant references, are also provided for each aglgenes gene entry. Links to external sources, such as UniProt, NCBI and PDB, are provided for each entry according to availability. Table 1 lists all fields in a given aglgenes entry and their descriptions.
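As a rough illustration of how such an entry might be represented in code, the following Python sketch groups the kinds of information named above into a single record. The class name `AglGeneEntry` and all field names are hypothetical; they follow the prose description and are not the field names listed in Table 1.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AglGeneEntry:
    """One gene entry, with hypothetical field names based on the text."""

    agl_id: str                 # aglgenes ID, e.g. 'agEL05'
    gene_name: str
    locus_tag: str
    organisms: List[str]        # organism(s) containing the gene
    function: str               # functional information on the protein product
    in_aglb_cluster: bool       # part of an aglB-based gene cluster?
    genomic_context: str = ""
    references: List[str] = field(default_factory=list)
    # External source name (e.g. 'UniProt', 'NCBI', 'PDB') mapped to a link,
    # or to None when no record exists in that source.
    external_links: Dict[str, Optional[str]] = field(default_factory=dict)

    def available_links(self) -> Dict[str, str]:
        """Return only the external links that are actually present."""
        return {source: url for source, url in self.external_links.items() if url}
```

The `available_links` method reflects the statement that external links are provided according to availability: sources without a record are simply omitted when an entry is displayed.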
Finally, an online data submission page can also be found at the aglgenes database Web site to allow users to submit updates on existing entries or new information on archaeal N-glycosylation pathway components.

Conclusions
Genome-based studies suggest that N-glycosylation is a common posttranslational modification in Archaea (2). At the same time, because sequence-based analysis can only provide limited insight into the precise function of sugar-processing enzymes, deciphering archaeal N-glycosylation pathways has been limited to a restricted number of species for which appropriate molecular tools are available (3,6,7). However, as an increasing number of different archaeal species are being adopted as model systems, the techniques and tools required for detailed delineation of N-glycosylation pathway components are starting to appear. Such efforts have revealed the largely species-specific nature of N-glycosylation pathway components. Indeed, given the enormous variety seen in terms of the content and structure of even the few characterized N-linked glycans decorating archaeal glycoproteins, the recruitment of species-specific pathways is not surprising. On the other hand, as more data accumulate, it seems that certain sugar-processing enzymes are involved in N-glycosylation pathways in more than one organism. As a repository of all available experimentally confirmed information on archaeal N-glycosylation, the aglgenes database allows researchers interested in archaea and/or in N-glycosylation to learn more about this posttranslational modification that, in some cases, is associated with the ability of these organisms to survive the extreme environmental conditions they encounter (10).