The Immuno Polymorphism Database (IPD), http://www.ebi.ac.uk/ipd/, is a set of specialist databases related to the study of polymorphic genes in the immune system. The IPD project works with specialist groups or nomenclature committees who provide and curate individual sections before they are submitted to IPD for online publication. The IPD project stores all the data in a set of related databases. IPD currently consists of four databases: IPD-KIR, which contains the allelic sequences of killer-cell immunoglobulin-like receptors; IPD-MHC, a database of sequences of the major histocompatibility complex of different species; IPD-HPA, which covers alloantigens expressed only on platelets; and IPD-ESTDAB, which provides access to the European Searchable Tumour Cell-Line Database, a cell bank of immunologically characterized melanoma cell lines. The data is currently available online from the website and FTP directory. This article describes the latest updates and additional tools added to the IPD project.
The study of the immune system encompasses many different complex areas of research. The Immuno Polymorphism Database (IPD) is a set of specialist databases related to the study of polymorphic genes of the immune system. The IPD project works with specialist groups and/or nomenclature committees, each of which curates a different section of the project. IPD currently consists of four databases: IPD-KIR, which currently contains the allelic sequences of human killer-cell immunoglobulin-like receptors (KIR); IPD-MHC, a database of sequences of the major histocompatibility complex (MHC) of different species; IPD-HPA, which covers alloantigens expressed only on platelets; and IPD-ESTDAB, which provides access to the European Searchable Tumour Cell-Line Database (ESTDAB), a cell bank of immunologically characterized melanoma cell lines. By providing a centralized resource for the work of different groups, we hope to bring together similar data to aid the study and analysis of this area. The IPD project stores all the data in a set of related databases. Those sections with similar data, IPD-KIR and IPD-MHC, share the same database structure. Sharing a common database structure makes it easier to implement common tools for data submission and retrieval. Other unrelated sections, such as IPD-ESTDAB, currently have their own unique structure.
IPD provides a large number of tools for the analysis of KIR and non-human MHC sequences. These tools are either custom-written for the database or incorporated into existing tools on the European Bioinformatics Institute (EBI) website (1,2); see Figure 1.
These tools include the following:
Sequence alignments: access to an alignment tool, which filters pre-generated alignments to the user's specification. Alignments are provided at the protein and nucleotide level.
Allele queries: access to detailed information on any allele, including information on database cross-references and seminal publications. This information is also available through integration in the Sequence Retrieval System service at EBI (3).
Downloads: access to a directory containing all the data from the current and previous releases in a variety of commonly used formats like FASTA, MSF and PIR.
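As an illustration of working with the downloadable files, the following is a minimal sketch of reading FASTA-formatted sequence data into memory. It assumes only the generic FASTA convention (a header line starting with ">" followed by one or more sequence lines); the function name, allele names and sequences are invented for the example and do not reflect the exact header layout of the IPD release files.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {header: sequence} mapping."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading ">"
            records[header] = ""
        elif header is not None:
            records[header] += line  # join wrapped sequence lines
    return records


# Hypothetical two-record example; headers and sequences are invented.
EXAMPLE = """>ALLELE_A example description
ATGTCGCTC
ATGGTC
>ALLELE_B example description
ATGTCGCTT
"""

records = parse_fasta(EXAMPLE.splitlines())
for name, sequence in records.items():
    print(name, len(sequence))
```

To read a downloaded file instead of the inline example, an open file handle can be passed to parse_fasta in place of the list of lines.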
The MHC sequences of many different species have been reported (6–17), along with the different nomenclature systems used in the naming and identification of new genes and alleles in each species (18). The MHC sequences of a number of different species are highly conserved (19). The nomenclature for MHC genes and alleles in species other than humans (20,21) and mice (22,23) has historically been overseen either informally by groups generating sequences, or by formal nomenclature committees set up by the International Society for Animal Genetics (ISAG) (24). This work is now overseen by the Comparative MHC Nomenclature Committee and is supported by ISAG and the Veterinary Immunology Committee of the International Union of Immunological Societies (25). By bringing together the work of different nomenclature committees and the sequences of different species, we hope to provide a central resource that will facilitate further research on the MHC of each species and on their comparison (26).
The first version of the IPD-MHC database involved the work of groups specializing in non-human primates (NHP) (16), canines (DLA) (12) and felines (FLA) (27) and incorporated all data previously available in the IMGT/MHC database (26). Since the first version, we have been able to add sequences from cattle (BoLA) (17), teleost fish (28), rats (RT1) (29), sheep (OLA) (15) and swine (SLA) (14). In 2012, the nomenclature used to describe the alleles of NHP was extensively revised and updated (16). The IPD-MHC NHP section was updated to complement this publication; IPD-MHC NHP currently contains >4000 alleles covering 47 species of Apes, Old World and New World Monkeys. The management of the sequences within IPD-MHC and the provision of an online submission tool have enabled these databases to grow, with the number of sequences increasing by at least 10% each year, and have allowed the nomenclature to expand once a species has been included within IPD. This has resulted in regular publications reporting updates or changes to the nomenclature (15–17,30).
The KIRs, formerly called killer-cell inhibitory receptors, are members of the immunoglobulin superfamily. KIRs have been shown to be highly polymorphic at both the allelic and haplotypic levels (31). They are composed of two or three Ig-domains, a transmembrane region and a cytoplasmic tail, which can be either short (activatory) or long (inhibitory). The leukocyte receptor complex, which encodes KIR genes, has been shown to be polymorphic, polygenic and complex in a manner similar to the MHC. Because of the complexity of the KIR region and KIR sequences, a KIR Nomenclature Committee was established in 2002 to undertake the naming of human KIR allele sequences. The first KIR Nomenclature report was published in 2003 (32), which coincided with the first release of the IPD-KIR database. The number of officially named human KIR alleles has increased since the initial release, which contained 89 alleles. As of September 2012, there are >600 alleles, which code for >320 unique protein sequences.
A recent development on the IPD-KIR website is the addition of online tools to assist in predicting transplant outcome in unrelated haematopoietic stem cell transplantation, based on the KIR content of the individuals involved. In 2008, a tool was added to the website to help predict NK cell alloreactivity based on the KIR ligands present in the patient and donor, as transplant strategies based on KIR-ligand mismatches had been shown to influence relapse, graft versus host disease and survival in patients with acute myeloid leukaemia (33). In 2010, with the goal of developing a donor selection strategy to improve transplant outcome, Cooley et al. (34) compared the contribution of KIR gene motifs with the clinical benefit conferred by donors with a particular haplotype. Donor KIR genotype influenced transplantation outcome for some forms of leukaemia after a T-cell replete unrelated donor transplant. KIR genotyping of several HLA-matched potential donors could substantially increase the frequency of transplants using unrelated donor grafts with favorable KIR gene content. To implement this strategy, the IPD-KIR database was asked to provide an online version of the algorithm described in the article (Figure 2). The B-Content calculator (http://www.ebi.ac.uk/ipd/kir/donor_b_content.html) allows the user to enter the KIR genotypes for up to five prospective donors and to receive their B-Content assignments, together with a prediction of the effect of the KIR genotype on transplant outcome. To ensure that only valid KIR genotypes are submitted, all submitted genotypes are compared with a list of predicted genotypes based on known KIR haplotypes. This list has been supplemented with a number of additional KIR genotypes that have been defined in routine KIR typing. If a prospective donor's KIR typing does not match any of the genotypes on this list, a warning is issued.
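The genotype validation step described above can be sketched as follows. This is a minimal illustration, not the code used by the B-Content calculator: it assumes a genotype can be represented as the set of KIR genes present, and the reference list shown here is a small invented example rather than the list used by IPD-KIR. The B-Content assignment itself is not sketched, because the algorithm is defined in the cited article (34) rather than in this text. The names MAX_DONORS, KNOWN_GENOTYPES and check_donors are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set

# The calculator accepts up to five prospective donors (as stated in the text).
MAX_DONORS = 5

# Hypothetical reference list; each genotype is the set of KIR genes present.
# These gene combinations are illustrative only and are not taken from IPD-KIR.
KNOWN_GENOTYPES: Set[FrozenSet[str]] = {
    frozenset({"KIR3DL3", "KIR2DL3", "KIR2DL1", "KIR2DL4",
               "KIR3DL1", "KIR2DS4", "KIR3DL2"}),
    frozenset({"KIR3DL3", "KIR2DS2", "KIR2DL2", "KIR2DL4",
               "KIR3DS1", "KIR2DL5", "KIR2DS1", "KIR3DL2"}),
}


def check_donors(donors: Dict[str, Set[str]]) -> List[str]:
    """Return one warning per donor whose genotype is not on the list."""
    if len(donors) > MAX_DONORS:
        raise ValueError(f"at most {MAX_DONORS} donors can be submitted")
    warnings: List[str] = []
    for donor_id, genes in donors.items():
        if frozenset(genes) not in KNOWN_GENOTYPES:
            warnings.append(
                f"{donor_id}: KIR typing does not match any listed genotype"
            )
    return warnings


# Hypothetical submission: donor_1 matches the first listed genotype,
# donor_2 does not match any listed genotype.
submission = {
    "donor_1": {"KIR3DL3", "KIR2DL3", "KIR2DL1", "KIR2DL4",
                "KIR3DL1", "KIR2DS4", "KIR3DL2"},
    "donor_2": {"KIR3DL3", "KIR2DL4", "KIR3DL2"},
}

donor_warnings = check_donors(submission)
for message in donor_warnings:
    print("WARNING:", message)
```

Storing the genotypes as frozensets makes the comparison independent of the order in which genes are entered, which is one plausible reason to prefer sets over ordered lists for this kind of check.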
In addition to allelic polymorphism, there is haplotypic variability within the KIR region owing to differences in the number and kind of KIR genes present. Haplotypic diversity is a major contributor to the population diversity of KIR and of cell repertoires. A number of fully sequenced KIR haplotypes have been reported (35,36). The IPD-KIR website collates these data into a graphic displaying the gene and allele content of these haplotypes (Figure 3). Where possible, the alleles sequenced at the individual KIR genes are also listed. Some genes are only partially sequenced, and for these an allele designation is not provided. The graphic provides interactive links to the individual allele entries, as well as details on the cell source used for each haplotype.
The IPD-KIR database is also being expanded to include the KIR sequences from other species, and most recently work has begun on including the sequences of KIR alleles found in Rhesus Macaques (Macaca mulatta) (37,38). Work is also underway on a number of other species of Macaques as well as Chimpanzees, Orangutans and Domestic Cattle (39–42). The non-human KIR sequences will be included in the IPD-KIR section and will be accessible using the same tools as the human KIR sequences. Submission tools are already available for depositing non-human KIRs into the database.
Human platelet antigens (HPA) are alloantigens expressed only on platelets, specifically on platelet membrane glycoproteins. These platelet-specific antigens are immunogenic and can result in pathological reactions to transfusion therapy. The HPA nomenclature system was adopted in 1990 (43,44) to overcome problems with the previous nomenclature. Since then, more antigens have been described and the molecular basis of many has been resolved. As a result, the nomenclature was revised in 2003 (45) and included in the IPD project. The IPD-HPA section contains nomenclature information and additional background material. The different genes in the HPA system have not been sequenced to the same level as some of the other projects and so currently only single-nucleotide polymorphisms are used to determine alleles. This information is presented in a grid of single-nucleotide polymorphisms for each gene. The IPD and HPA nomenclature committee hope to expand this to provide full sequence alignments when possible.
IPD-ESTDAB is a database of immunologically characterized melanoma cell lines. The database works in conjunction with the European Searchable Tumour Cell Line Database (ESTDAB) cell bank (46,47), which is housed in Tübingen, Germany and provides immunologically characterized tumour cells. The IPD-ESTDAB section of the website provides an online search facility for cells stored in this cell bank. This enables investigators to identify cells possessing specific parameters important for studies of immunity, immunogenetics, gene expression, metastasis, response to chemotherapy and other tumour biological experimentation. The search tool allows for searches based on a single parameter, or clusters of parameters on >250 different markers for each cell. The detailed reports produced can then be used to identify cells of interest, which can be obtained from the cell bank.
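The idea of searching on a single parameter or on a cluster of parameters can be sketched as a simple filter over cell-line records. This is only an illustration of the concept: the record layout, cell identifiers, marker names and values below are invented, and the sketch does not represent the IPD-ESTDAB search tool or its underlying database.

```python
from typing import Dict, List

CellLine = Dict[str, str]

# Hypothetical records; identifiers, marker names and values are invented.
CELL_LINES: List[CellLine] = [
    {"id": "CELL-001", "marker_a": "positive", "marker_b": "negative"},
    {"id": "CELL-002", "marker_a": "positive", "marker_b": "positive"},
    {"id": "CELL-003", "marker_a": "negative", "marker_b": "positive"},
]


def search(cells: List[CellLine], **criteria: str) -> List[CellLine]:
    """Return the cells that satisfy every supplied marker criterion."""
    return [
        cell
        for cell in cells
        if all(cell.get(marker) == value for marker, value in criteria.items())
    ]


# Search on a single parameter.
single = search(CELL_LINES, marker_a="positive")

# Search on a cluster of parameters; all criteria must hold.
cluster = search(CELL_LINES, marker_a="positive", marker_b="positive")

print([cell["id"] for cell in single])
print([cell["id"] for cell in cluster])
```

Combining the criteria with a logical AND is an assumption made for this sketch; a real search interface may also offer other ways of combining parameters.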
In 2012, the IMGT/HLA Database added an Extensible Markup Language (XML) export to the data formats available (20). XML is a simple but flexible language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. Designed to meet the challenges of large-scale electronic publishing, XML is playing an increasingly important role in the exchange of scientific data. Work is currently underway to produce XML-formatted data files for the IPD-KIR and IPD-MHC databases as well.
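To illustrate why an XML export is convenient for machine processing, the following minimal sketch reads allele names and sequences from an XML document using the Python standard library. The element names, attribute names and content are entirely hypothetical: they were invented for this example and do not describe the schema of the IMGT/HLA export or of any planned IPD export.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML fragment; the structure and values are invented.
SAMPLE_XML = """<alleles>
  <allele name="EXAMPLE*001">
    <sequence>ATGTCGCTC</sequence>
  </allele>
  <allele name="EXAMPLE*002">
    <sequence>ATGTCGCTT</sequence>
  </allele>
</alleles>"""

root = ET.fromstring(SAMPLE_XML)

# Build a {allele name: sequence} mapping from the parsed tree.
alleles = {
    element.get("name"): element.findtext("sequence")
    for element in root.iter("allele")
}

for name, sequence in alleles.items():
    print(name, sequence)
```

Because the structure is explicit in the markup, a consumer can extract individual fields by name rather than relying on column positions or line layout, as is necessary with flat-file formats.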
The IPD project provides a resource for those interested in the study of polymorphic sequences in the immune system. By accommodating related systems in a single database, data can be made available in common formats, aiding use and interpretation. As the projects grow and more sections are added, the benefit of having expertly curated sequences from related areas stored in a single location is becoming more apparent. This is particularly true of the IPD-MHC project, where cross-species studies are able to use the high-quality sequences provided by the different nomenclature committees in a common standardized format, ready for use. The initial release of the IPD Database contained only four sections and a small number of tools; however, as the database has grown and more sections and species have been added, more tools have been added to the website. We plan to use the existing database structures to house data for new sections of the IPD project as they become available. The files will also be made available in different formats for download from the website and FTP server, and will be included in different web services at the EBI (1).
IPD Homepage: http://www.ebi.ac.uk/ipd/
IPD-KIR Homepage: http://www.ebi.ac.uk/ipd/kir/
IPD-MHC Homepage: http://www.ebi.ac.uk/ipd/mhc/
IPD-HPA Homepage: http://www.ebi.ac.uk/ipd/hpa/
IPD-ESTDAB Homepage: http://www.ebi.ac.uk/ipd/estdab/
If you are interested in contributing to the project, there are specific guidelines for the inclusion of new sections, and interested parties should contact James Robinson, email@example.com for further information.
The IPD projects have received funding from the European Commission within the Fifth Framework Infrastructures program [contract number QLRI-CT-2001-01325] for IPD-ESTDAB and from a National Institutes of Health grant [NIH/NCI P01 111 412] for IPD-KIR. The work of the IPD databases is recognized and supported by the International Union of Immunological Societies (IUIS) for KIR nomenclature, through the IUIS KIR Nomenclature Committee, and by the International Society for Animal Genetics (ISAG) and the Veterinary Immunology Committee (VIC) for MHC nomenclature. Funding for open access charge: Anthony Nolan Research Institute.
Conflict of interest statement. None declared.
The authors would like to acknowledge the work of all the individual nomenclature committees for both the MHC and HPA sections, as well as our ESTDAB collaborators. The authors would also like to acknowledge the support provided by the External Services Group at the European Bioinformatics Institute, which allows the IPD project to be hosted within the EBI infrastructure.