The current version of the androgen receptor (AR) gene mutations database is described. The total number of reported mutations has risen from 272 to 309 in the past year. We have expanded the database: (i) by giving each entry an accession number; (ii) by adding information on the length of polymorphic polyglutamine (polyGln) and polyglycine (polyGly) tracts in exon 1; (iii) by adding information on large gene deletions; and (iv) by providing a direct link with a completely searchable database (courtesy of the EMBL-European Bioinformatics Institute). The addition of the exon 1 polymorphisms is discussed in light of their possible relevance as markers for predisposition to prostate or breast cancer. The database is available on the internet ( http://www.mcgill.ca/androgendb/ ), from EMBL-European Bioinformatics Institute ( ftp://www.ftp.ebi.ac.uk/pub/databases/androgen ), or as a Macintosh FilemakerPro or Word file ( MC33@musica.mcgill.ca ).
Constitutional mutations in the X-linked androgen receptor gene ( AR ) cause the androgen insensitivity syndrome (AIS) by impairing androgen-dependent male sexual differentiation to various degrees ( 1–7 ). Somatic mutations in the AR have been found in advanced prostate cancer ( 8 , 9 ). Severe constitutional androgen insensitivity (AI) yields an external female phenotype. Partial constitutional AI yields a range of external genital phenotypes that vary from near-normal female to normal or near-normal male, with or without gynecomastia and other relatively ‘mild’ signs of undervirilization.
The appearance of the database has been modified slightly this year ( Table 1 ). An accession number has been assigned to each entry in the database to facilitate the construction of a completely searchable database ( Fig. 1 ). The searchable database is directly linked to the database home page ( Fig. 2 ). In addition to the core mutation maps for AIS phenotypes and prostatic cancer ( Fig. 3 ), we have complemented the database by constructing maps of large gene deletions ( Fig. 4 ), and we have reclassified the number and variety of mutation types ( Table 2 ).
This version of the database contains 309 entries ( Table 2 ) representing over 200 different AR mutations ( Figs 3 and 4 ) from over 360 patients with AI, over 35 cases of advanced or metastatic prostate cancer, and one case of laryngeal cancer.
The increase in the number of reported AR mutations this past year has been relatively small compared with last year. Between September 1996 and September 1997 ( Fig. 5 ), only 37 new mutations were registered, just over half the number registered last year ( 10 ). Last year's increase in registrations stemmed largely from the extraordinary number of reported mutations in prostate cancer ( 10 ). The trend in the previous few years has been a gradual decline in the number of new mutations reported, reflecting the increasing difficulty of publishing reports of single mutations. This difficulty raises the question of what the criteria for including mutations in the database should be. Until now it has been the policy of the database curator to include mutations that have been published in a refereed journal or as an abstract in the proceedings of a refereed scientific meeting. In light of the obvious trend not to publish individual mutations, perhaps it is time to consider an additional criterion: membership in a consortium composed of laboratories with a peer-proven publication record on the AR .
For the first time, the database contains illustrations of AR deletions of >5 bp. A recent analysis of patients with complete AR deletions has indicated that those who have complete AIS (CAIS) and mental retardation (MR) probably have a contiguous gene syndrome because of an MR locus between the polymorphic markers DXS1 and DXS905 at Xq11.2–12 ( 11 ).
The database has recently been expanded to include two well-known trinucleotide repeat polymorphisms of the AR . These polymorphisms encode a series of variable-length glutamine (CAG) and glycine (GGC) repeats in exon 1 of the AR gene. The CAG repeat is of special interest because its expansion causes the motor neuron disease spinobulbar muscular atrophy (SBMA) ( 12 ). It was later discovered that a number of other neurodegenerative diseases are caused by similar CAG expansions in a variety of unrelated genes whose normal function is, in most cases, still not established.
The reason for including these polymorphisms in this database, however, derives from recent work, partly in our laboratory, suggesting: (i) that there is a shift in the distribution of CAG-repeat sizes in the hAR genes of breast cancer tissue ( 13 ); (ii) that CAG-repeat sizes may act as molecular markers for prostate cancer risk ( 14 , 15 ); and (iii) that codon-usage variants and GGN-tract sizes may be used to seek associations with particular diseases ( 16 ).
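As an illustration of what a repeat size means in practice, the following minimal sketch counts the codons in the longest uninterrupted CAG or GGN tract of a nucleotide sequence. It is not the procedure used to compile the database: the function name `longest_tract`, the regular-expression encoding of a repeat unit, and the toy sequence are all assumptions made for this example, and the toy sequence is invented rather than taken from AR exon 1.

```python
import re


def longest_tract(seq: str, unit: str) -> int:
    """Return the number of repeat units in the longest uninterrupted tract.

    `unit` is a regular expression for one 3-nt repeat unit, for example
    "CAG" or "GG[ACGT]" (GGN). The zero-width lookahead tries every start
    position, so the result does not depend on the reading frame.
    (Hypothetical helper written for this illustration.)
    """
    pattern = re.compile(rf"(?=((?:{unit})+))")
    runs = (len(m.group(1)) // 3 for m in pattern.finditer(seq.upper()))
    return max(runs, default=0)


# Invented toy sequence (NOT AR exon 1): five CAG codons, a CAA spacer,
# then the three GGN codons GGC GGT GGC.
toy = "ATG" + "CAG" * 5 + "CAA" + "GGCGGTGGC" + "TAA"

print(longest_tract(toy, "CAG"))       # 5
print(longest_tract(toy, "GG[ACGT]"))  # 3
```

Matching GGN rather than GGC alone means that a synonymous codon such as GGT does not break the tract. The CAG pattern, by contrast, is exact, so an interrupting CAA codon ends the run even though it also encodes glutamine.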
Ultimately, we aim to prepare a three-dimensional map of the mutations that affect the structure-function properties of the ligand-binding domain of the protein. In anticipation of that goal, we have begun to position the mutations on a two-dimensional model of the domain in a typical nuclear receptor. Figure 6 shows the positions of two mutations that share an unusual set of androgen-binding properties.
Database Availability And Citation
The database is available on the internet at http://www.mcgill.ca/androgendb/ , and via anonymous FTP at ftp://www.ftp.ebi.ac.uk/pub/databases/androgenr . It can also be obtained as a Macintosh FilemakerPro or Word file from Bruce Gottlieb ( MC33@musica.mcgill.ca ). The internet database is updated every month; the other sources are updated every 3–4 months. Users are requested to cite the present article when the database has been helpful in preparing their publications.
We thank the Medical Research Council, Canada, the Fonds de la Recherche en Santé, Québec and the Fonds pour la Formation de Chercheurs et l'Aide à la Recherche, Québec for supporting our own work on AR mutations.