Finland, located at the edge of the inhabitable world, is one of the best-studied genetic isolates. The characteristic features of population isolates—founder effect, genetic drift and isolation—have, over the centuries, shaped the gene pool of the Finns. Finnish diseases have been a target of extensive genetic research and the majority of some 35 disease genes enriched in this population have been identified; the molecular and cellular consequences of disease mutations are currently being characterized. Special strategies taking advantage of linkage disequilibrium have been efficiently used in the initial mapping and restriction of Finnish disease loci and this has stimulated development of novel statistical approaches in the disease gene hunt. Identification of mutated genes has provided tools for detailed analyses of molecular pathogenesis in Finnish diseases, many of which reveal a distinct tissue specificity of clinical phenotype. Often these studies have not only clarified the molecular detail of Finnish diseases, but also provided novel information on biological processes and metabolic pathways essential for normal development and function of human cells and tissues.
The Population of Finland
The dual theory of the inhabitation of Finland, supported by recent analyses of Y chromosome haplotypes, assumes an early migratory wave of eastern Uralic speakers some 4000 years ago with a distinct effect on today's Finnish gene pool. The majority of the genes of today's Finnish population is thought to originate from later small founder populations which in the beginning of the first millennium immigrated from the south over the Gulf of Finland ( 1–6 ). This migration of Indo-European speakers some 2000 years ago was also accompanied by language replacement ( Fig. 1 A; 7 ).
Since there is no archaeological, historical, linguistic or genetic evidence for any single major migratory movement to Finland, it is actually more likely that Finland has been inhabited without interruption from the last glacial period, constantly receiving small but significant immigrant groups, mainly from the south and east, but also from the west, throughout prehistoric times.
Both paternally inherited Y chromosome haplotypes and maternally inherited mitochondrial sequences show an exceptional decrease in the genetic diversity of the Finns when compared with other European populations ( 7 ). The reasons for the relative isolation of the Finnish population throughout the centuries are mostly geographical. The Baltic Sea surrounds the country to the south and west and the Arctic Ocean to the north. The geopolitical position between Sweden and Russia, between two areas of distinct culture, language and religion, also had an impact on the isolation of Finland. For centuries, only a narrow strip of land in the coastal areas of the south and southwest was populated and the number of settlers must have been small ( 7 ). In the 12th century the population of Finland was only ∼50 000. The population fluctuated and reached 250 000 in the 16th century, habitation still being heavily concentrated in the coastal areas.
An internal migration movement began in the 16th century from a small southeastern area to the middle, western and finally northern and eastern parts of the country ( Fig. 1 B). With increasing population there was pressure to cultivate more land. Another main reason for internal migration was to avoid the increasing taxation of the Swedish Crown, which had developed a good administrative infrastructure, even reaching into the backlands of Finland. Especially during the reign of the Swedish King Gustavus Vasa in 1523–1560, settlement of the wilderness was highly favored. His reign also resulted in the creation of church records, a major resource for later genetic studies of the Finns. These records report births and deaths, marriages and moves of families and provide a highly reliable source of genealogical information, especially since >90% of the population still belongs to the Evangelical Lutheran State Church.
Within a century the inhabited land area of Finland doubled. As the settlers continued to the north, the Saami native minority (Lapps) was forced to move towards the Arctic Ocean. By the end of the 17th century almost the entire land area of Finland was sparsely but permanently populated by Finns. The great famine during the years 1696–1698 and the epidemics that followed were a setback, as one-third of the population of 400 000 was lost. Since then, the Finnish population has grown relatively rapidly from 250 000 at the beginning of the 18th century to its present 5 100 000 ( Fig. 2 ). The internal migration in the 16th century resulted in isolated rural populations that remained surprisingly stable until the Second World War and industrialization.
The Concept of the Finnish Disease Heritage
Twenty rare inherited disorders more prevalent in Finland than in other populations were initially described in 1973 and this introduced the concept of the Finnish disease heritage ( 8 ). Today this group includes >30 typically recessive diseases ( 5 , 9 ) and new diseases with well-defined clinical phenotypes are still being added ( 10 ).
The Finnish disease heritage has its origins in the special population history of Finland, briefly described above. The small number of original founders, followed by isolation, rapid expansion and the sampling of small immigrant groups from the main population, allowed the founder effect and genetic drift to mold the gene pool, especially in the regional subisolates ( 2 , 3 , 8 ). The frequency of some rare diseases increased, creating the Finnish disease heritage, whereas some disease alleles became almost non-existent (e.g. phenylketonuria, galactosemia, maple syrup urine disease and cystic fibrosis). In Finnish subisolates a small number of founders unavoidably resulted in consanguineous marriages, although relationships between individuals were typically several generations old and unknown to them. This random inbreeding increased the local incidence of rare recessive disorders and in some Finnish diseases a regional clustering of cases can still be observed, reflecting the regional origin of the founders. Good examples are the cluster of families with progressive epilepsy with mental retardation (EPMR) ( 11 ) on the eastern border and the similar cluster of the variant form of late infantile neuronal ceroid lipofuscinosis (vLINCL) cases on the western coastline ( Fig. 3 ; 12 ). It should be emphasized that in most Finnish diseases solid evidence of the true relationship between affecteds cannot be obtained. The church records reach back to 1634 and the land tax registers to the 1540s and any claim for older relationships can only be based on the origin of founders in nearby or the same communities.
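The way a small founder group followed by expansion can either enrich or eliminate a rare allele may be illustrated with a toy simulation. The sketch below is a simple Wright-Fisher style model and is not taken from any of the cited studies; the function name `simulate_drift`, the founder number, the starting allele frequency and the growth rate are all hypothetical values chosen only for illustration.

```python
import random
from typing import List


def simulate_drift(n_founders: int, p0: float, generations: int,
                   growth: float = 1.0, max_chrom: int = 20000,
                   seed: int = 0) -> List[float]:
    """Track the frequency of one allele under random drift.

    Each generation the chromosome count is multiplied by `growth`
    (capped at `max_chrom`) and every new chromosome carries the allele
    with probability equal to its frequency in the previous generation.
    """
    rng = random.Random(seed)
    n_chrom = 2 * n_founders          # diploid founders
    p = p0
    trajectory = [p]
    for _ in range(generations):
        n_chrom = min(max_chrom, max(2, int(n_chrom * growth)))
        carriers = sum(1 for _ in range(n_chrom) if rng.random() < p)
        p = carriers / n_chrom
        trajectory.append(p)
    return trajectory


# Hypothetical scenario: 50 founders, allele frequency 2%, 20 generations,
# 20% growth per generation, repeated for 100 independent subisolates.
final = [simulate_drift(50, 0.02, 20, growth=1.2, seed=s)[-1] for s in range(100)]
lost = sum(1 for f in final if f == 0.0)
enriched = sum(1 for f in final if f > 0.04)
print(f"allele lost in {lost} of 100 runs, more than doubled in {enriched}")
```

Under these assumed parameters the same starting frequency ends in loss in some replicate populations and in marked enrichment in others, which is the qualitative pattern described above for the regional subisolates.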
Identification of Finnish Disease Loci
The Finnish disease heritage has been the subject of intensive genetic and molecular research over the past 10 years. The loci for 32 of these diseases have been localized and the causative genes for 17 diseases have been isolated, many of them by positional cloning ( Table 1 ) and, according to expectations, all show extreme locus and allelic homogeneity. The molecular data have unequivocally proved the isolation of and significant founder effect in this population. None of these diseases shows locus heterogeneity within Finland, although some, like neonatally lethal Meckel syndrome or late onset frontal lobe dementia with bone cysts (PLO-SL), demonstrate locus heterogeneity outside Finland ( 40 , 44 ). Furthermore, for all the cloned Finnish disease genes, one major mutation is present in ≥70% of the disease chromosomes, representing as much as 98% of some disease alleles ( Table 1 ; 15 , 23 , 45 ). Since some of the diseases are spread throughout the country, they must be the result of an old mutation event carried to Finland with an early immigrant. One major mutation still identified in all diseases is evidence for effective isolation of the population for the past 2000 years.
As a direct consequence of one major founder mutation, linkage disequilibrium (LD), detected with multiple polymorphic markers flanking the disease locus, is obvious in all alleles of the Finnish disease heritage. The genetic interval showing some degree of LD with individual markers at a particular chromosomal region varies considerably in different disease alleles ( Table 2 ). The longer the observed LD interval, the more remarkable is the detectable clustering of grandparents of patients and the more recent the introduction of the mutation into the population ( Fig. 3 ; 3). However, the genetic properties of chromosomal regions and the different demographic histories of Finnish subpopulations also have an impact on the size of the LD interval. The extremes of LD intervals are 13 cM observed in congenital chloride diarrhea and 11 cM in vLINCL chromosomes ( 12 , 46 ).
The first mapping efforts of Finnish diseases were completed using RFLP markers and traditional linkage analyses in a relatively large number of families. This is exemplified by the locus positioning of diastrophic dysplasia (DTD) and infantile neuronal ceroid lipofuscinosis (INCL) mapped with 15 (DTD) and 13 (INCL) families with two or more affected children ( 47 , 48 ). When rarer or neonatally lethal diseases became the target of mapping efforts, special LD-based strategies were applied and genome scans relying on monitoring of genotypes among affecteds or homozygosity of marker genotypes were successfully used. Initial positioning of the locus for the infantile form of spinocerebellar ataxia (IOSCA) was performed using DNA samples of only four patients to search for shared alleles of multiallelic markers ( 35 ). Two cases were from the same sibship with the parents being first cousins and two from two sibships with affecteds being separated by seven meioses. A sparse marker set of only 218 markers was used (average marker interval 20 cM) in the initial scan of these four patients and second stage genotyping of all families and segregation-based linkage analyses were only performed for those markers revealing sharing of the alleles. The benefits in the saving of labor and money are obvious; the disease locus was positioned using only a total of 1000 genotypings instead of the >10 000 needed for the traditional, linkage-based strategy in more mixed populations.
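The first-stage screen described above can be sketched in a few lines. This is only an illustration of the idea and not the procedure used in ref. 35: the marker names, the toy genotypes and the function `shared_allele_markers` are hypothetical, and the selection rule (keep a marker when at least one allele is carried by every affected individual) is a simplifying assumption.

```python
from typing import Dict, List, Tuple

Genotype = Tuple[int, int]  # the two allele labels observed at one marker


def shared_allele_markers(genotypes: Dict[str, List[Genotype]]) -> List[str]:
    """Return the markers at which all affecteds carry a common allele."""
    hits = []
    for marker, per_patient in genotypes.items():
        common = set(per_patient[0])
        for genotype in per_patient[1:]:
            common &= set(genotype)   # keep alleles seen in every patient so far
        if common:
            hits.append(marker)
    return hits


# Toy data: four affecteds typed at three hypothetical multiallelic markers.
toy = {
    "M1": [(3, 5), (3, 7), (2, 3), (3, 3)],  # allele 3 is carried by all four
    "M2": [(1, 2), (2, 4), (1, 4), (4, 4)],  # no allele is carried by all four
    "M3": [(6, 6), (6, 8), (6, 9), (1, 6)],  # allele 6 is carried by all four
}
candidates = shared_allele_markers(toy)
print(candidates)

# First-stage workload with the figures given in the text:
# 218 markers typed in 4 patients.
first_stage_genotypings = 218 * 4
print(first_stage_genotypings)
```

Only the markers returned by the screen would go on to second-stage genotyping in all families, which is why the total stays near 1000 genotypings.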
Another example of the power of LD is the recent genome scan to identify the locus for a new lethal neonatal metabolic syndrome ( 10 ). The initial genome scan was carried out using only four affected sibpairs and searching for markers revealing IBS genotypes in affecteds. Consequent linkage analysis in the small family material consisting of only 20 affected and 16 non-affected chromosomes resulted in only suggestive pair-wise lod scores and failed to provide unequivocal evidence for the locus position. However, the linkage disequilibrium ( P < 0.0001) initially observed with three markers provided proof for the locus assignment. The multipoint linkage disequilibrium analysis (DISMULT) ( 49 ) combining information from four markers revealing LD resulted in an impressive lod score of 30.9 and confirmed the locus assignment to 2q33–37 ( 34 ).
Fine Mapping of Finnish Disease Loci
The concept of a restricted number of founders and rapid expansion of a population in relative isolation has similarities to the exponential growth of bacterial populations. This motivated Hästbacka et al. ( 47 ) to adapt bacterial genetics to interpret the LD information in a quantitative fashion using the Luria and Delbrück formula ( 50 ) originally designed to estimate mutation rate in rapidly growing bacterial cultures. The formulae allow estimation of the likely mean and standard deviation of the recombination fraction between a disease locus and a marker, the degree of the homogeneity of disease alleles and the mutation rate for genetic markers. The success of this strategy in the evaluation of the distance between a disease locus and a close marker was best demonstrated by the isolation of a diastrophic dysplasia gene precisely at the distance approximated in the original linkage map. The gene was predicted to lie 0.06 cM (or 60 kb) proximal of the best marker and was finally isolated 70 kb proximal of it ( 19 , 47 ). Reverse use of the Luria-Delbrück formula has also been adapted to estimate the age of the mutation. For congenital chloride diarrhea the estimate was 20 generations, in good agreement with the population history of the spread of this mutation in an expanding subpopulation in Eastern Finland starting 400 years ago ( 17 ). Monitoring the high LD in disease alleles has targeted the search for disease genes to a highly restricted DNA region in several successful positional cloning projects, including cloning of nephrin, a gene mutated in congenital nephrosis, and AIRE, a novel gene mutated in a multisymptomatic autoimmune disease, APECED ( 13 , 18 ).
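The logic of converting LD into a distance or an age can be shown with a deliberately simplified calculation. The sketch below is not the full Luria-Delbrück treatment of ref. 47: it assumes a single founder chromosome, no marker mutation and no correction for population growth, so that the excess of the ancestral marker allele on disease chromosomes decays as (1 − θ)^g over g generations. The function names and the example allele frequencies are hypothetical.

```python
import math


def p_excess(p_disease: float, p_normal: float) -> float:
    """Excess of the ancestral marker allele on disease chromosomes.

    p_disease: frequency of the allele on disease chromosomes
    p_normal:  frequency of the same allele on normal chromosomes
    """
    return (p_disease - p_normal) / (1.0 - p_normal)


def theta_from_ld(pex: float, generations: float) -> float:
    """Recombination fraction implied by pex = (1 - theta) ** generations."""
    return 1.0 - pex ** (1.0 / generations)


def generations_from_ld(pex: float, theta: float) -> float:
    """Reverse use: age of the mutation in generations for a known theta."""
    return math.log(pex) / math.log(1.0 - theta)


# Hypothetical marker: ancestral allele on 95% of disease chromosomes
# and on 3% of normal chromosomes; mutation assumed 100 generations old.
pex = p_excess(0.95, 0.03)
theta = theta_from_ld(pex, 100)
print(f"p_excess = {pex:.3f}, theta = {theta:.5f} (about {theta * 100:.3f} cM)")

# Reverse calculation: the same pex at a marker with theta = 0.001
print(f"estimated age = {generations_from_ld(pex, 0.001):.0f} generations")
```

For recombination fractions this small, θ can be read directly as a genetic distance (θ of 0.001 is roughly 0.1 cM). The published approach additionally accounts for the growth of the population, which this sketch omits.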
Monitoring an ancestral haplotype by fine mapping of chromosome regions showing linkage or LD has also helped in locus restriction in all Finnish disease alleles. A marker set providing 0.1–1 cM resolution has without exception revealed one major shared haplotype among affecteds and greatly facilitated multiple positional cloning efforts of Finnish disease genes. It should be emphasized that, as described earlier, although LD can be observed over wide genetic intervals, identical haplotypes of disease alleles in affecteds from different families are systematically found only across highly restricted DNA regions.
Two examples of powerful restriction of the critical DNA region based on a shared haplotype are progressive myoclonus epilepsy (EPM1; OMIM 254800) and PLO-SL (OMIM 221770), mentioned earlier. The obligatory recombinations observed in linkage analysis assigned the EPM1 locus to a 7 cM region on chromosome 21q. The haplotype shared among affected alleles in 38 families restricted the critical region to 176 kb ( 51 ). Similarly, the PLO-SL locus was, in a genome-wide scan, assigned to the 9 cM region bordered by obligatory recombinations and disease alleles showed LD with markers over a 3 cM interval on chromosome 19q. A shared haplotype was seen only over 150 kb in 14 families ( 40 ). These examples demonstrate the power of ancient recombinations in restriction of the critical DNA region to an interval accessible by molecular tools. All reported positional cloning projects on Finnish diseases have efficiently used haplotype-based restriction of the critical chromosomal DNA region, which has provided the basis for successful isolation of the disease gene.
Lightning Hitting the Same Spot Twice?
One interesting feature in the mapped disease loci or identified causative genes of the Finnish disease heritage is that some loci are, surprisingly, located within the same restricted chromosomal regions. On chromosome 17 two severe diseases affecting multiple tissues (Meckel syndrome and Mulibrey nanism) are positioned within the same 1.2 Mb DNA clone contig ( Table 1 ; 52 ). The disease-associated alleles carry different haplotypes, although in both diseases one major haplotype is obvious. LD is detected over an interval of 2 cM in Mulibrey nanism and of 1 cM in Meckel alleles, suggesting an earlier introduction of the Meckel mutation into the population. Similarly, on chromosome 19, the nephrin gene, mutated in congenital nephrosis, and the still unknown gene behind PLO-SL are located within the same 150 kb physical contig ( Table 1 ; 53 ). Again, a totally different haplotype was identified in these two disease alleles and the interval revealing LD indicated different ages for these two mutations. It is intriguing that independent disease mutations enriched in this population have hit so close to each other at historical times separated by tens of generations. Analyses of the full sequence information of these chromosomal regions could provide some explanation for this interesting feature.
Functional Analyses of Finnish Disease Genes
Thoroughly characterized phenotypes in patients with Finnish diseases provide a profound basis for functional studies of newly identified, often novel disease genes. This again has not only led to an understanding of the disease mechanisms but in multiple cases revealed novel biological information on essential metabolic pathways. Currently, a total of 17 genes of Finnish diseases have been isolated and for most of these some functional data already exist. Some genes have proved to code for essential molecules with highly organ-specific functions. They either show target tissue-specific expression, like nephrin in congenital nephrosis ( 18 , 54 ), chloride transporter in congenital chloride diarrhea ( 17 , 100 ) and FSHR in ovarian dysgenesis ( 22 , 56 ), or encode a component of a metabolic pathway that, although present in multiple tissues, is crucial for a specific tissue, like DTDST for cartilage in diastrophic dysplasia ( 19 , 55 ) or REP1 in choroideremia. The recently identified gene for an inherited autoimmune disease carries features of a transcription regulatory element (AIRE in APECED) ( 13 , 57 ) but the explanation for multiple tissue symptoms remains to be solved. Ten of the cloned genes in Finnish diseases are involved in neuronal dysfunction. In six of these, molecular studies have led to findings of general biological significance and exposed novel details of important metabolic pathways in the CNS. Functional studies of some representative examples are described briefly here.
Aspartylglucosaminuria—exposing new features of lysosomal enzymes
Studies on the molecular mechanisms behind aspartylglucosaminuria (AGU; OMIM 208400) provide a good example of how research on a rare disease has resulted in novel information as to the general biology of lysosomal enzymes. AGU results from deficient activity of lysosomal aspartylglucosaminidase (AGA) and the disease results in progressive mental retardation. Following identification of the gene the major disease-causing mutation (AGU Fin ) was shown to reside in 98% of disease alleles in Finland. Cellular expression studies of the normal and mutated enzyme provided an understanding of the complex intracellular synthesis, assembly and activation of this heterotetrameric enzyme molecule ( 15 , 58–60 , 64 ). Crystallographic analyses of the three-dimensional structure of the AGA enzyme revealed a novel catalytic mechanism based on the N-terminal nucleophile and indicated that AGA is the first eukaryotic member of a novel, recently identified enzyme family of N-terminal nucleophile hydrolases ( 61 , 62 ). Detailed structural analyses also led to the characterization of a novel three-dimensional determinant essential for phosphotransferase recognition and lysosomal targeting of AGA, potentially providing a general structural motif for the correct mannose 6-phosphorylation signal of all lysosomal enzymes ( Fig. 4 ; 63 ). Knock-out mouse models have replicated the tissue pathology of human disease amazingly well ( 65 , 66 ). Aga −/− mice demonstrate not only the characteristic lysosomal storage vacuoles in CNS neurons but MRI also revealed signs of brain atrophy similar to that seen in older AGU patients. By immunohistochemistry and MRI a subtle delay in myelination was also observed and, similar to the slow clinical course observed in human patients, Aga −/− mice have behavioral symptoms that emerge at an older age ( 67 ).
The AGU mouse model has further created possibilities to analyze the relationship between myelination and the AGA enzyme and laid the basis for the first therapeutic interventions in AGU ( 68–70 ).
Neuronal ceroid lipofuscinoses—a novel class of lysosomal disorders
Neuronal ceroid lipofuscinoses (NCL diseases) constitute a group of severe neurodegenerative diseases all revealing similar tissue pathology with autofluorescent ceroid lipofuscin accumulation bodies as well as hallmark clinical features, including blindness and early death of neocortical neurons. Three of these, i.e. the most severe infantile form, INCL (OMIM 256730), vLINCL (OMIM 256731) and EPMR (CLN8; OMIM 600143), belong to the group of Finnish diseases. The genes behind vLINCL and EPMR were only recently identified ( 29 ; A.-E. Lehesjoki et al. , unpublished data) and, consequently, functional data exist so far only for INCL. The causative gene for INCL encodes a palmitoyl protein thioesterase (PPT). This finding revealed for the first time the absolute necessity for correct removal of palmitoyl residues from lipid-modified proteins for post-natal survival of neocortical neurons ( 23 ). Somewhat surprisingly, PPT was shown to be a lysosomal enzyme, thus classifying INCL as a lysosomal storage disorder ( 71 , 72 ). Ninety-eight percent of Finnish INCL alleles carry the same Arg22Trp mutation, resulting in almost complete failure of transport of the mutant PPT into lysosomes ( 71 ). Analyses of the neuronal trafficking of PPT have shown that PPT is transported to the synaptic region in cortical neurons, implying an important role for PPT in synaptic transmission, most probably via interference with membrane recycling, which is extensive in neurons (O. Heinonen et al. , unpublished data). Recent data have revealed that other proteins are also defective in NCL diseases; battenin in CLN3 (OMIM 204200) ( 73 ) and pepinase in CLN2 (OMIM 204500) ( 74 ) represent lysosomal components. An intriguing question emerges as to whether these proteins or the pathways they are involved in interact, since the defects in all of them result in a surprisingly similar tissue pathology, lethal to neocortical neurons although the genes are expressed in all tissues.
Progressive myoclonus epilepsy (Unverricht-Lundborg disease), EPM1—linking minisatellite expansion to apoptosis
Progressive myoclonus epilepsy (PME, Unverricht-Lundborg disease; OMIM 254800) is characterized by tonic-clonic seizures, myoclonic seizures and progressive neurological dysfunction, including dementia and ataxia. Mutations in cystatin B (CSTB) are responsible for EPM1 disease ( 27 ), but the disease mechanism is so far unknown. In the majority of Finnish EPM1 patients (96% of disease alleles) the underlying mutation is an unstable minisatellite expansion in the promoter region of CSTB ( 75 , 76 ). In humans EPM1 shows a phenotypic triad consisting of myoclonic seizures, progressive neurological decline and occasional tonic-clonic seizures. Mice lacking CSTB develop myoclonic seizures and ataxia, similar to the symptoms seen in the human disease. The tissue pathology includes loss of cerebellar granule cells, which display condensed nuclei, fragmented DNA and other cellular changes characteristic of apoptosis ( 77 ).
The defective protein, cystatin B, belongs to a large class of proteins that inhibit cysteine proteases and in vitro experiments have shown that cystatin B is a reversible inhibitor of cathepsins B, H, L and S ( 78 , 79 ). Although the cathepsins are localized to lysosomes, cystatin B resides in the cytosol where it probably has a role in the regulation of proteolysis. This suggests that the neuronal pathology of EPM1 results from excessive proteolysis, most harmful to normal neuronal metabolism. The most likely mechanism for neuronal degradation is that the cathepsins, normally inhibited by cystatin B, directly activate the caspases, thus leading to initiation of apoptosis, and that symptoms of EPM1 are initiated by neuronal apoptosis.
Choroideremia—tissue specificity in the prenylation defect
Choroideremia (OMIM 303100) represents an example of X-linked Finnish diseases characterized by progressive dystrophy of the choroid, retinal pigment epithelium and retina and resulting in blindness. The gene responsible for this disorder was identified by positional cloning in 1990 ( 80 ) but only later was the gene product identified as REP-1 (Rab-escort protein) ( 81 , 82 ). Lymphoblasts of choroideremia patients are deficient in Rab geranyl-geranyltransferase, which can be reversed by the addition of REP-1 ( 83 ). Rab geranyl-geranyltransferase acts on Rab proteins, which regulate vesicle transport through both the exocytic and endocytic pathways by controlling the assembly of protein complexes involved in vesicle targeting and fusion. The trafficking function of Rab proteins depends upon their binding to cell membranes and this is in turn dependent on lipid modifications. Consequently, modification of Rab proteins by geranyl-geranyltransferase is essential for their function. Rab geranyl-geranyltransferase has a very low affinity for the Rab protein and Rab proteins are recognized only when present in a stable complex with REP ( 84 ).
Again, the specificity of the disease in degeneration of the retina and choroid is difficult to account for, since Rab proteins are present in all cells and geranylgeranylation is absolutely required for Rab function. One explanation would be that loss of REP-1 results in defective prenylation of one Rab protein, Rab27, which is expressed in the choroid and retina. As a consequence of the loss of Rab27 function, the choroid epithelial layer in the eye undergoes degeneration leading to loss of vision ( 83 , 84 ).
Familial amyloidosis of the Finnish type—abnormal proteolysis triggering amyloid formation
Familial amyloidosis of the Finnish type (FAF; OMIM 105120), one of two dominant diseases enriched in Finland ( Table 1 ), is a polyneuropathy with lattice dystrophy of the cornea and cranial and peripheral neuropathy ( 85 ). The amyloid fibrils in FAF consist of specific peptides proteolytically cleaved from gelsolin and somewhat exceptionally this disease gene was identified based on biochemical characterization of the tissue amyloid ( 86 ). All Finnish FAF patients have a single point mutation in the gelsolin gene changing Asp187 to Asn ( 20 , 90 ). This same mutation is, surprisingly, also observed in Japanese FAF patients on a totally different haplotype, suggesting a mutation hotspot in this particular gene region. World-wide the only other FAF mutation, also hitting the same nucleotide, has been identified in Danish and Czech patients (Asp187Tyr) ( 87 ). This homogeneity of FAF mutations is somewhat surprising for a dominant trait. Gelsolin is an actin-modulating protein which exists in intracellular and secreted forms, both encoded by a single gene on chromosome 9 ( 88 ). Gelsolin carrying the Finnish mutation is abnormally proteolytically processed, resulting in secretion of an N-terminally truncated amyloid precursor fragment and this triggers further proteolytic cleavage resulting in amyloid peptides in tissues ( 89 ). The capacity for this pathological cleavage by a so far unknown proteolytic activity seems to vary between different cell and tissue types, being most prominent in cells of neuronal origin. This partially explains the tissue specificity of the symptoms and the accumulation of amyloid in cornea and around peripheral nerves ( 91 ).
Congenital nephrotic syndrome of the Finnish type—a novel component of the glomerular filter exposed
Congenital nephrotic syndrome of the Finnish type (CNF; OMIM 256300) is characterized by proteinuria, which begins in utero ( 92 , 93 ). The NPHS1 gene mutated in CNF encodes a novel protein called nephrin with highly specific expression in the kidney glomerulus. Nephrin is a transmembrane protein of the immunoglobulin superfamily which is expressed in the slit diaphragm of the kidney glomerulus ( 18 ), demonstrating an essential role of nephrin in the normal glomerular filtration process ( 18 , 54 ).
Two major mutations were detected in the Finnish population, named Fin major and Fin minor . Sixty-five percent of the patients are homozygous for the Fin major mutation, 8% homozygous for Fin minor and 16% of patients compound heterozygous ( 18 ). Based on both this relatively ‘low predominance’ of the major Finnish mutation and the distribution of the ancestors of the major mutation over the geographical area of Finland, CNF must be one of the oldest representatives of the Finnish disease heritage.
Diastrophic dysplasia and congenital chloride diarrhea—defects in sulfate transporters
Diastrophic dysplasia (DTD; OMIM 222600) is a recessively inherited osteochondrodysplasia caused by mutations in the diastrophic dysplasia sulfate transporter (DTDST) gene on chromosome 5q31 ( 19 ). Ninety percent of Finnish DTD alleles carry a splice site mutation resulting in severely reduced mRNA levels (DTDST Fin ). A somewhat peculiar finding was that two additional mutations were identified on the same ancestral haplotype found in 95% of Finnish DTD alleles ( 94 ). One explanation for this peculiar feature would be that all these DTD mutations arose in the same haplotype, common in the population some 100 generations ago. This haplotype became admixed with another (larger) wave of immigrants not having this haplotype, which became rare in the following generations. Alternatively, this particular haplotype could be more prone to mutations, although we are not aware of such a phenomenon in any other disease allele.
The DTDST gene encodes a Na + -independent sulfate transporter expressed in all tissues. Recent in vitro data demonstrate that Na + -independent sulfate transport, an event more crucial for chondrocytes than any other cell type, is dominantly dependent on the DTDST system. Further, undersulfation of proteoglycans impairs the growth response of the cells to fibroblast growth factor and this also contributes to the hallmark feature of diastrophic dysplasia, severely disturbed endochondral bone formation ( 55 ).
Clinical presentation of congenital chloride diarrhea (CLD; OMIM 214700) is a lifetime, potentially fatal diarrhea with a high chloride content. The disease results from mutations in a chromosome 7 gene encoding a transmembrane protein also belonging to the sulfate transporter family. All Finnish patients have a deletion of codon 317 resulting in an in-frame deletion of Val. In vitro expression in Xenopus oocytes demonstrated that both chloride and sulfate ions are transported by this transmembrane protein and this activity is totally lost in the Val-deleted mutant ( 100 ). In situ data show preferential expression of this gene in intestinal surface epithelium, explaining the involvement of this protein in intestinal symptoms of patients ( 17 ).
Ovarian dysgenesis—new clues to infertility
Ovarian dysgenesis resulting in infertility (OMIM 233300) is caused by mutations in the follicle stimulating hormone receptor (FSH-R) ( 22 ). The causative gene was identified by the positional candidate gene strategy in Finnish families very soon after the first description of the clinical phenotype ( 95 ). FSH-R is a G-protein-coupled transmembrane receptor containing a large extracellular hormone-binding domain. FSH-R has a critical role in reproduction through the control of gonadal development and gamete production ( 96 ). In females FSH-R is expressed in the granulosa cells of the ovaries and it controls follicular growth and ovarian steroidogenesis ( 97 ). An inactivating mutation in FSH-R (Ala189Val) found in all Finnish disease alleles produces high gonadotropin levels and streaky gonads associated with primary amenorrhea in females. Males with the same mutation display various degrees of spermatogenic failure or absolute infertility ( 22 , 98 ).
FSH-R-deficient mice have been produced and female mice display a severe phenotype with thin uteri and small ovaries and are sterile because of a block in folliculogenesis ( 56 ). Mutant males display small testes, partial spermatogenic failure and reduced fertility. These mice will be of major importance in reproduction research since they provide an excellent model in which to analyse the molecular basis of infertility.
Autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED)—novel insights into autoimmunity
APECED is perhaps the Finnish disease of widest general interest since APECED patients have multiple autoimmune symptoms appearing throughout their lifespan and this recessive monogenic disease is expected to provide a short-cut to understanding the molecular events involved in autoimmunity. Hallmark symptoms of APECED are multiple autoimmune endocrinopathies, chronic mucocutaneous candidiasis and ectodermal dystrophies. The novel gene behind APECED, named AIRE, was recently identified and again one major mutation in Finnish APECED patients could be identified in 85% of Finnish disease alleles ( 13 , 14 ). APECED Fin major is a nonsense mutation resulting in a truncated polypeptide chain which is not correctly targeted intracellularly. In two other populations showing exceptionally high incidences of APECED, Iranian Jews and Sardinians, one major mutation is also found in 100 and 92% of disease alleles, respectively.
Several structural features of the APECED protein, including a nuclear targeting signal, high proline content, PHD-type zinc finger domains and a newly described DNA-binding domain called SAND ( 99 ), imply nuclear targeting and DNA binding of this protein. Also, in vitro expression studies have demonstrated a nuclear localization and a capacity for transcriptional activation of the AIRE protein ( 57 , 102 ; P. Björses et al. , unpublished data). This peculiar protein is found in mature thymocytes and peripheral lymphocytes, suggesting that the biological function could be involved in tolerance development in cellular immunity.
Common Lessons from Rare Diseases
Genetic and molecular studies of Finnish diseases have provided several lessons, important for disease locus mapping and identification generally. Special strategies and LD-based shortcuts in both the initial positioning and the fine mapping of disease loci have proved their power in Finnish diseases and often provided experimental evidence for theoretical assumptions. At the molecular level the identification of novel molecules (like nephrin or APECED protein) has provided new tools with which to tackle unknown details of tissue dysfunction or degradation. Finally, the proteins defective in Finnish diseases have often provided novel insights into the significance of a metabolic pathway or suggested interactions between molecules which were not previously known to interact. This is perhaps best exemplified by the molecular dissection of NCL disorders, which have been transformed from a spectrum of slightly varying clinical phenotypes into well-defined molecular entities resulting from dysfunction of lysosomal molecules.
Genetic and molecular studies of Finnish diseases have without doubt brought into the limelight the definitive advantages of genetic isolates as a resource for human genetic research. If wisely exploited using ethically sound research policies and well-chosen strategies based on sufficient information on the genealogical history, various population isolates will also be of the greatest importance for genetic studies of polygenic and complex diseases. The example of Finland shows how successful research of genetic diseases has been based on well-recorded population histories, the efforts of skillful clinicians and high quality health care. These advantages have produced reliable diagnoses and excellent population and health care registers, but even more importantly a high level of basic trust in genetic research among the population and consequent high participation rates in genetic studies.