Cerebral cavernous malformation (CCM) is a Mendelian model of stroke, characterized by focal abnormalities in small intracranial blood vessels leading to hemorrhage and consequent strokes and/or seizures. A significant fraction of cases is inherited as an autosomal dominant trait with incomplete penetrance. Among Hispanic Americans, virtually all CCM is attributable to a founder mutation localized to 7q (CCM1). Recent analysis of non-Hispanic Caucasian kindreds, however, has excluded linkage to 7q in some, indicating at least one additional CCM locus. We now report analysis of linkage in 20 non-Hispanic Caucasian kindreds with familial CCM. In addition to linkage to CCM1, the analysis demonstrates linkage to two new loci, CCM2 at 7p13–15 and CCM3 at 3q25.2–27. Multilocus analysis yields a maximum lod score of 14.11, with 40% of kindreds linked to CCM1, 20% linked to CCM2 and 40% linked to CCM3, with highly significant evidence for linkage to three loci (linkage to three loci supported with an odds ratio of 2.6 × 10⁵:1 over linkage to two loci and 1.6 × 10⁹:1 over linkage to one locus). Multipoint analysis among families with high posterior probabilities of linkage to each locus refines the locations of CCM2 and CCM3 to ∼22 cM intervals. Linkage to these three loci can account for inheritance of CCM in all kindreds studied. Significant locus-specific differences in penetrance are identified. These findings have implications for genetic testing of this disorder and represent an important step toward identification of the molecular basis of this disease.
Stroke is the third leading cause of death and a major cause of long-term disability in the USA, occurring in >700 000 individuals and killing 200 000 annually (1). Strokes occur as the consequence of disease of intracranial blood vessels, resulting in impaired delivery of oxygen to portions of the brain, either as a consequence of thrombosis or hemorrhage. Although a number of contributing factors have been identified, including hypertension, atherosclerosis, abnormalities in coagulation and vascular anomalies, the detailed pathogenesis of stroke is poorly understood. Genetic contributions to stroke have been recognized from studies of twins and familial aggregation, motivating genetic studies of this trait. Several Mendelian syndromes that include stroke as a prominent feature have been characterized, including cerebral autosomal dominant arteriopathy with subcortical infarcts and leukoencephalopathy (2) and mitochondrial myopathy, encephalopathy, lactic acidosis and stroke-like episodes (3).
Another Mendelian form of intracranial vascular disease is cerebral cavernous malformation (CCM; OMIM 116860), a common vascular disease of the brain with a prevalence of up to 0.5% in the general population (4,5). Only a subset of subjects with these lesions becomes symptomatic. These subjects typically present between 20 and 40 years of age with intracranial hemorrhage, focal neurological deficits, seizures or headaches. Current therapies include medical management and surgical resection of lesions causing recurrent hemorrhage or seizures (6).
While the histopathology of CCM has been well studied, little is known about the pathophysiology of this disease. Grossly, cavernous malformations are small, well-circumscribed multilobulated vascular lesions (7). Classically, they are dilated sinusoidal vascular spaces lined by a single layer of endothelium surrounded by a sub-endothelial layer of collagenous, fibronectin-rich matrix devoid of mature vessel wall elements. There is no brain parenchyma intervening between lobules of a lesion. In addition, there is accumulation of hemosiderin from prior microhemorrhages (7–10). These lesions are usually not visualized by angiography. However, they have highly characteristic features on magnetic resonance imaging (MRI) (11).
Both autosomal dominant and sporadic forms of disease are recognized (12–15). Familial disease has been particularly prominent among Hispanic Americans, found in up to 50% of cases (14). Analysis of linkage in Hispanic American kindreds has revealed genetic homogeneity, with linkage of CCM1 to an ∼4 cM interval at 7q21–22 (16–19). In this population, there is strong evidence for a founder mutation that accounts for virtually all inherited and many apparently sporadic cases (19). Studies in this population have demonstrated delayed and incomplete penetrance of the disease among known gene carriers (19). The CCM1 gene has not yet been identified.
In contrast, the genetics of CCM among non-Hispanic kindreds are less well characterized. While three such kindreds support linkage to CCM1 (20–22), two recently reported families have excluded linkage to CCM1, indicating the presence of at least one additional gene that when mutated can result in CCM (23). We have now characterized 20 non-Hispanic CCM kindreds and report herein genetic analysis of these families.
Twenty CCM kindreds with three or more affected subjects were ascertained through an affected index case (Fig. 1). The index cases are all of non-Hispanic Caucasian ancestry and are geographically dispersed. Individuals with either definitive surgical pathology, MRI or computerized axial tomography scan findings were classified as affected. Asymptomatic subjects with no history of stroke, seizure disorder or focal neurological deficit were classified as unaffected; all but four of these individuals were over age 20. Individuals with a history of seizure disorder, focal neurological deficits or recurrent headaches who have not had MRI studies or surgery were classified as phenotype unknown. All phenotypes were assigned prospectively.
Analysis of linkage to CCM1
In order to determine which families are and are not linked to CCM1, highly informative polymorphic markers spanning the CCM1 locus on 7q were genotyped in these kindreds and analysis of linkage was performed. The results provided definitive evidence of linkage to 7q in these non-Hispanic kindreds, but confirmed locus heterogeneity. Analysis allowing for locus heterogeneity yielded a maximum lod score of 4.89 at CCM1 with 40% of kindreds linked to CCM1 under an optimized model specifying 75% penetrance and no phenocopies (Table 1 and Fig. 1A). Seven kindreds had lod scores <−2.0, excluding linkage and providing strong evidence of additional CCM loci. Changing estimates of penetrance or phenocopy rate did not substantially change these results (data not shown). These findings indicate that mutation in at least one additional gene is responsible for CCM in ∼12 of these 20 families.
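The heterogeneity analysis described above can be illustrated with a minimal sketch of the standard admixture lod score, in which each family contributes log10(α·10^Z + 1 − α), where Z is the family's lod score at the locus and α is the proportion of families linked to it. The per-family lod scores below are hypothetical placeholders, not the values in Table 1, and the function names and the simple grid search over α are illustrative assumptions rather than the procedure used in this study.

```python
import math

def heterogeneity_lod(family_lods, alpha):
    """Admixture lod: sum over families of log10(alpha * 10**Z + (1 - alpha))."""
    return sum(math.log10(alpha * 10.0 ** z + (1.0 - alpha)) for z in family_lods)

def maximize_alpha(family_lods, steps=100):
    """Grid-search alpha over [0, 1]; return (best_alpha, best_lod)."""
    best_alpha, best_lod = 0.0, 0.0   # alpha = 0 always gives a lod of 0
    for i in range(steps + 1):
        alpha = i / steps
        lod = heterogeneity_lod(family_lods, alpha)
        if lod > best_lod:
            best_alpha, best_lod = alpha, lod
    return best_alpha, best_lod

# Hypothetical per-family lod scores at one locus (NOT the study's data)
example_lods = [2.1, 1.4, 0.9, -2.5, -3.0, -1.8, -2.2, 0.3, -4.0, -2.7]

alpha_hat, lod_hat = maximize_alpha(example_lods)
homogeneity_lod = sum(example_lods)   # equivalent to forcing alpha = 1

print(f"alpha = {alpha_hat:.2f}, heterogeneity lod = {lod_hat:.2f}")
print(f"homogeneity lod = {homogeneity_lod:.2f}")
```

With a mixture of linked and unlinked families, the summed (homogeneity) lod score is dominated by the negative scores of unlinked families, whereas the admixture form lets the linked families contribute positive evidence.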
Linkage to CCM2 and CCM3
In order to identify additional CCM loci, a genome-wide linkage study was performed, genotyping highly polymorphic marker loci distributed across all autosomes in seven kindreds that did not support linkage to CCM1 (K2041, K2015, K2035, K2115, K2056, K2061 and K2107). Pairwise and multipoint linkage analysis was performed to compare the segregation of CCM and marker loci. An autosomal dominant model of the trait was analyzed, specifying 90% penetrance and 0.1% phenocopies. All loci or intervals showing lod scores of ≥1.0 had additional nearby markers typed to maximize informativeness. In sum, 312 marker loci were typed. In all seven families combined, no interval ultimately yielded a lod score >1.0, indicating that CCM transmission in this group of seven families cannot be accounted for by linkage to any single locus (data not shown). This finding suggests that mutations in at least two loci account for CCM transmission among these families.
Linkage was next separately analyzed in each of the three largest kindreds (K2015, K2035 and K2041), each of which could support a lod score of ≥2.0. K2041 revealed evidence for linkage to a cluster of markers on 7p (pairwise lod scores of 2.93 and 2.04 at θ = 0 with loci D7S521 and D7S510) and yielded a multipoint lod score of 3.12 for linkage to a 30 cM interval flanked by loci D7S516 and D7S1818 (Table 1 and Fig. 1B). In contrast, families K2015 and K2035 each excluded linkage to this interval; however, they showed maximum multipoint lod scores of 2.48 and 2.00, respectively, for linkage to a segment of chromosome 3q (Table 1 and Fig. 1C; pairwise lod scores for kindred 2015 of 2.30 and 1.33 at θ = 0 with loci GGAA3H06 and GATA14G12 and for kindred 2035 of 1.70 and 1.56 at these same loci). No other intervals gave multipoint lod scores >0.7 in any of these families (data not shown).
These findings motivated genotyping in all 17 remaining families for marker loci in these two intervals. Multipoint analysis of linkage was then performed at these two new putative loci, CCM2 on 7p and CCM3 on 3q. The results of linkage to 7q, 7p and 3q in each of the 20 kindreds are shown in Table 1.
It can be seen that each family shows a positive lod score for linkage to at least one of these three loci, consistent with all families being accounted for by linkage to CCM1, CCM2 or CCM3. The significance of the linkage findings to 7p and 3q was formally assessed by multilocus analysis of linkage, which takes into account linkage results at multiple loci and in the setting of locus heterogeneity has greater power to detect linkage than single locus analysis (24; see Materials and Methods).
We first calculated the multilocus lod score for linkage to three loci, with these three loci accounting for disease in all families. The maximum multilocus lod score for the three-locus model was 14.11 with 40% of families linked to CCM1 on 7q, 20% to CCM2 on 7p and 40% to CCM3 on 3q (Table 1). The significance of linkage to these two additional loci was determined by comparison of this lod score with the lod score of the null hypothesis, linkage to only CCM1 with locus heterogeneity (lod score 4.89). The three-locus model is supported with an odds ratio of 1 600 000 000:1 over the single-locus model, providing highly significant evidence of linkage to more than one locus.
We then determined whether linkage to three loci was favored over linkage to only two loci by comparing the lod score for the three-locus model to the lod score obtained for linkage to alternative models specifying linkage to two loci, either 7q and 7p or 7q and 3q, allowing for remaining unlinked families. The three-locus model is supported with a likelihood ratio of >260 000:1 over the best two-locus model (40% of kindreds linked to 7q, 20% linked to 7p, 40% unlinked, lod score 8.69), providing highly significant evidence for linkage to three loci rather than two. These findings provide formal evidence of significant linkage to CCM2 and CCM3. We also examined four-locus models, specifying linkage to three loci with 5–35% of kindreds remaining unlinked, attributable to an as yet unidentified locus. All such models tested gave a lower multilocus lod score (lod score 13.90–12.42), providing no evidence for additional CCM loci among these 20 kindreds. The likelihood ratio was reduced by a factor of 10 under a model specifying 22% of kindreds unlinked to any of these three loci, approximating a confidence interval for the proportion of kindreds that are linked to CCM1, CCM2 and CCM3. These findings are consistent with all families being attributable to mutation at one of these loci and suggest that, at a minimum, 78% of non-Hispanic kindreds are linked to CCM1 (35%), CCM2 (20%) or CCM3 (23%).
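The odds ratios quoted for these model comparisons follow from differences between lod scores, since a lod score is the base-10 logarithm of a likelihood ratio. A minimal sketch using the lod scores reported above (the function and constant names are illustrative):

```python
def odds_from_lods(lod_a, lod_b):
    """Odds favoring model A over model B, given the lod score of each model."""
    return 10.0 ** (lod_a - lod_b)

# Lod scores reported in the text
LOD_THREE_LOCUS = 14.11      # CCM1 + CCM2 + CCM3
LOD_BEST_TWO_LOCUS = 8.69    # 7q and 7p, with 40% of kindreds unlinked
LOD_ONE_LOCUS = 4.89         # CCM1 only, with locus heterogeneity

three_vs_one = odds_from_lods(LOD_THREE_LOCUS, LOD_ONE_LOCUS)
three_vs_two = odds_from_lods(LOD_THREE_LOCUS, LOD_BEST_TWO_LOCUS)

print(f"three loci vs one locus: {three_vs_one:.1e} : 1")
print(f"three loci vs two loci:  {three_vs_two:.1e} : 1")
```

By the same relation, a drop of one lod unit corresponds to a 10-fold reduction in likelihood, which is the criterion applied above to the model with 22% of kindreds unlinked.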
In order to refine the location of disease loci, multipoint linkage at each locus was analyzed separately in families with both lod scores >1.00 and a posterior probability of ≥0.95 (Table 1) for linkage to chromosome 7q (K2142, K2043 and K2144), 7p (K2041, K2137 and K2141) or 3q (K2015 and K2035) (Fig. 2). This analysis permits identification of critical recombinants in families with high likelihood of linkage to CCM1, CCM2 or CCM3. The 7q meiotic interval defined in these non-Hispanic kindreds spans the CCM1 locus defined in Hispanic kindreds, consistent with these loci being allelic (Fig. 2A). The lod-1 and lod-3 support intervals for CCM2 span 6.6 and 22.6 cM, respectively (Fig. 2B). The corresponding support intervals for CCM3 span 19.0 and 22.0 cM (Fig. 2C). These lod-3 intervals correspond to chromosome segments 7p13–15 and 3q25.2–27, respectively. Posterior probabilities can be inflated if the proportion of families linked to a specific locus is overestimated or the proportion that is not linked to any identified locus is underestimated. Accordingly, we have also determined posterior probabilities for linkage specifying 22% of families linked to a fourth unidentified locus. All of the families included in Figure 2 continue to have posterior probabilities of 92–99.9%, further supporting the refined locations shown in Figure 2.
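Posterior probabilities of linkage of the kind used above to select families can be sketched with Bayes' rule: the prior for each locus is the estimated proportion of families linked to it, and the likelihood is 10 raised to the family's lod score at that locus. The sketch below assumes this standard form. The family lod scores are hypothetical, `unlinked` is an illustrative name for the proportion assigned to an unidentified locus, and applying the 35%, 20% and 23% proportions alongside a 22% unlinked fraction is an assumption about how such a sensitivity check might be set up.

```python
def posterior_linkage(lods, alphas, unlinked=0.0):
    """Posterior probability that a family is linked to each locus.

    lods     -- dict: locus -> family lod score at that locus
    alphas   -- dict: locus -> prior proportion of families linked to that locus
    unlinked -- prior proportion linked to none of the loci (likelihood ratio 1)
    """
    total_prior = sum(alphas.values()) + unlinked
    if abs(total_prior - 1.0) > 1e-9:
        raise ValueError("prior proportions must sum to 1")
    # Prior times likelihood ratio for each candidate locus
    weights = {locus: alphas[locus] * 10.0 ** lods[locus] for locus in alphas}
    denom = sum(weights.values()) + unlinked
    post = {locus: w / denom for locus, w in weights.items()}
    post["unlinked"] = unlinked / denom
    return post

# Hypothetical lod scores for one family (NOT taken from Table 1)
family = {"CCM1": -2.4, "CCM2": 3.1, "CCM3": -1.9}

# Proportions from the three-locus model reported in the text
three_locus = {"CCM1": 0.40, "CCM2": 0.20, "CCM3": 0.40}
p = posterior_linkage(family, three_locus)
print(p)

# Same family with 22% of kindreds assigned to an unidentified fourth locus
with_fourth = {"CCM1": 0.35, "CCM2": 0.20, "CCM3": 0.23}
q = posterior_linkage(family, with_fourth, unlinked=0.22)
print(q)
```

Allowing for an unidentified locus can only lower the posterior probability of linkage to the best-supported locus, which is why repeating the calculation under that assumption is a useful check on family selection.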
The current findings demonstrate linkage of CCM to two new loci, firmly establishing locus heterogeneity for this disease and providing a first estimate of the proportion of non-Hispanic kindreds linked to each locus. These findings also place an upper limit on the proportion of kindreds that can be attributed to additional unidentified loci in this outbred Caucasian population.
The multilocus analysis utilized, which jointly analyzes evidence for linkage across multiple loci, is particularly useful in the setting of locus heterogeneity and many small families incapable of independently demonstrating significant linkage (24). While a higher level of significance may be required for multilocus analysis, the evidence for linkage to each new locus is supported by an odds ratio of at least 260 000:1. In contrast, evidence for linkage to 7p and 3q would have been overlooked under models of linkage homogeneity analyzing all families together, with multipoint lod scores of −45.05 and −28.81 at CCM2 and CCM3, respectively. Similarly, lod scores at these loci would have been much weaker using traditional methods of analysis specifying locus heterogeneity with linkage to one locus at a time. For example, among all 20 kindreds, the maximum lod scores for linkage to 7p and 3q under single linked locus models with locus heterogeneity were 3.49 with 20% of kindreds linked and 2.79 with 50% of kindreds linked, respectively. These observations underscore the difficulties inherent in identifying linkage in the setting of locus heterogeneity and emphasize the value of concurrent evaluation of linkage data at multiple loci.
Recognition of linkage to different loci in different families provides the opportunity to assess whether there are clinical differences among kindreds linked to different loci. While there are no obvious differences in clinical features of affected subjects, there are significant locus-specific effects on penetrance of symptomatic disease. The penetrance of symptomatic disease among apparent gene carriers for kindreds linked to CCM1, CCM2 and CCM3 is 88, 100 and 63%, respectively (these calculations include subjects from all families included in the multipoint analysis shown in Fig. 2). These differences are not explained by differences in age or gender of gene carriers among families and none of the asymptomatic gene carriers in this analysis was under age 20. Expanding the kindreds included in this analysis to those with posterior probabilities of linkage at least 2-fold higher than the next most likely locus (Table 1 and Fig. 1) yielded nearly identical estimates for locus-specific penetrance. These differences in penetrance are statistically significant (χ² = 11.8, df = 2, P = 0.003) and provide statistical support for the use of different empirical estimates of disease penetrance among families showing linkage to different loci (Table 1). These differences in penetrance may have implications for the prognosis among gene carriers at different loci. These estimates of penetrance are distributed across several families at each locus and there are no significant differences in penetrance among families linked to each locus. Nonetheless, it is possible that penetrance at each locus varies with the particular mutation; it will consequently be important to confirm these differences in penetrance in kindreds with identified mutations once the underlying disease genes have been identified.
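The P value quoted for the penetrance comparison can be checked from the reported test statistic alone. For 2 degrees of freedom the upper tail of the χ² distribution has the closed form exp(−χ²/2), so no statistics library is needed. The function name is illustrative, and the underlying carrier counts are not reproduced here.

```python
import math

def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square variate with 2 degrees of freedom."""
    if x < 0:
        raise ValueError("chi-square statistic must be non-negative")
    return math.exp(-x / 2.0)

p_value = chi2_sf_df2(11.8)   # test statistic reported in the text
print(f"P = {p_value:.4f}")
```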
The localization of these loci provides a first step toward the identification of additional genes causing CCM and may assist in the identification of CCM1. The finding of apparently indistinguishable phenotypes attributable to different loci suggests that the mutations at these different loci may be acting in the same biochemical pathway. This observation suggests that success in identification of any one of these disease genes may help in identification of the others. Similarly, any genes that act in a common pathway and map to these trait loci would be excellent CCM candidates. At present, however, we have identified no compelling candidate genes in either the CCM2 or CCM3 intervals.
A number of fundamental questions regarding the pathogenesis of CCM are unanswered. First, the causes of the blood vessel abnormalities seen in CCM lesions are unknown. Second, given that all blood vessels harbor inherited CCM mutations in affected members of CCM families, it is unclear why only a small number of lesions develop. Third, the relationship between clearly inherited cases of CCM and sporadic cases in the non-Hispanic Caucasian population is unclear. It is anticipated that identification of the genes underlying this trait will provide insight into these questions and may also prove to be of broader relevance to vascular development and susceptibility to hemorrhagic stroke.
Materials and Methods
Genotyping and analysis of linkage
Highly polymorphic di-, tri- and tetranucleotide repeat marker loci were genotyped by PCR. Primers for each locus were systematically redesigned from published sequences and synthesized in the Keck Biotechnology Resource Laboratory at Yale University. One primer of each pair was 5′-end-labeled with either 6-FAM, HEX or TET phosphoramidite dyes.
PCR was performed as previously described (25) and genotypes were determined on an ABI 377 instrument. Pairwise and multipoint linkage analysis was performed using FASTLINK v.3.0P (26,27) and LINKAGE v.5.1 (28) on a Sun Sparcstation 20. In the CCM1 interval, we have typed one previously unpublished genetic marker, ATTTO85, which is a tetranucleotide repeat polymorphism that has been localized to the specified interval on the physical map of 7q (M. Günel et al., unpublished data). This marker is defined by primers with sequence 5′-TCTGTGACTAGGATCCAACTC-3′ and 5′-AACCCAGCCCTTGGAAAGTG-3′.
Lod scores were computed using models specifying locus heterogeneity with linkage to one or more loci (24), as previously described (29). In a multilocus analysis testing linkage to each of three loci, the likelihood ratio for linkage to any of these loci in a given family is represented by
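For reference, a commonly used admixture form of such a likelihood ratio, stated here as an assumption rather than as the exact expression used in this study, is the following. Here $Z_{ij}$ denotes the lod score of family $i$ at locus $j$, $\alpha_j$ the proportion of families linked to locus $j$, and the final term the proportion linked to none of the three loci; all symbols are illustrative.

```latex
\[
  \mathrm{LR}_i \;=\; \sum_{j=1}^{3} \alpha_j \, 10^{Z_{ij}}
  \;+\; \Bigl(1 - \sum_{j=1}^{3} \alpha_j\Bigr),
  \qquad
  Z_{\text{multilocus}} \;=\; \sum_{i} \log_{10} \mathrm{LR}_i ,
  \qquad
  \alpha_j \ge 0,\;\; \sum_{j=1}^{3} \alpha_j \le 1 .
\]
```

Under this form, the multilocus lod score is maximized over the proportions $\alpha_j$, and a model in which the three loci account for all families corresponds to $\sum_j \alpha_j = 1$.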
We gratefully acknowledge members of the families studied for their invaluable contributions to this project, Carol Nelson-Williams, Stephanie Zone, Anita Fahri and Janet Budzinack for management of patient databases and Traci Mansfield for helpful discussions. This work was supported by grants from the NIH. H.D.C. is an investigator of the Medical Scientist Training Program. L.P. and R.P.L. are investigators of the Howard Hughes Medical Institute.