Abstract

Copy number variants (CNVs) are heritable gains and losses of genomic DNA in normal individuals. While copy number variation is widely studied in humans, our knowledge of CNVs in other mammalian species is more limited. We have designed a custom array-based comparative genomic hybridization (aCGH) platform with 385 000 oligonucleotide probes based on the reference genome sequence of the rhesus macaque (Macaca mulatta), the most widely studied non-human primate in biomedical research. We used this platform to identify 123 CNVs among 10 unrelated macaque individuals, with 24% of the CNVs observed in multiple individuals. We found that segmental duplications were significantly enriched at macaque CNV loci. We also observed significant overlap between rhesus macaque and human CNVs, suggesting that certain genomic regions are prone to recurrent CNV formation and instability, even across a total of ∼50 million years of primate evolution (∼25 million years in each lineage). Furthermore, for eight of the CNVs that were observed in both humans and macaques, previous human studies have reported a relationship between copy number and gene expression or disease susceptibility. Therefore, the rhesus macaque offers an intriguing, non-human primate outbred model organism with which hypotheses concerning the specific functions of phenotypically relevant human CNVs can be tested.

INTRODUCTION

Copy number variants (CNVs) are intra-specific gains or losses of ≥1 kb of genomic DNA (1,2). Recent studies revealed that CNVs are unexpectedly common and frequent among normal human individuals (3–5). Furthermore, for many CNVs, copy number is correlated with transcriptional and translational levels (6–12), suggesting that CNVs may be involved in phenotypic diversity including differential disease susceptibility. For example, specific CNVs have been identified as risk factors for HIV infection, autoimmune disorders and Crohn’s disease, among others (7,9,13–18). However, while CNVs in the human genome have received considerable attention (3–5,10,12,19–27), our understanding of intra-specific CNVs in other mammalian species is substantially more limited, with genome-wide data currently available only for chimpanzees (Pan troglodytes) and mice (Mus musculus) (28–33).

Rhesus macaques (Macaca mulatta) are the most widely studied non-human primate for biomedical research (34), especially in the areas of physiology, behavioral biology, cardiological disorders and infectious disease (35–38). For example, rhesus macaques have been indispensable as a model for HIV and AIDS due to their susceptibility to simian immunodeficiency virus (SIV), which is closely related to HIV, and to the SIV-HIV-1 chimera (39), and the first successful primate somatic cell nuclear transfer was recently performed in rhesus macaques (40). Therefore, identification and characterization of CNVs in the rhesus macaque genome may help us to better understand the relationship between copy number variation and disease. Furthermore, by comparing patterns of intra-specific copy number variation in rhesus macaques to CNVs observed in humans, we may develop a better understanding of the evolutionary forces that shape our genomes. Here, we have performed a study of CNVs in the macaque genome by using a macaque-specific array-based comparative genomic hybridization (aCGH) platform to describe patterns of copy number variation among the genomes of 10 macaque individuals.

RESULTS

CNVs are widespread throughout the rhesus macaque genome

The goal of our study was to characterize levels and patterns of intra-specific copy number variation in rhesus macaques, rather than to identify inter-specific copy number differences among primates (41–47). Therefore, we developed a rhesus macaque-specific aCGH platform with ∼385 000 isothermic oligonucleotide probes roughly evenly spaced throughout the macaque genome (mean distance between probes ∼6.5 kb). We used this platform to identify CNVs among the genomes of 10 unrelated rhesus macaques (9 individuals each compared with a single rhesus macaque reference individual). Among the 9 test macaques, we identified 132 genomic gains and 82 genomic losses relative to the reference individual, corresponding to 123 distinct CNVs (Fig. 1; Supplementary Material, Table S1). Thirty of these CNVs (24%) were observed in multiple individuals.

Figure 1.

Genomic locations of macaque CNVs. Locations of 123 autosomal CNVs identified by aCGH experiments among 9 macaque individuals relative to a single reference individual. Heights of the green and red bars reflect the frequency of gains and losses, respectively.

In order to validate the aCGH results, we first performed real-time quantitative PCR (qPCR) and confirmed 26 out of 27 CNV calls at 7 loci, observing significant correlations between qPCR estimated copy numbers and aCGH log2 ratios at multiple loci (Fig. 2; Supplementary Material, Table S2). Next, we qPCR-genotyped a CNV overlapping the SIGLEC gene family locus in 13 macaque parent–offspring trios (n = 39 individuals). We obtained genotypes that were 100% concordant with Mendelian patterns of inheritance (Fig. 3) and consistent with Hardy–Weinberg equilibrium (considering parents only; χ2 test; P = 0.671).
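
The two checks described above (Mendelian concordance in trios and Hardy–Weinberg equilibrium among parents) can be sketched as follows. This is a minimal illustration, not the analysis code used in the study. It assumes a biallelic CNV in which each chromosome carries either 0 or 1 copy, which is one reading of the three discrete copy number clusters; the function names and all example counts are hypothetical.

```python
import math
from itertools import product

# Assumption: a biallelic CNV where each chromosome carries 0 or 1 copy,
# so a diploid copy number of 0, 1 or 2 maps to a unique allele pair.
ALLELES = {0: (0, 0), 1: (0, 1), 2: (1, 1)}

def mendelian_consistent(father_cn, mother_cn, child_cn):
    """True if the child's copy number can arise from one allele per parent."""
    possible = {a + b for a, b in product(ALLELES[father_cn], ALLELES[mother_cn])}
    return child_cn in possible

def hwe_chi_square(n0, n1, n2):
    """Chi-square statistic and P value (1 df) for Hardy-Weinberg proportions,
    given counts of individuals with 0, 1 and 2 copies."""
    n = n0 + n1 + n2
    p = (n1 + 2 * n2) / (2 * n)      # frequency of the 1-copy allele
    q = 1.0 - p                      # frequency of the deletion allele
    expected = (n * q * q, 2 * n * p * q, n * p * p)
    chi2 = sum((o - e) ** 2 / e
               for o, e in zip((n0, n1, n2), expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical inputs for illustration only (not the study's data).
print(mendelian_consistent(1, 1, 2))   # two 1-copy parents, 2-copy child
print(mendelian_consistent(0, 2, 0))   # 0-copy and 2-copy parents, 0-copy child
print(hwe_chi_square(9, 12, 5))        # made-up parental genotype counts
```

A multi-allelic locus (more than one copy per chromosome) would need a larger allele table, so this sketch applies only under the stated biallelic assumption.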

Figure 2.

qPCR validation of aCGH data. (A–C) aCGH log2 intensity ratio profiles for a 129 kb CNV containing the macaque ortholog for the human OR2L (olfactory receptor) gene subfamily locus for macaque individuals (A) mmu 228_1994, (B) mmu 353_1991 and (C) mmu 211_1999. Each oligonucleotide probe is represented by a single point. Probes within the copy number variable region are highlighted in blue (no change), red (loss) and green (gain). (D) Plot depicting the relationship between qPCR-estimated copy number and aCGH-based mean log2 ratio for the OR2L CNV for all 9 macaque individuals (least squares regression; R2 = 0.927; P < 0.001). All qPCR and aCGH estimates are relative to the reference individual mmu 313_2000. Values for the individuals depicted in (A), (B) and (C) correspond to the blue, red and green points, respectively. Error bars represent ±1 SD.

Figure 3.

Mendelian inheritance of rhesus macaque CNVs. (A) Frequency distribution of qPCR-estimated copy numbers for 13 macaque parent–offspring trios (n = 39) for a CNV that spans a region orthologous to the human SIGLEC gene family locus. Copy number estimates are relative to the reference macaque individual mmu 313_2000. The copy number estimates form three discrete clusters, including 19 individuals for which there was no qPCR amplification at the SIGLEC CNV (homozygous deletion). The reference individual was inferred to have one copy per diploid genome. (B) An example of the predicted inheritance pattern of the SIGLEC locus CNV for a macaque parent–offspring trio, based on qPCR copy number estimates. All 13 parent–offspring trios showed patterns consistent with Mendelian inheritance.

Although CNVs were found on every chromosome excluding the Y chromosome [which was not sequenced for the current draft of the macaque genome (34)], the number of CNVs per chromosome was uncorrelated with chromosome length (least squares regression; R2 = 0.08; P = 0.220). This finding is consistent with previous studies that reported a non-random distribution of CNVs in humans (5,19). Furthermore, a large proportion of our CNVs are relatively small (Fig. 4), near our effective resolution of ∼40 kb (we used a threshold of six consecutive probes for calling CNVs). In fact, while CNV sizes ranged from ∼32 to ∼710 kb, a majority (76%) of the CNVs we identified were <100 kb in size. For this reason, we infer that our finding of 123 CNVs underestimates the true level of structural variation in the macaque genome, and that a considerable proportion of macaque CNVs could be <40 kb in size. We did not observe any CNVs >1 Mb in size, which may in part be a consequence of the number and size of sequence gaps in the current draft of the macaque genome sequence (>172 000 gaps, currently estimated at ∼8% of the genome; 34,48). In addition, previous studies in humans have observed that there are relatively few very large CNVs, and that they tend to be low in frequency (5), which may explain the absence of these CNVs from our dataset, if the distribution of macaque CNVs is similar.
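
The relationship between probe spacing, the six-probe calling threshold and the quoted resolution can be worked through with simple arithmetic. The sketch below is one plausible reading of the figures in the text, not a statement of how the authors derived them: the mean probe spacing (∼6.5 kb) and the threshold (six probes) come from the text, while the variable names are illustrative.

```python
MEAN_PROBE_SPACING_KB = 6.5   # mean distance between probes, from the text
MIN_PROBES = 6                # consecutive probes required for a CNV call

# Six consecutive probes bound five inter-probe intervals, which gives the
# shortest distance between the first and last probe of a call.
min_span_kb = (MIN_PROBES - 1) * MEAN_PROBE_SPACING_KB

# Counting one full probe spacing per probe gives the nominal footprint.
nominal_footprint_kb = MIN_PROBES * MEAN_PROBE_SPACING_KB

print(min_span_kb, nominal_footprint_kb)   # 32.5 39.0
```

Under this reading, the smallest reported CNV (∼32 kb) corresponds to the first-to-last probe span, and the ∼40 kb effective resolution corresponds to the nominal six-probe footprint.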

Figure 4.

Size distribution of identified macaque CNVs. CNV size mean = 102 kb; median = 58 kb; range = 32–710 kb.

Macaque CNVs overlap segmental duplications and human CNVs

CNVs in the human, chimpanzee and mouse genomes are significantly enriched for segmental duplications (3,5,19,30,33). Segmental duplications are low-copy repeated sequences of ≥1 kb with ≥90% sequence identity (49) that may be involved in CNV genesis through non-allelic homologous recombination (NAHR) mechanisms (reviewed in 50,51). Although the segmental duplication content of the macaque genome is less than half that of the human genome (2.3 versus 5.2%; 34,52), we found that 37 of the 123 (30%) macaque CNVs overlapped segmental duplications in the macaque genome, representing a significant 1.9-fold enrichment compared with the level of overlap expected by chance alone (Fig. 5A; permutation test; P < 0.0001). Furthermore, of the 30 CNVs observed in multiple macaque individuals, 22 (73%) were found to overlap segmental duplications (Fig. 5B; 2.7-fold enrichment; P < 0.0001). These findings suggest that segmental duplication-mediated NAHR plays an important role in the formation of a subset of CNVs across diverse mammalian species, despite substantial differences in genomic segmental duplication content. However, we note the possibility that segmental duplications may be underreported in the current assembly of the rhesus macaque genome sequence (34). If the known segmental duplication content of the macaque genome does increase, then the proportion of CNVs observed to overlap segmental duplications may also rise. Therefore, these results should be re-examined when subsequent drafts of the macaque genome sequence become available.

Figure 5.

Comparisons of macaque CNV locations with segmental duplications (SDs) and human CNVs. (A and B) The observed number of macaque CNVs that overlapped macaque segmental duplications was compared to a distribution constructed by iterative random placement of CNVs (preserving their respective lengths) and assessment of segmental duplication overlap for (A) all macaque CNVs (P < 0.0001), and (B) macaque CNVs observed in multiple individuals (P < 0.0001). (C and D) A similar analysis was performed assessing overlap between (C) previously observed human CNVs and macaque CNVs (P < 0.01), and (D) previously observed human CNVs and macaque CNVs found in multiple individuals for both species (P < 0.001). All analyses were performed with 10 000 iterations. Observed values are indicated by dashed lines.

We evaluated the overlap between our macaque CNVs and human CNVs that were reported recently for 270 individuals representing four populations from the HapMap collection (5). Two genome-wide platforms were used in the human study (5); because the genome coverage and resolution of our macaque-specific array were similar to those of the large-insert clone whole-genome tile path (WGTP) human platform, we considered only the 867 autosomal CNV calls made by this platform. We unambiguously determined human orthologous positions for 101 of the 123 macaque CNVs. The remaining 22 CNVs were (i) mapped to multiple human chromosomes, (ii) lacked orthologous sequences in the human genome or (iii) had differences in size between the two assemblies that were too great for us to be confident of orthology across the entire region (see Materials and Methods).

We found that ∼25% (25 of 101) of macaque CNVs mapped to regions of the human genome previously found to contain CNVs among the 270 HapMap individuals (5), significantly more than expected by chance alone (Fig. 5C; 1.8-fold enrichment; P < 0.01). We also observed that CNVs identified in multiple macaques tended to overlap with CNVs identified in multiple humans (Fig. 5D; 3.7-fold enrichment; P < 0.001), and we found even greater enrichment when considering human CNVs of ≥5% frequency (5.6-fold enrichment; P < 0.0001).

DISCUSSION

Mechanisms for CNV formation across mammalian lineages

We have shown that macaque CNVs are enriched for segmental duplications and often occur in regions orthologous to human CNVs. We previously reasoned that CNVs found in the same regions of both humans and chimpanzees likely reflect recurrent CNV formation, rather than ancestral CNVs maintained throughout the evolutionary histories of both species (30). Given that the human and macaque lineages diverged ≥25 million years ago (53,54), it is even more likely that CNVs in orthologous regions in both macaques and humans were formed by independent events. However, while segmental duplications present in the human–chimpanzee common ancestor may have maintained ∼99% sequence similarity among paralogous copies within a given species (43) and therefore may facilitate NAHR in both species (30), genome-wide human–macaque nucleotide sequence divergence is ∼7% (34). While rare NAHR events can be mediated by perfectly matching segments smaller than 50 bp in size (55), more common NAHR events may require perfectly matching segments of >300 bp in length (56), and NAHR most frequently occurs between duplications >10 kb with >95% sequence similarity to each other (51). This level of required sequence similarity is greater than the similarity that may be expected for paralogous copies of segmental duplications that were present in the genome of the human–macaque common ancestor (i.e. ∼93%). Therefore, one might expect reduced levels of NAHR for older segmental duplications, which could implicate mechanisms other than NAHR in frequent and recurrent CNV formation in the same genomic regions of both macaques and humans.

To examine this issue, we assessed the segmental duplication content of the CNVs observed in orthologous regions of macaques and humans, and found that 52% of these CNVs (13 of 25) in fact contain segmental duplications in both genomes. One explanation for this observation could be that new segmental duplications have formed in similar regions of both genomes following their divergence (i.e. segmental duplications in these regions were not actually present in the macaque–human ancestor). However, we consider several other (and non-mutually exclusive) explanations to be more likely. First, due to stochastic mutation patterns, stretches of >95% sequence similarity that are sufficiently large for NAHR (including perfectly matching segments hundreds of bp in length) may have been maintained between segmental duplication paralogs, even over ≥25 million years. Second, gene conversion may have homogenized ancestral segmental duplications (57,58), potentially preserving higher than expected within-species sequence identity in both lineages. For example, this process has been linked to the long-term maintenance of a structural rearrangement hot spot in eutherian mammals, mediated by segmental duplications that arose >100 million years ago (59). To explore this possibility further, we estimated the sequence divergence and phylogenies of segmental duplication paralogs and orthologs within or flanking a subset of the CNV regions in macaques and humans. In each case, we observed that nucleotide sequence divergence among duplicons was >7% across species but ≤5% within species (Supplementary Material, Fig. S1), consistent with a model of recurrent gene conversion in both lineages. In addition, these regions may have been unstable throughout the evolutionary histories of both species, with recurrent copy number gains providing substrates for further CNV genesis (60). Together, these forces may effectively maintain CNV hot spots across a surprising diversity of mammalian lineages.

Evaluating the functional relevance of macaque CNVs

Initial reports of associations between copy number and gene expression levels (6,10,12) and copy number and susceptibility to complex diseases (e.g. refs. 7,14,17) in humans have prompted speculation that copy number variation may play a considerable role in phenotypic diversity (1,61). However, our knowledge of the functional consequences of even the few thoroughly studied human CNVs remains limited. Among the macaque CNVs identified in our study, eight mapped to genomic regions containing human CNVs that have been implicated in disease susceptibility or correlated with gene expression levels (Table 1), representing an intriguing opportunity for model organism studies of functionally relevant human CNVs.

Table 1.

CNVs observed in both macaques and humans, with functional significance previously shown in humans

Macaque CNV locus (a) | Macaque CNV obs. (b) / Human CNV obs. (c) | Genes of interest (d) | Phenotypic effect of CNV (e) | References
Chr. 3: 180.2–180.3 | 28 | PRSS1 | Hereditary pancreatitis susceptibility | (15)
Chr. 4: 29.5–29.6 | 21 | LOC347981 | Correlated with gene expression level | (12)
Chr. 4: 31.0–31.1 | 20 | LOC282956 | Correlated with gene expression level | (12)
Chr. 4: 32.1–32.6 | 231 | HLA-DRB5, HLA-DQA1, HLA-DQA2 | Correlated with gene expression level | (12)
Chr. 8: 8.0–8.7 (f) | 124 | DEFB4 | Correlated with gene expression level; Crohn’s disease susceptibility; psoriasis susceptibility | (6,9,81)
Chr. 10: 86.5–86.5 | 25 | CGI-96 | Correlated with gene expression level | (12)
Chr. 16: 55.9–55.9 | 207 | HDAC5, MGC3130 | Correlated with gene expression level | (12)
Chr. 19: 47.2–47.4 | | CYP2A6 | Correlated with gene expression level; correlated with protein level; lung cancer susceptibility | (13,70,71,82)

(a) Chromosome (chr.) coordinates (in Mb) based on the January 2006 rheMac2 genome assembly.

(b) Number of observations of the CNV in macaques (of nine individuals relative to a single reference individual).

(c) Number of observations of the CNV in the orthologous region in humans (of 269 HapMap individuals relative to a single reference individual), based on data from the WGTP platform (5).

(d) Genes affected by CNVs in humans. The LOC347981, LOC282956, HDAC5 and MGC3130 genes are not located within the CNVs influencing their expression levels; these may be regulatory CNVs (12).

(e) We have only included examples for which CNVs themselves have been linked to phenotypic variation in humans, based on the limited number of studies conducted to date. Other CNVs observed in both macaques and humans (Supplementary Material, Table S1) should also be considered to be of potential functional significance, for example the CNV encompassing the UGT2B7 gene. Single-nucleotide polymorphisms in human UGT2B7 have been implicated in differential drug and hormone responses (83–85), and deletion variants in another member of the UGT2 family have been associated with prostate cancer in humans (16).

(f) The endpoint coordinates for this locus are >5 times larger in the human reference genome sequence compared to the macaque reference genome (see Materials and Methods) due to large segmental duplications and a gap in the human sequence; however, we established that the nucleotide sequence of the macaque CNV region aligns to the human β-defensin locus, including the DEFB4 gene.

For example, we observed multiple gains and losses at the macaque β-defensin gene cluster locus, including the DEFB4 gene. Copy number variation of this gene in humans has been associated with susceptibility to Crohn’s disease, a chronic inflammation of the gastrointestinal tract (9). Given the possible genetic and microbial influences on Crohn’s disease (62,63) and the bactericidal roles of the β-defensins (64), the rhesus macaque may be an appropriate model organism for investigating the potential correlation between β-defensin gene dosage and microbiotic composition. Ultimately, the efficacy of targeted antibacterial drugs could be tested on macaques with different backgrounds of β-defensin copy number in pharmacogenomics-based studies.

In addition, we detected multiple copy number losses overlapping the cytochrome P450, family 2, subfamily A locus, which is also copy number variable in humans (5,10,65). One member of this gene family is CYP2A6, which is responsible for the metabolism of nicotine and the activation of pro-carcinogens found in tobacco smoke (66,67). Previous studies in humans have shown that duplications and deletions at this locus are generated by NAHR (68,69), and that the deletion allele may confer protection against lung cancer (13,70,71). In macaques, the effects of copy number changes on in vivo enzymatic activities of CYP2A subfamily genes could be assessed to understand better the relationship between nicotine metabolism and cancer and addiction behavior. For example, levels of metabolized nicotine could be measured across individuals with different copy numbers. A similar approach could be used to investigate CYP-mediated activation of pro-carcinogens and subsequent oncogenesis. Finally, rhesus macaque studies could be informative with regard to any CNV-related effects on CYP-inhibiting drug response (reviewed in 72,73).

The examples discussed above and presented in Table 1 are based on the limited number of studies that have investigated the functional effects of human CNVs. More human CNVs that are orthologous to the observed macaque CNVs may later be found to be associated with specific human phenotypes, for example, through future genome-wide CNV disease association studies (74). Furthermore, additional macaque CNVs will undoubtedly be identified with the application of higher resolution platforms and larger population samples; a proportion of these CNVs may also be orthologous to functionally relevant human CNVs.

Conclusion

We have identified 123 CNVs among the genomes of 10 rhesus macaque individuals. To our knowledge, this is the first successful attempt to map copy number variation in rhesus macaques at a genome-wide level. Our analyses have shown that a surprising number of CNVs are mapped to genomic regions that are also copy number variable in humans. These findings have two immediate implications for the medical and evolutionary genetics communities: (i) future rhesus macaque-based biomedical research studies may benefit from fully accounting for copy number variation and any subsequent gene dosage effects, and (ii) the macaque is an ideal model organism for testing hypotheses concerning the functional significance of CNVs, including for many CNV loci that are directly relevant to human disease.

MATERIALS AND METHODS

Sample preparation

Whole venous blood was collected from 4 male and 6 female unrelated (with known ancestry for a minimum of three generations), outbred rhesus macaques and 13 parent–offspring trios during routine health evaluations at the New England Primate Research Center. All individuals were descendants of Indian-origin rhesus macaques. DNA was isolated from peripheral leukocytes with the Puregene DNA isolation kit (Gentra, Minneapolis, MN) and purified using the Wizard DNA Cleanup System (Promega, Madison, WI).

Custom array design

Human–macaque nucleotide sequence divergence is ∼7% (34), which could result in hybridization specificity and efficiency issues if macaque samples were used with human-specific arrays (75). Furthermore, with a human-specific array we would not be able to examine regions present in the rhesus macaque but not the human genome (i.e. human lineage-specific deletions). Therefore, we designed a high-density (385 000 probes) oligonucleotide aCGH platform, manufactured by NimbleGen Systems Inc. (design 2006-11-30_rheMac2_WG_CGH), based on the rhesus macaque genome assembly (Mmu1_051212; rheMac2; January 2006). We favored a genome-wide over a gene-focused approach because previous studies have shown that a subset of intergenic CNVs may be functionally relevant, for example, by influencing expression levels of nearby genes (12). In addition, full genomic coverage allows more precise inferences regarding the sizes of identified CNVs. Therefore, probes were distributed as evenly as possible throughout the genome, with mean probe spacing of 6.5 kb. We excluded sequences with >5 identical matches elsewhere in the genome from the array (i.e. to avoid placing probes in recently transposed copies of highly repetitive sequences such as Alu elements), but included segmental duplications. Probe length varied from 50 to 75 nucleotides, with target melting temperatures of ∼76°C. The arrays were printed at the NimbleGen facility in Reykjavik, Iceland.

aCGH experiments

DNA samples for test individuals were labeled with Cy5 and co-hybridized onto our platform with Cy3-labeled DNA from a single female reference individual (mmu 313_2000). We also performed two self-self hybridization experiments (i.e. by labeling DNA from a single individual separately with Cy3 and Cy5 and co-hybridizing both labeled samples to an array): one using DNA from the female reference individual, and another with DNA from one of the male test individuals (mmu 211_1999). Each self-self experiment was performed with two aliquots from a single DNA extraction. All experiments were labeled and hybridized using standard NimbleGen aCGH protocols in Reykjavik, Iceland. Briefly, 1 µg of genomic DNA was labeled with random 9-mer ‘wobble’-Cy3 or Cy5 (TriLink BioTechnologies, San Diego, CA) using Exo-Klenow DNA polymerase (New England Biolabs, Ipswich, MA), at 37°C for 2 h. Hybridization was performed using the MAUI Hybridization System (BioMicro Systems, Salt Lake City, UT) at 42°C for 16 h. Arrays were scanned with a GenePix 4000B 5 µm microarray scanner (Axon Instruments, Foster City, CA). Signal intensities were extracted using NimbleScan software (NimbleGen Systems, Inc.), and relative intensity log2 ratios were normalized using the Qspline method (76). All normalized Cy3/Cy5 intensity data from the aCGH experiments have been deposited in the Gene Expression Omnibus database (accession number GSE9220). All sample-level CNV calls with corresponding log2 values are provided in Supplementary Material, Table S3.

CNV analyses

CNVs were identified based on the relative intensity log2 ratio profiles of each experiment, using the BreakPtr algorithm (77). Following the protocol of Korbel et al. (77), we generated CNV calls with a hidden Markov model by using positive and negative training parameters based on log2 ratio data. Positive controls (i.e. ‘gold standards’) consisted of 16 putative copy number gains and losses that were validated by qPCR (Supplementary Material, Table S4), whereas negative controls consisted of log2 data from one self-self hybridization experiment (mmu 211_1999). We then set the minimum number of probes required for a CNV call by comparing the number of calls for the nine test experiments with the number of (false-positive) calls from the second self-self experiment (mmu 313_2000) while varying the minimum number of probes required for calling a CNV (Supplementary Material, Fig. S2). We found that a 3-probe threshold was sufficient to eliminate CNV calls on the self-self experiment. However, as the single self-self hybridization experiment cannot fully reflect the variance of our nine test experiments and because we preferred a low false-positive rate even at the expense of having more false negatives in our dataset, we established a more conservative threshold of 6 probes for CNV calls (resulting in an effective resolution of ∼40 kb). We noted that our platform and CNV calling parameters detected gains and losses at the major histocompatibility complex locus on macaque chromosome 4, consistent with previous studies (78,79).
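
The threshold-selection step described above can be illustrated with a short sketch. It does not reproduce the BreakPtr hidden Markov model; it only shows how, given the number of probes supporting each putative call, one could find the smallest minimum-probe threshold that removes all calls from a self-self hybridization. The function names and the probe counts are hypothetical.

```python
def count_calls(probe_counts, min_probes):
    """Number of putative calls supported by at least min_probes probes."""
    return sum(1 for n in probe_counts if n >= min_probes)

def smallest_clean_threshold(self_self_probe_counts, max_threshold=10):
    """Smallest threshold at which a self-self experiment yields zero calls.

    Returns None if no threshold up to max_threshold removes every call.
    """
    for threshold in range(1, max_threshold + 1):
        if count_calls(self_self_probe_counts, threshold) == 0:
            return threshold
    return None

# Hypothetical probe counts for false-positive calls in a self-self experiment.
self_self = [1, 2, 2, 1]
clean = smallest_clean_threshold(self_self)
print(clean)   # 3 for this made-up input
```

In the study, a more conservative threshold than the smallest clean value was then adopted (six probes rather than three), trading additional false negatives for a lower false-positive rate.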

One CNV was identified on the X chromosome in one female individual (Supplementary Material, Table S1), which was excluded from subsequent analysis because we could not reliably call X chromosome CNVs for male test individuals (versus the female reference). All CNV calls with a probe density of <1 probe per 50 kb were examined and reanalyzed as separate regions if they contained intervening assembly gaps. This affected five CNVs in the final dataset. CNV calls from different individuals that overlapped one another were merged and treated as a single CNV in our analyses.
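
Merging overlapping calls from different individuals into a single CNV region is a standard interval-merging operation. The following is a minimal sketch under stated assumptions: coordinates are treated as closed intervals, so two calls overlap when they share at least one base, and the tuple layout and function name are illustrative rather than taken from the study.

```python
def merge_calls(calls):
    """Merge overlapping calls into CNV regions.

    calls: iterable of (chrom, start, end, individual) with closed coordinates.
    Returns a list of (chrom, start, end, sorted list of individuals).
    """
    merged = []
    for chrom, start, end, individual in sorted(calls, key=lambda c: (c[0], c[1], c[2])):
        last = merged[-1] if merged else None
        if last and last[0] == chrom and start <= last[2]:
            # Shares at least 1 bp with the current region: extend it.
            last[2] = max(last[2], end)
            last[3].add(individual)
        else:
            merged.append([chrom, start, end, {individual}])
    return [(c, s, e, sorted(inds)) for c, s, e, inds in merged]

# Hypothetical calls for illustration only.
example_calls = [
    ("chr1", 100, 200, "A"),
    ("chr1", 150, 300, "B"),
    ("chr1", 400, 500, "A"),
    ("chr2", 100, 200, "C"),
]
regions = merge_calls(example_calls)
recurrent = [r for r in regions if len(r[3]) > 1]   # seen in multiple individuals
print(regions)
print(len(recurrent))
```

Applied to the sample-level gains and losses, a procedure of this kind would yield both the set of distinct CNVs and the subset observed in more than one individual.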

Enrichment analyses were performed using a permutation test of 10 000 randomized trials. For evaluating segmental duplication enrichment, macaque genome autosomal segmental duplication coordinates (rheMac2) were downloaded from the UCSC Genome Browser (34,48). The locations of the macaque CNVs were randomized based on the midpoint coordinates of all autosomal oligonucleotide probes on our platform (the sizes of the randomized CNVs were identical to the actual observed CNV sizes). For each trial, the number of randomized CNVs that overlapped (defined as ≥1 bp in common) with macaque segmental duplications was determined. To test for enrichment, we compared the observed number of CNVs that overlapped with segmental duplications based on the actual locations of the macaque CNVs, to the distribution from the permutation analysis.
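
The permutation procedure described above can be sketched as follows. This is an illustrative implementation under several assumptions that the text does not specify: each randomized CNV starts at a randomly chosen probe midpoint, coordinates are closed intervals, and the P value is the fraction of trials whose overlap count is at least the observed count. All names and the toy data are hypothetical.

```python
import random

def overlaps(chrom, start, end, features):
    """True if [start, end] shares >= 1 bp with any feature on chrom."""
    return any(s <= end and start <= e for s, e in features.get(chrom, ()))

def permutation_test(cnvs, features, probe_midpoints, n_trials=10_000, seed=0):
    """Compare the observed overlap count with a randomized null distribution.

    cnvs: list of (chrom, start, end); features: {chrom: [(start, end), ...]};
    probe_midpoints: list of (chrom, position) used as random placement sites.
    """
    rng = random.Random(seed)
    observed = sum(overlaps(c, s, e, features) for c, s, e in cnvs)
    sizes = [e - s for _, s, e in cnvs]          # randomized CNVs keep their sizes
    null_counts = []
    for _ in range(n_trials):
        count = 0
        for size in sizes:
            chrom, pos = rng.choice(probe_midpoints)
            count += overlaps(chrom, pos, pos + size, features)
        null_counts.append(count)
    expected = sum(null_counts) / n_trials
    fold = observed / expected if expected else float("inf")
    p_value = sum(c >= observed for c in null_counts) / n_trials
    return observed, expected, fold, p_value

# Toy data for illustration only.
toy_features = {"chr1": [(1000, 2000)]}
toy_cnvs = [("chr1", 1500, 1600)]
toy_probes = [("chr1", p) for p in range(0, 100_000, 1000)]
observed, expected, fold, p_value = permutation_test(
    toy_cnvs, toy_features, toy_probes, n_trials=2000)
print(observed, expected, fold, p_value)
```

With 10 000 trials, a reported P < 0.0001 corresponds to no randomized trial reaching the observed overlap count, and the fold enrichment is the observed count divided by the mean of the null distribution.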

To determine whether macaque and human CNVs occur in orthologous regions more often than expected by chance, we first converted the positions of the macaque CNVs to orthologous human coordinates (hg17) using the Batch Coordinate Conversion (liftOver) tool from the UCSC Genome Bioinformatics resource (80). This was accomplished by obtaining liftOver coordinates for 1 kb of sequence flanking each putative breakpoint of all CNV calls (2 kb for each CNV end) and combining these results. Thirty-one macaque CNV breakpoint sequences that were duplicated in the human genome were converted manually using the Convert tool of the UCSC Genome Browser (80). Using these two approaches, human orthologous positions were obtained for 101 of the 123 macaque CNVs (Supplementary Material, Table S1). The remaining 22 macaque CNVs had putative breakpoints that mapped to multiple human chromosomes (n = 7), mapped to the human X chromosome (which was not included in our analysis; n = 1), lacked orthologous sequences in the human genome (n = 11), or were excluded because the size of the converted human region differed by >5-fold from the macaque CNV size, so that we could not be confident of orthology across the entire region (n = 3). We then performed permutation analyses with 10 000 trials, as described above for segmental duplications, with two differences. First, we used a database of human CNVs from the WGTP aCGH platform, previously used to identify CNVs in 270 HapMap individuals (5). Second, because it was not practical to convert the coordinates of all the human CNVs to the macaque genome, we randomized the locations of the human CNVs based on the midpoint positions of all autosomal clones on the WGTP array. For each trial, we then determined the number of macaque CNVs that overlapped the human CNVs and compared this distribution to the observed number of macaque CNVs that overlapped the human CNVs based on their actual locations.
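The way converted breakpoint flanks are combined and filtered can be sketched as follows. The sketch assumes that each flank has already been converted to at most one human interval by an external tool. It does not call liftOver, and it simplifies the manual handling of duplicated sequences. Treating a missing flank as absence of an ortholog is an assumption, and the function name and category labels are hypothetical.

```python
from typing import Optional, Tuple

Interval = Tuple[str, int, int]  # (human chromosome, start, end)


def classify_conversion(macaque_size: int,
                        left_flank: Optional[Interval],
                        right_flank: Optional[Interval],
                        max_fold: float = 5.0):
    """Combine two converted flanks into one human region, or give a reason for exclusion."""
    if left_flank is None or right_flank is None:
        return "no_ortholog", None
    if left_flank[0] != right_flank[0]:
        return "multiple_chromosomes", None
    chrom = left_flank[0]
    if chrom == "chrX":
        return "excluded_chrX", None
    # The combined region spans from the outermost start to the outermost end.
    start = min(left_flank[1], right_flank[1])
    end = max(left_flank[2], right_flank[2])
    human_size = end - start
    fold = max(human_size, macaque_size) / min(human_size, macaque_size)
    if fold > max_fold:
        return "size_discordant", None
    return "converted", (chrom, start, end)


# Hypothetical coordinates (illustrative only).
status, region = classify_conversion(
    100_000, ("chr4", 1_000, 3_000), ("chr4", 101_000, 103_000)
)
```

The fold-difference filter guards against cases in which the two flanks map to the same human chromosome but the intervening sequence is unlikely to be orthologous across its full length.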

Quantitative PCR

Specific primers were designed from the rheMac2 reference genome sequence, for both the putative copy number variable regions of interest and an endogenous control locus (PAX9). A 5-point standard curve (0.5–8 ng of DNA) was generated in duplicate for a single reference individual, while test individuals were assayed in triplicate using 2 ng of DNA per reaction. iQ SYBR Green Supermix and the iQ5 detection system (Bio-Rad Laboratories, Hercules, CA) were used for nucleic acid detection. Reactions were performed at 95°C for 3 min followed by 40 cycles of 95°C for 10 s and 60°C for 30 s. Mean DNA starting quantities and standard deviations were estimated based on threshold cycle differences between the control and test loci. Primer sequences are available in Supplementary Material, Table S4.
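Estimating a starting quantity from a standard curve can be sketched as below. This illustrates the general standard-curve approach rather than the iQ5 software's own calculation. The dilution series, threshold cycle values and function names are hypothetical, and an idealized curve with one cycle per twofold dilution is used for clarity.

```python
import math
from statistics import mean, stdev
from typing import List, Tuple


def fit_standard_curve(quantities_ng: List[float], cts: List[float]) -> Tuple[float, float]:
    """Least-squares fit of threshold cycle (Ct) against log2(input DNA in ng)."""
    xs = [math.log2(q) for q in quantities_ng]
    x_bar, y_bar = mean(xs), mean(cts)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, cts))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return slope, intercept


def starting_quantity(ct: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to estimate the starting quantity (ng) for a Ct value."""
    return 2 ** ((ct - intercept) / slope)


# Hypothetical 5-point curve spanning 0.5-8 ng (illustrative values only).
standards_ng = [0.5, 1.0, 2.0, 4.0, 8.0]
standard_cts = [26.0, 25.0, 24.0, 23.0, 22.0]
slope, intercept = fit_standard_curve(standards_ng, standard_cts)

# Hypothetical triplicate Ct values for one test individual at one locus.
triplicate_cts = [24.1, 24.0, 23.9]
quantities = [starting_quantity(ct, slope, intercept) for ct in triplicate_cts]
mean_quantity, sd_quantity = mean(quantities), stdev(quantities)
```

Dividing the mean quantity at the test locus by that at the control locus (PAX9), each relative to the reference individual, would then give a relative copy number estimate.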

SUPPLEMENTARY MATERIAL

Supplementary Material is available at HMG Online.

ACKNOWLEDGEMENTS

We thank Edward Hollox for insightful comments on an earlier draft of the manuscript, Angela Carville, Keith Mansfield, and the NEPRC Genetics Core for providing rhesus macaque blood samples, and Jan Smidek for assistance with Figure 1. The work was supported in part by National Institutes of Health (NIH) National Human Genome Research Institute grant 1 P41 HG004221-01 (C.L.), and NIH National Center for Research Resources grant RR00168 (E.J.V., W.E.J., G.M.M.). J.O.K. was supported by a Marie Curie Outgoing International Fellowship. A.S.L. was supported by a Catherine Innes Ireland Fellowship and the Harvard College Research Program.

Conflict of Interest statement. None declared.

REFERENCES

1. Feuk L., Carson A.R., Scherer S.W. Structural variation in the human genome. Nat. Rev. Genet., 2006, vol. 7 (pg. 85-97).
2. Freeman J.L., Perry G.H., Feuk L., Redon R., McCarroll S.A., Altshuler D.M., Aburatani H., Jones K.W., Tyler-Smith C., Hurles M.E., et al. Copy number variation: new insights in genome diversity. Genome Res., 2006, vol. 16 (pg. 949-961).
3. Iafrate A.J., Feuk L., Rivera M.N., Listewnik M.L., Donahoe P.K., Qi Y., Scherer S.W., Lee C. Detection of large-scale variation in the human genome. Nat. Genet., 2004, vol. 36 (pg. 949-951).
4. Sebat J., Lakshmi B., Troge J., Alexander J., Young J., Lundin P., Maner S., Massa H., Walker M., Chi M., et al. Large-scale copy number polymorphism in the human genome. Science, 2004, vol. 305 (pg. 525-528).
5. Redon R., Ishikawa S., Fitch K.R., Feuk L., Perry G.H., Andrews T.D., Fiegler H., Shapero M.H., Carson A.R., Chen W., et al. Global variation in copy number in the human genome. Nature, 2006, vol. 444 (pg. 444-454).
6. Hollox E.J., Armour J.A., Barber J.C. Extensive normal copy number variation of a beta-defensin antimicrobial-gene cluster. Am. J. Hum. Genet., 2003, vol. 73 (pg. 591-600).
7. Gonzalez E., Kulkarni H., Bolivar H., Mangano A., Sanchez R., Catano G., Nibbs R.J., Freedman B.I., Quinones M.P., Bamshad M.J., et al. The influence of CCL3L1 gene-containing segmental duplications on HIV-1/AIDS susceptibility. Science, 2005, vol. 307 (pg. 1434-1440).
8. Linzmeier R.M., Ganz T. Human defensin gene copy number polymorphisms: comprehensive analysis of independent variation in alpha- and beta-defensin regions at 8p22-p23. Genomics, 2005, vol. 86 (pg. 423-430).
9. Fellermann K., Stange D.E., Schaeffeler E., Schmalzl H., Wehkamp J., Bevins C.L., Reinisch W., Teml A., Schwab M., Lichter P., et al. A chromosome 8 gene-cluster polymorphism with low human beta-defensin 2 gene copy number predisposes to Crohn disease of the colon. Am. J. Hum. Genet., 2006, vol. 79 (pg. 439-448).
10. McCarroll S.A., Hadnott T.N., Perry G.H., Sabeti P.C., Zody M.C., Barrett J.C., Dallaire S., Gabriel S.B., Lee C., Daly M.J., et al. Common deletion polymorphisms in the human genome. Nat. Genet., 2006, vol. 38 (pg. 86-92).
11. Perry G.H., Dominy N.J., Claw K.G., Lee A.S., Fiegler H., Redon R., Werner J., Villanea F.A., Mountain J.L., Misra R., et al. Diet and the evolution of human amylase gene copy number variation. Nat. Genet., 2007, vol. 39 (pg. 1256-1260).
12. Stranger B.E., Forrest M.S., Dunning M., Ingle C.E., Beazley C., Thorne N., Redon R., Bird C.P., de Grassi A., Lee C., et al. Relative impact of nucleotide and copy number variation on gene expression phenotypes. Science, 2007, vol. 315 (pg. 848-853).
13. Ariyoshi N., Miyamoto M., Umetsu Y., Kunitoh H., Dosaka-Akita H., Sawamura Y., Yokota J., Nemoto N., Sato K., Kamataki T. Genetic polymorphism of CYP2A6 gene and tobacco-induced lung cancer risk in male smokers. Cancer Epidemiol. Biomarkers Prev., 2002, vol. 11 (pg. 890-894).
14. Aitman T.J., Dong R., Vyse T.J., Norsworthy P.J., Johnson M.D., Smith J., Mangion J., Roberton-Lowe C., Marshall A.J., Petretto E., et al. Copy number polymorphism in Fcgr3 predisposes to glomerulonephritis in rats and humans. Nature, 2006, vol. 439 (pg. 851-855).
15. Le Marechal C., Masson E., Chen J.M., Morel F., Ruszniewski P., Levy P., Ferec C. Hereditary pancreatitis caused by triplication of the trypsinogen locus. Nat. Genet., 2006, vol. 38 (pg. 1372-1374).
16. Park J., Chen L., Ratnashinge L., Sellers T.A., Tanner J.P., Lee J.H., Dossett N., Lang N., Kadlubar F.F., Ambrosone C.B., et al. Deletion polymorphism of UDP-glucuronosyltransferase 2B17 and risk of prostate cancer in African American and Caucasian men. Cancer Epidemiol. Biomarkers Prev., 2006, vol. 15 (pg. 1473-1478).
17. Fanciulli M., Norsworthy P.J., Petretto E., Dong R., Harper L., Kamesh L., Heward J.M., Gough S.C., de Smith A., Blakemore A.I., et al. FCGR3B copy number variation is associated with susceptibility to systemic, but not organ-specific, autoimmunity. Nat. Genet., 2007, vol. 39 (pg. 721-723).
18. Yang Y., Chung E.K., Wu Y.L., Savelli S.L., Nagaraja H.N., Zhou B., Hebert M., Jones K.N., Shu Y., Kitzmiller K., et al. Gene copy-number variation and associated polymorphisms of complement component C4 in human systemic lupus erythematosus (SLE): low copy number is a risk factor for and high copy number is a protective factor against SLE susceptibility in European Americans. Am. J. Hum. Genet., 2007, vol. 80 (pg. 1037-1054).
19. Sharp A.J., Locke D.P., McGrath S.D., Cheng Z., Bailey J.A., Vallente R.U., Pertz L.M., Clark R.A., Schwartz S., Segraves R., et al. Segmental duplications and copy-number variation in the human genome. Am. J. Hum. Genet., 2005, vol. 77 (pg. 78-88).
20. Tuzun E., Sharp A.J., Bailey J.A., Kaul R., Morrison V.A., Pertz L.M., Haugen E., Hayden H., Albertson D., Pinkel D., et al. Fine-scale structural variation of the human genome. Nat. Genet., 2005, vol. 37 (pg. 727-732).
21. Conrad D.F., Andrews T.D., Carter N.P., Hurles M.E., Pritchard J.K. A high-resolution survey of deletion polymorphism in the human genome. Nat. Genet., 2006, vol. 38 (pg. 75-81).
22. Goidts V., Cooper D.N., Armengol L., Schempp W., Conroy J., Estivill X., Nowak N., Hameister H., Kehrer-Sawatzki H. Complex patterns of copy number variation at sites of segmental duplications: an important category of structural variation in the human genome. Hum. Genet., 2006, vol. 120 (pg. 270-284).
23. Hinds D.A., Kloek A.P., Jen M., Chen X., Frazer K.A. Common deletions and SNPs are in linkage disequilibrium in the human genome. Nat. Genet., 2006, vol. 38 (pg. 82-85).
24. Khaja R., Zhang J., MacDonald J.R., He Y., Joseph-George A.M., Wei J., Rafiq M.A., Qian C., Shago M., Pantano L., et al. Genome assembly comparison identifies structural variants in the human genome. Nat. Genet., 2006, vol. 38 (pg. 1413-1418).
25. Locke D.P., Sharp A.J., McCarroll S.A., McGrath S.D., Newman T.L., Cheng Z., Schwartz S., Albertson D.G., Pinkel D., Altshuler D.M., et al. Linkage disequilibrium and heritability of copy-number polymorphisms within duplicated regions of the human genome. Am. J. Hum. Genet., 2006, vol. 79 (pg. 275-290).
26. de Smith A.J., Tsalenko A., Sampas N., Scheffer A., Yamada N.A., Tsang P., Ben-Dor A., Yakhini Z., Ellis R.J., Bruhn L., et al. Array CGH analysis of copy number variation identifies 1284 new genes variant in healthy white males: implications for association studies of complex diseases. Hum. Mol. Genet., 2007, vol. 16 (pg. 2783-2794).
27. Wong K.K., deLeeuw R.J., Dosanjh N.S., Kimm L.R., Cheng Z., Horsman D.E., MacAulay C., Ng R.T., Brown C.J., Eichler E.E., et al. A comprehensive analysis of common copy-number variations in the human genome. Am. J. Hum. Genet., 2007, vol. 80 (pg. 91-104).
28. Li J., Jiang T., Mao J.H., Balmain A., Peterson L., Harris C., Rao P.H., Havlak P., Gibbs R., Cai W.W. Genomic segmental polymorphisms in inbred mouse strains. Nat. Genet., 2004, vol. 36 (pg. 952-954).
29. Adams D.J., Dermitzakis E.T., Cox T., Smith J., Davies R., Banerjee R., Bonfield J., Mullikin J.C., Chung Y.J., Rogers J., et al. Complex haplotypes, copy number polymorphisms and coding variation in two recently divergent mouse strains. Nat. Genet., 2005, vol. 37 (pg. 532-536).
30. Perry G.H., Tchinda J., McGrath S.D., Zhang J., Picker S.R., Caceres A.M., Iafrate A.J., Tyler-Smith C., Scherer S.W., Eichler E.E., et al. Hotspots for copy number variation in chimpanzees and humans. Proc. Natl. Acad. Sci. USA, 2006, vol. 103 (pg. 8006-8011).
31. Cutler G., Marshall L.A., Chin N., Baribault H., Kassner P.D. Significant gene content variation characterizes the genomes of inbred mouse strains. Genome Res., 2007, vol. 17 (pg. 1743-1754).
32. Egan C.M., Sridhar S., Wigler M., Hall I.M. Recurrent DNA copy number variation in the laboratory mouse. Nat. Genet., 2007, vol. 39 (pg. 1384-1389).
33. Graubert T.A., Cahan P., Edwin D., Selzer R.R., Richmond T.A., Eis P.S., Shannon W.D., Li X., McLeod H.L., Cheverud J.M., et al. A high-resolution map of segmental DNA copy number variation in the mouse genome. PLoS Genet., 2007, vol. 3 pg. e3.
34. Gibbs R.A., Rogers J., Katze M.G., Bumgarner R., Weinstock G.M., Mardis E.R., Remington K.A., Strausberg R.L., Venter J.C., Wilson R.K., et al. Evolutionary and biomedical insights from the rhesus macaque genome. Science, 2007, vol. 316 (pg. 222-234).
35. Shannon R.P. SIV cardiomyopathy in non-human primates. Trends Cardiovasc. Med., 2001, vol. 11 (pg. 242-246).
36. Wallen K. Hormonal influences on sexually differentiated behavior in nonhuman primates. Front. Neuroendocrinol., 2005, vol. 26 (pg. 7-26).
37. Barr C.S., Goldman D. Non-human primate models of inheritance vulnerability to alcohol use disorders. Addict. Biol., 2006, vol. 11 (pg. 374-385).
38. Barratt-Boyes S.M., Brown K.N., Melhem N., Soloff A.C., Gleason S.M. Understanding and exploiting dendritic cells in human immunodeficiency virus infection using the nonhuman primate model. Immunol. Res., 2006, vol. 36 (pg. 265-274).
39. Nath B.M., Schumann K.E., Boyer J.D. The chimpanzee and other non-human-primate models in HIV-1 vaccine research. Trends Microbiol., 2000, vol. 8 (pg. 426-431).
40. Byrne J., Pedersen D., Clepper L., Nelson M., Sanger W., Gokhale S., Wolf D., Mitalipov S. Producing primate embryonic stem cells by somatic cell nuclear transfer. Nature, 2007, vol. 450 (pg. 497-502).
41. Locke D.P., Segraves R., Carbone L., Archidiacono N., Albertson D.G., Pinkel D., Eichler E.E. Large-scale variation among human and great ape genomes determined by array comparative genomic hybridization. Genome Res., 2003, vol. 13 (pg. 347-357).
42. Fortna A., Kim Y., MacLaren E., Marshall K., Hahn G., Meltesen L., Brenton M., Hink R., Burgers S., Hernandez-Boussard T., et al. Lineage-specific gene duplication and loss in human and great ape evolution. PLoS Biol., 2004, vol. 2 pg. e207.
43. Cheng Z., Ventura M., She X., Khaitovich P., Graves T., Osoegawa K., Church D., DeJong P., Wilson R.K., Paabo S., et al. A genome-wide comparison of recent chimpanzee and human segmental duplications. Nature, 2005, vol. 437 (pg. 88-93).
44. Goidts V., Armengol L., Schempp W., Conroy J., Nowak N., Muller S., Cooper D.N., Estivill X., Enard W., Szamalek J.M., et al. Identification of large-scale human-specific copy number differences by inter-species array comparative genomic hybridization. Hum. Genet., 2006, vol. 119 (pg. 185-198).
45. Wilson G.M., Flibotte S., Missirlis P.I., Marra M.A., Jones S., Thornton K., Clark A.G., Holt R.A. Identification by full-coverage array CGH of human DNA copy number increases relative to chimpanzee and gorilla. Genome Res., 2006, vol. 16 (pg. 173-181).
46. Dumas L., Kim Y.H., Karimpour-Fard A., Cox M., Hopkins J., Pollack J.R., Sikela J.M. Gene copy number variation spanning 60 million years of human and primate evolution. Genome Res., 2007, vol. 17 (pg. 1266-1277).
47. Harris R.A., Rogers J., Milosavljevic A. Human-specific changes of genome structure detected by genomic triangulation. Science, 2007, vol. 316 (pg. 235-237).
48. Karolchik D., Hinrichs A.S., Furey T.S., Roskin K.M., Sugnet C.W., Haussler D., Kent W.J. The UCSC table browser data retrieval tool. Nucleic Acids Res., 2004, vol. 32 (pg. D493-D496).
49. Bailey J.A., Yavor A.M., Massa H.F., Trask B.J., Eichler E.E. Segmental duplications: organization and impact within the current human genome project assembly. Genome Res., 2001, vol. 11 (pg. 1005-1017).
50. Stankiewicz P., Lupski J.R. Genome architecture, rearrangements and genomic disorders. Trends Genet., 2002, vol. 18 (pg. 74-82).
51. Bailey J.A., Eichler E.E. Primate segmental duplications: crucibles of evolution, diversity and disease. Nat. Rev. Genet., 2006, vol. 7 (pg. 552-564).
52. Bailey J.A., Gu Z., Clark R.A., Reinert K., Samonte R.V., Schwartz S., Adams M.D., Myers E.W., Li P.W., Eichler E.E. Recent segmental duplications in the human genome. Science, 2002, vol. 297 (pg. 1003-1007).
53. Kumar S., Hedges S.B. A molecular timescale for vertebrate evolution. Nature, 1998, vol. 392 (pg. 917-920).
54. Steiper M.E., Young N.M., Sukarna T.Y. Genomic data support the hominoid slowdown and an Early Oligocene estimate for the hominoid-cercopithecoid divergence. Proc. Natl. Acad. Sci. USA, 2004, vol. 101 (pg. 17021-17026).
55. Lam K.W., Jeffreys A.J. Processes of de novo duplication of human α-globin genes. Proc. Natl. Acad. Sci. USA, 2007, vol. 104 (pg. 10950-10955).
56. Reiter L.T., Hastings P.J., Nelis E., De Jonghe P., Van Broeckhoven C., Lupski J.R. Human meiotic recombination products revealed by sequencing a hotspot for homologous strand exchange in multiple HNPP deletion patients. Am. J. Hum. Genet., 1998, vol. 62 (pg. 1023-1033).
57. Jackson M.S., Oliver K., Loveland J., Humphray S., Dunham I., Rocchi M., Viggiano L., Park J.P., Hurles M.E., Santibanez-Koref M. Evidence for widespread reticulate evolution within human duplicons. Am. J. Hum. Genet., 2005, vol. 77 (pg. 824-840).
58. Chen J.M., Cooper D.N., Chuzhanova N., Ferec C., Patrinos G.P. Gene conversion: mechanisms, evolution and human disease. Nat. Rev. Genet., 2007, vol. 8 (pg. 762-775).
59. Caceres M., Sullivan R.T., Thomas J.W. A recurrent inversion on the eutherian X chromosome. Proc. Natl. Acad. Sci. USA, 2007, vol. 104 (pg. 18571-18576).
60. Cooper G.M., Nickerson D.A., Eichler E.E. Mutational and selective effects on copy-number variants in the human genome. Nat. Genet., 2007, vol. 39 (pg. S22-S29).
61. Lee J.A., Lupski J.R. Genomic rearrangements and gene copy-number alterations as a cause of nervous system disorders. Neuron, 2006, vol. 52 (pg. 103-121).
62. Naser S.A., Ghobrial G., Romero C., Valentine J.F. Culture of Mycobacterium avium subspecies paratuberculosis from the blood of patients with Crohn’s disease. Lancet, 2004, vol. 364 (pg. 1039-1044).
63. Rioux J.D., Xavier R.J., Taylor K.D., Silverberg M.S., Goyette P., Huett A., Green T., Kuballa P., Barmada M.M., Datta L.W., et al. Genome-wide association study identifies new susceptibility loci for Crohn disease and implicates autophagy in disease pathogenesis. Nat. Genet., 2007, vol. 39 (pg. 596-604).
64. Joly S., Maze C., McCray P.B. Jr, Guthmiller J.M. Human beta-defensins 2 and 3 demonstrate strain-selective activity against oral microorganisms. J. Clin. Microbiol., 2004, vol. 42 (pg. 1024-1029).
65. Buckland P.R. Polymorphically duplicated genes: their relevance to phenotypic variation in humans. Ann. Med., 2003, vol. 35 (pg. 308-315).
66. Tiano H.F., Wang R.L., Hosokawa M., Crespi C., Tindall K.R., Langenbach R. Human CYP2A6 activation of 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK): mutational specificity in the gpt gene of AS52 cells. Carcinogenesis, 1994, vol. 15 (pg. 2859-2866).
67. Messina E.S., Tyndale R.F., Sellers E.M. A major role for CYP2A6 in nicotine C-oxidation by human liver microsomes. J. Pharmacol. Exp. Ther., 1997, vol. 282 (pg. 1608-1614).
68. Oscarson M., McLellan R.A., Gullsten H., Agundez J.A., Benitez J., Rautio A., Raunio H., Pelkonen O., Ingelman-Sundberg M. Identification and characterisation of novel polymorphisms in the CYP2A locus: implications for nicotine metabolism. FEBS Lett., 1999, vol. 460 (pg. 321-327).
69. Fukami T., Nakajima M., Yamanaka H., Fukushima Y., McLeod H.L., Yokoi T. A novel duplication type of CYP2A6 gene in African-American population. Drug Metab. Dispos., 2007, vol. 35 (pg. 515-520).
70. Kamataki T., Nunoya K., Sakai Y., Kushida H., Fujita K. Genetic polymorphism of CYP2A6 in relation to cancer. Mutat. Res., 1999, vol. 428 (pg. 125-130).
71. Miyamoto M., Umetsu Y., Dosaka-Akita H., Sawamura Y., Yokota J., Kunitoh H., Nemoto N., Sato K., Ariyoshi N., Kamataki T. CYP2A6 gene deletion reduces susceptibility to lung cancer. Biochem. Biophys. Res. Commun., 1999, vol. 261 (pg. 658-660).
72. Sellers E.M., Tyndale R.F. Mimicking gene defects to treat drug dependence. Ann. N. Y. Acad. Sci., 2000, vol. 909 (pg. 233-246).
73. Ingelman-Sundberg M., Sim S.C., Gomez A., Rodriguez-Antona C. Influence of cytochrome P450 polymorphisms on drug therapies: Pharmacogenetic, pharmacoepigenetic and clinical aspects. Pharmacol. Ther., 2007, vol. 116 (pg. 496-526).
74. McCarroll S.A., Altshuler D.M. Copy-number variation and association studies of human disease. Nat. Genet., 2007, vol. 39 (pg. S37-S42).
75. Gilad Y., Rifkin S.A., Bertone P., Gerstein M., White K.P. Multi-species microarrays reveal the effect of sequence divergence on gene expression profiles. Genome Res., 2005, vol. 15 (pg. 674-680).
76. Workman C., Jensen L.J., Jarmer H., Berka R., Gautier L., Nielser H.B., Saxild H.H., Nielsen C., Brunak S., Knudsen S. A new non-linear normalization method for reducing variability in DNA microarray experiments. Genome Biol., 2002, vol. 3, research0048.
77. Korbel J.O., Urban A.E., Grubert F., Du J., Royce T.E., Starr P., Zhong G., Emanuel B.S., Weissman S.M., Snyder M., et al. Systematic prediction and validation of breakpoints associated with copy-number variants in the human genome. Proc. Natl. Acad. Sci. USA, 2007, vol. 104 (pg. 10110-10115).
78. Slierendregt B.L., Otting N., van Besouw N., Jonker M., Bontrop R.E. Expansion and contraction of rhesus macaque DRB regions by duplication and deletion. J. Immunol., 1994, vol. 152 (pg. 2298-2307).
79. Otting N., de Vos-Rouweler A.J., Heijmans C.M., de Groot N.G., Doxiadis G.G., Bontrop R.E. MHC class I A region diversity and polymorphism in macaque species. Immunogenetics, 2007, vol. 59 (pg. 367-375).
80. Kent W.J., Sugnet C.W., Furey T.S., Roskin K.M., Pringle T.H., Zahler A.M., Haussler D. The human genome browser at UCSC. Genome Res., 2002, vol. 12 (pg. 996-1006).
81. Hollox E.J., Huffmeier U., Zeeuwen P.L., Palla R., Lascorz J., Rodijk-Olthuis D., van de Kerkhof P.C., Traupe H., de Jongh G., den Heijer M., et al. Psoriasis is associated with increased beta-defensin genomic copy number. Nat. Genet., 2008, vol. 40 (pg. 23-25).
82. Haberl M., Anwald B., Klein K., Weil R., Fuss C., Gepdiremen A., Zanger U.M., Meyer U.A., Wojnowski L. Three haplotypes associated with CYP2A6 phenotypes in Caucasians. Pharmacogenet. Genomics, 2005, vol. 15 (pg. 609-624).
83. Girard C., Barbier O., Veilleux G., El-Alfy M., Belanger A. Human uridine diphosphate-glucuronosyltransferase UGT2B7 conjugates mineralocorticoid and glucocorticoid metabolites. Endocrinology, 2003, vol. 144 (pg. 2659-2668).
84. Sawyer M.B., Innocenti F., Das S., Cheng C., Ramirez J., Pantle-Fisher F.H., Wright C., Badner J., Pei D., Boyett J.M., et al. A pharmacogenetic study of uridine diphosphate-glucuronosyltransferase 2B7 in patients receiving morphine. Clin. Pharmacol. Ther., 2003, vol. 73 (pg. 566-574).
85. Daly A.K., Aithal G.P., Leathart J.B., Swainsbury R.A., Dang T.S., Day C.P. Genetic susceptibility to diclofenac-induced hepatotoxicity: contribution of UGT2B7, CYP2C8, and ABCC2 genotypes. Gastroenterology, 2007, vol. 132 (pg. 272-281).