Chromosomal Mapping and Candidate Gene Discovery of Chicken Developmental Mutants and Genome-Wide Variation Analysis of MHC Congenics

The chicken has been widely used in experimental research for over 100 years, given its importance to agriculture and its utility as a model for vertebrate biology and biomedical pursuits. Herein we used advanced technologies to investigate the genomic characteristics of specialized chicken congenic genetic resources developed on a highly inbred background. An Illumina 3K chicken single nucleotide polymorphism (SNP) array was utilized to study variation within and among major histocompatibility complex (MHC)-congenic lines as well as to investigate line-specific genomic diversity, inbreeding coefficients, and MHC B haplotype-specific GGA 16 SNP profiles. We also investigated developmental mutant-congenic lines to map a number of single-gene mutations using both the Illumina 3K array and a recently developed Illumina 60K chicken SNP array. In addition to identifying the chromosomes and specific subregions, the mapping results affirmed prior analyses indicating recessive or dominant and autosomal or sex chromosome modes of inheritance. Priority candidate genes are described for each mutation based on association with similar phenotypes in other vertebrates. These single-gene mutations provide a means of studying amniote development and in particular serve as invaluable biomedical models for similar malformations found in humans.

Animal models play a significant role in the study of the genetics and molecular basis of diseases and disorders, provide insight into the diversity of mammalian and avian species, and contribute to our overall understanding of vertebrate biology. One of these models, the chicken, has served as a versatile research organism for over 100 years for the study of development, genetics, virology, immunology, oncology, and physiology (ICGSC 2004; Siegel et al. 2006). The success of the chicken as a model organism is due to its in ovo development, relatively short generation interval, and large number of progeny (ICPMC 2004; Siegel et al. 2006; Muir et al. 2008a, 2008b). Another factor key to the importance of the chicken as a model since the time of Bateson is the diversity of phenotypes available for study, with early work focusing on feather and comb patterns segregating within and among breeds (Bateson and Saunders 1902; Spillman 1909; Serebrovsky and Petrov 1930; Hutt and Lamoreux 1940; Hutt 1949). Throughout the 20th century, numerous unique genetic resources were developed at university and government institutions and within commercial companies (Pisenti et al. 1999; Delany 2004; Delany and Gessaro 2008). The diverse genotypes and phenotypes, coupled with the chicken genome sequence assembly (ICGSC 2004) and the identification of 2.8 million single nucleotide polymorphisms (SNPs) (ICPMC 2004; Muir et al. 2008a, 2008b), in combination with advancing technology provide the tools to analyze genetic variation underpinning physiological, developmental, immunogenetic, and other traits of importance. Such knowledge contributes to our understanding of chicken, avian, and vertebrate biology.
In this paper, we discuss the use of advanced SNP genotyping technologies for the analysis of genetic variation within and among specialized and well-studied chicken genetic resources, including selected inbreds, major histocompatibility complex (MHC)-congenic inbred lines, and developmental mutant-congenic lines.
The chicken MHC (a.k.a. B system) resides on GGA 16 and has been the focus of many genetic studies, in addition to its use as a model for the primordial vertebrate MHC and its subsequent evolution (Shiina et al. 2007; Kaufman 2008; Delany et al. 2009). Furthermore, specific MHC B haplotypes have been shown to be of critical importance for disease resistance and susceptibility, which is important for agricultural production of poultry (Abplanalp et al. 1985; Bacon 1987; Senseney et al. 2000; Taylor 2004; Kaufman 2008). In 1975, researchers at the University of California-Davis (UCD) introgressed various MHC B haplotypes onto UCD-003, a 99.9% inbred line, thereby creating MHC-congenic inbred lines (Abplanalp 1992). The MHC-congenic lines were used to examine the influence of the chicken MHC in a number of studies ranging from the mechanisms of immune resistance to mate preference and fertility rates to its effect on animal health and production traits (Abplanalp et al. 1985; Bacon 1987; Lamont 1989; Bacon et al. 2000; Zhou and Lamont 2003; Delany and Gessaro 2008). These MHC-congenic lines have been maintained for 3 decades and provide the opportunity to explore aspects of congenic purity and mutation rate and to study genetic divergence over time.
In addition to its value in the field of poultry disease and general immunogenetics, the chicken has long been a premier vertebrate model for the study of development, and in particular limb development (Tickle 2004; see Antin et al. 2004). Given the community of researchers using the chicken to study vertebrate development, as well as other disciplines, and the advent of the genomic technologies, the chicken was recently recognized by the National Institutes of Health (2009) as a model organism for biomedical research (http://www.nih.gov/science/models/). The many naturally occurring chicken mutants exhibiting developmental defects offer opportunities for the study of unique aspects of developmental biology (Romanoff 1972; Abbott and Yee 1975; Somes 1990a, 1990b; Pisenti et al. 1999; Delany 2004). The UCD maintains a number of developmental mutant-congenic lines (herein referred to as mutant congenic), which have been studied for their mode of inheritance and phenotypes since the 1960s. Similar to the MHC congenics discussed above, the 10 developmental mutations reviewed in this study are on the highly inbred UCD-003 background. These lines segregate mutations causing craniofacial, limb, skeletal, muscular, and integument defects (Somes 1990a, 1990b; Pisenti et al. 1999 and references therein) and therefore provide a vehicle to detect and identify the specific genetic mutations that cause these defects. Many of the defects show similarity with human conditions, both inherited and sporadic; thus, the chicken is a valuable resource for the scientific community to study the etiology of both human and animal defects and syndromes, which are refractory to analysis in other systems for a variety of reasons.
The initial chicken genome sequencing effort was completed in 2004 (ICGSC 2004; ICPMC 2004; Wallis et al. 2004) and, as a result, identification of genomic variation and candidate gene regions is now feasible. Here, we utilized 2 SNP array platforms, 3K and 60K, to explore the genetic variation segregating in specialized genetic resources, including a series of MHC-congenic and 3 inbred lines, in addition to a series of developmental mutant congenics in order to map the chromosomal locations for the mutations. In the case of the MHC congenics, the analysis provided insight into each line's genomic diversity, segregating regions, inbreeding coefficients, and MHC B haplotype-specific GGA 16 SNP profile. In the case of the mutant congenics, we have affirmed mode of inheritance (autosomal or sex linked), discovered specific chromosomal locations for the mutations, and identified priority candidate genes.

Genetic Lines
Highly inbred lines as well as MHC- and developmental mutant-congenic inbred lines were investigated for genetic variation by SNP analyses using Illumina GoldenGate 3K and 60K iSelect SNP chip array platforms. The inbred and MHC-congenic lines were analyzed to determine whether variation exists within and among the lines as well as relative to the parent background genotype (in the case of the congenics). Within- and among-line comparisons were made and, in some cases, included archived samples from 20 years ago. In the case of the mutant congenics, the purpose of the SNP analysis was to identify the causative region (CR) associating with the various mutations so as to begin investigating candidate loci to determine causative genes. The developmental mutations were characterized decades ago for phenotype and mode of inheritance and were more recently developed into congenic lines (Pisenti et al. 1999). The background genotype for both categories of congenics is the single comb white leghorn UCD-003 inbred line (F > 0.99, Abplanalp 1992). All congenic lines are denoted as genetic_line_name.003 or, as seen in the tables, gene_symbol.003. In addition to UCD-003, 2 additional inbred lines were studied, UCD-058 and -082, along with 10 MHC congenics, and 10 mutant congenics including UCD-Polydactyly.003 a.k.a. Po.003 (Po), -Coloboma.003 a.k.a. Co.003 (co, also known as cm), -Diplopodia-1.003 a.k.a. Dp-1.003 (dp-1), -Diplopodia-3.003 a.k.a. Dp-3.003 (dp-3), -Diplopodia-4.003 a.k.a. Dp-4.003 (dp-4), -Eudiplopodia.003 a.k.a. Eu.003 (eu), -Stumpy.003 a.k.a. Stu.003 (stu), -Limbless.003 a.k.a. Ll.003 (ll), -Talpid-2.003 a.k.a. Ta-2.003 (ta-2), and -Crooked-neck dwarf.003 a.k.a. Cn.003 (cn). Table 1 describes the genetic characteristics and phenotypes of the lines.
Table 1 (developmental mutant-congenic rows). Columns: Line | Allele symbol | Year | RSV | MDV | Phenotype
Co.003 | co | 1970 / | - | - | Hemizygous female mutants (Z-/W) are moderately to severely dwarfed with mild to severe cleft palate; some lack preaxial digits or have truncated wings and legs; some are edematous; expression may be highly variable even with the same parents; temperature-sensitive phenotypic variability
Dp-1.003 | dp-1 | 1947 | - | - | Homozygotes (-/-) display moderate preaxial polydactyly, dwarfing, some with exposed viscera, occasional cleft palate, and shortened upper beak
Dp-3.003 | dp-3 | 1972 / | - | - | Homozygotes (-/-) display moderate preaxial polydactyly, dwarfing, some with exposed viscera, occasional cleft palate, and shortened upper beak
Dp-4.003 | dp-4 | 1972 / | - | - | Hemizygous female mutants (Z-/W) exhibit craniofacial defects along with truncation of the extremities, short stature, exposed viscera, preaxial polydactyly, mild to severe cleft palate, and a shortened upper beak; occasional elongation and duplication of the fibula
Eu.003 | eu | 1959 | - | - | Homozygotes (-/-) have extra digits (5-9) on the dorsal surfaces of the limb buds, which develop into extra, scaled bidorsal toes on the feet with conical nails and occasional dorsal knobs or digits on the wings; expression is variable; temperature-sensitive phenotype
Po.003 | Po | 1948 / | - | - | Homozygous (Po/Po) and heterozygous (Po/+) mutants may have an additional preaxial digit on one or both feet and/or wings, or affected individuals may display a longer than normal first digit on one or both feet
Ll.003 | ll | 1979 / | - | - | Homozygotes (-/-) do not form limb buds or limbs and usually have a shortened upper beak
Stu.003 | stu | 1966 / | - | - | Homozygotes (-/-) have conical leg buds and a poorly vascularized to nonvascularized allantois that never gets larger than the head; embryo death, typically between E5 and E7, is associated with massive multiple hemorrhages

Sample Collections and SNP Genotyping
From adults (inbred, MHC-congenic, mutant-congenic lines), approximately 0.3 ml of blood was collected in heparin tubes; some inbred and MHC-congenic samples were derived from archived semen samples dating to 1986 held in liquid nitrogen. Approximately 0.2 ml of blood was collected from embryos (mutant congenics) by capillary action using microhematocrit capillary tubes (Fisher Scientific) and placed in a 1.5-ml microcentrifuge tube containing 0.4 ml of 0.07 M sodium citrate/sodium chloride to prevent coagulation. To obtain homozygous mutant embryos for analysis, embryos from each genetic line were incubated to E9, a stage of development at which the phenotypes (normal +/+ and +/-; mutant -/- [autosomal recessive] or -/W [sex linked]) could be accurately discerned. Exceptions were Stu.003 (collected at E6), Co.003 and Eu.003 (collected at E10), and Cn.003 (collected at E14). DNA was isolated from the blood and semen samples using the DNeasy Blood & Tissue Kit (Qiagen) or from previously archived nuclear pellets according to Petitte et al. (1994) using a QIAamp DNA Blood Mini Kit (Qiagen). Initial SNP analyses were conducted for 9 of the mutant-congenic lines (UCD-Co.003, Dp-1.003, Dp-3.003, Dp-4.003, Eu.003, Po.003, Ll.003, Ta-2.003, and Stu.003), the 3 inbred lines (UCD-003, -058, and -082), and the 10 MHC-congenic lines using a 3072-SNP Illumina GoldenGate chicken SNP array, herein referred to as the 3K array. The SNPs were chosen from a 2.8 million SNP data set (ICPMC 2004) and were designed to be evenly spaced throughout the genome sequence (WASHUC1, Feb. 2004/galGal2); additionally, chromosomal recombination rates were taken into consideration (Gitter 2006; Muir et al. 2008a, 2008b). More recently, DNAs from the 10 developmental mutants (all those listed above and Cn.003) were analyzed using a 60,800-SNP chicken Illumina iSelect platform (herein referred to as the 60K array).
The polymorphic sequences selected for the 60K array were developed by the United States Department of Agriculture (USDA) Chicken Genome-Wide Marker-Assisted Selection Consortium, Cobb Vantress, and Hendrix Genetics (Cheng H, unpublished data). The 60K array design included the genome-wide even distribution of 60,800 SNP bins. SNPs were selected using a number of criteria: chromosomal recombination rates, previous SNP validation in one or more of the 3 prior SNP arrays, and/or a known minor allele frequency greater than 0.05 (Cheng H, unpublished data). Sequence information was based on the May 2006 chicken (Gallus gallus) v2.1 (galGal3) assembly (WASHUC2, May 2006; ICGSC 2004).

SNP Genotyping Array Chromosomal Coverage
The parameters and characteristics of the 3K SNP array are described in Muir et al. (2008a, 2008b). The 60K SNP array included all the chromosomes found on the 3K SNP array, plus GGA 25. Of the 60,800 total SNPs, 94.8% or 57,636 gave successful assays for this work. Successful coverage was variable across chromosomal categories. For example, the percent coverage, calculated from the total successful SNPs for this work, for the different chromosomal size classes found in the chicken genome is as follows: macrochromosomes (GGA 1-5), 46.6%; intermediate chromosomes (GGA 6-10), 15.5%; microchromosomes (GGA 11-28, 32), 30.4%; and sex chromosomes (Z and W), 5.2%. The SNPs that were not linked to a specific chromosome constituted 2.3% of the total successful SNPs.

Table 1 notes. Sources include Altman and Katz (1979), Abplanalp et al. (1992), Delany and Pisenti (1998), and Pisenti et al. (1999).
a Disease resistance: RSV, Rous sarcoma virus; MDV, Marek's disease virus; r, resistant; s, susceptible. Developmental mutants were not characterized for resistance/susceptibility to these viruses. All other gaps indicate unknown reaction to virus.
b MHC-congenic lines UCD-380.003 and 386.003 are now extinct.
c Allele symbol: B refers to the haplotype present at the MHC B region on GGA 16.
d Listed is the year in which each line was first established at UCB or UCD, or in the case of some of the developmental mutants, the year in which the mutation was introduced into the University of California genetic stock. If year of origin is unknown, the date of first identified publication is indicated with /.
e All MHC haplotypes were first crossed to UCD-003 in 1975. After 5 generations of backcrosses, the lines were then closed (1980) and mated inter se starting in 1981.
f All developmental mutations are embryonic lethal except for polydactyly (Po), which survives to adulthood. The developmental mutant-congenic lines were developed using the backcross method over a period of 5 or 6 generations (instituted in the mid-1980s), allowing for 96.9% and 98.4% parent background, respectively, once the lines were closed for inter se matings. In a very few cases, lines close to extinction (due to low fertility and small numbers) had to be rescued by occasional backcrossing again to UCD-003 (99.9% inbred).

SNP Analysis and CR Identification
In order to define the maximum and minimum CR(s), CR max and CR min, respectively, for each mutant-congenic line, the SNP results were compared with the congenic inbred background genotype of UCD-003. A polymorphic marker indicates that the mutation is linked to a particular SNP and chromosomal region. The following SNP pattern suggests the presence of a polymorphic marker: mutants type as homozygous (-/-) (denoted AA) for the alternative allele and nonaffected carriers type as heterozygous (+/-) (AB) when compared with the UCD-003 genotype (+/+) (BB). The outermost flanking markers exhibiting this pattern defined the CR min for the mutant congenics. The CR max was defined by identifying recombination events that resulted in a change in the polymorphic marker pattern. In some cases where a breakpoint could not be identified due to lack of sequence information, the average recombination rate for the specific chromosomal region was added to the CR min identified. Likewise, the 3 inbred (UCD-003, -058, and -082) and 10 MHC-congenic lines (UCD-253.003, -254.003, -312.003, -330.003, -331.003, -335.003, -336.003, -342.003, -380.003, and -386.003) were compared with the highly inbred UCD-003 using the technique described above. However, instead of typing a sample as "mutant" or "nonaffected carrier," the sample was simply assessed at all 2679 SNPs and the genotype (homozygous AA or BB; heterozygous AB) was identified. Polymorphisms between the 2 lines, UCD-003 and the line in question, were then identified.
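The marker pattern described above can be illustrated with a minimal Python sketch. It assumes genotypes have already been recoded so that UCD-003 types as BB, and it takes the nearest flanking markers that break the pattern as the CR max bounds; the `Marker` record, the function names, and that simplification are illustrative assumptions, not the procedure or software actually used in this study.

```python
from typing import NamedTuple, Optional, Sequence


class Marker(NamedTuple):
    """One SNP typed in a mutant-congenic line (hypothetical record layout)."""
    chrom: str
    pos: int                  # base-pair coordinate
    background: str           # UCD-003 genotype, expected "BB" after recoding
    mutants: Sequence[str]    # genotypes of affected (-/-) embryos
    carriers: Sequence[str]   # genotypes of nonaffected carriers (+/-)


def shows_linked_pattern(m: Marker) -> bool:
    """True if mutants are all AA and carriers all AB against a BB background."""
    return (
        m.background == "BB"
        and len(m.mutants) > 0
        and all(g == "AA" for g in m.mutants)
        and all(g == "AB" for g in m.carriers)
    )


def causative_region(markers: Sequence[Marker]) -> Optional[dict]:
    """Return the chromosome, CR min, and flanking CR max bounds, or None."""
    linked = [m for m in markers if shows_linked_pattern(m)]
    if not linked:
        return None
    chroms = {m.chrom for m in linked}
    if len(chroms) > 1:
        raise ValueError("linked markers on several chromosomes; still segregating")
    chrom = chroms.pop()
    # CR min: outermost markers that show the linked pattern.
    lo = min(m.pos for m in linked)
    hi = max(m.pos for m in linked)
    # CR max: nearest flanking markers where the pattern changes (None if absent).
    others = [m.pos for m in markers
              if m.chrom == chrom and not shows_linked_pattern(m)]
    left = max((p for p in others if p < lo), default=None)
    right = min((p for p in others if p > hi), default=None)
    return {"chrom": chrom, "cr_min": (lo, hi), "cr_max": (left, right)}
```

When no flanking marker exists on one side, the corresponding CR max bound is returned as `None`, which corresponds to the cases where a recombination-rate-based extension of the CR min was needed.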

Priority Candidate Gene Analysis and Identification
The genes within the CR max (candidate gene location) for each of the developmental mutant lines were identified using NCBI (

UCD-003
Since 1956, the UCD has maintained UCD-003 by a full-sib mating strategy. In 1984, the estimated inbreeding coefficient for UCD-003 was 0.99 (Abplanalp 1992). Thus, it comes as no surprise that, by 3K array analysis, 3 UCD-003 birds from 2006 were genetically identical to 3 birds from 1991 and 1 bird from 1996 (Table 2). Likewise, 2 UCD-003 individuals from 2008 analyzed using the 60K SNP array (described below) exhibited no variation (Table 2). Because the developmental mutant-congenic lines (discussed below) differed from UCD-003 only at the chromosomal positions linked to the mutations, it is concluded that the mutant congenics have the same MHC B17 haplotype as UCD-003.

UCD-058 and UCD-082
The white leghorn lines UCD-058 and UCD-082, selected for increased egg production, were estimated in 1984 to have inbreeding coefficients of 0.80 and 0.76, respectively (Abplanalp 1992). The 3K array analysis of the UCD-082 samples from 1986 suggests an inbreeding coefficient of approximately 0.855 ± 0.008, corroborating the 1986 inbreeding coefficient expected from full-sib matings. Although a within-line comparison for UCD-058 could not be conducted due to the lack of archived samples, an examination of the number of heterozygous SNPs suggests that in 1986 this line had an approximate inbreeding coefficient of 0.813, similar to the expected inbreeding coefficient. Inbreeding coefficients were calculated, based on individual reduction in total heterozygosity across loci, per Muir et al. (2008a).
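A heterozygosity-based inbreeding estimate of this kind can be sketched as follows. The sketch assumes the common form F = 1 - H_obs/H_exp, with the expected heterozygosity taken as the sum of 2p(1-p) over loci from allele frequencies in a hypothetical non-inbred reference population; the exact estimator of Muir et al. (2008a) is not reproduced here, and the function and parameter names are illustrative.

```python
from typing import Optional, Sequence


def inbreeding_coefficient(
    genotypes: Sequence[Optional[str]],
    ref_freq_a: Sequence[float],
) -> float:
    """Estimate F for one bird as 1 - (observed / expected heterozygous loci).

    genotypes  -- per-locus calls "AA", "AB", "BB", or None for a failed assay
    ref_freq_a -- per-locus frequency of allele A in an assumed reference population
    """
    if len(genotypes) != len(ref_freq_a):
        raise ValueError("genotypes and ref_freq_a must have the same length")
    observed = 0
    expected = 0.0
    for call, p in zip(genotypes, ref_freq_a):
        if call is None:          # skip loci without a genotype call
            continue
        observed += call == "AB"
        expected += 2.0 * p * (1.0 - p)
    if expected == 0.0:
        raise ValueError("no informative loci: expected heterozygosity is zero")
    return 1.0 - observed / expected
```

Averaging the per-bird values within a line would give a line-level estimate with a standard error, in the spirit of the 0.855 ± 0.008 value reported above for UCD-082.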
As egg production is a quantitative trait, many genes play a role in its regulation, from growth-regulating and shell-thickness genes to disease resistance genes, which have been mapped to a number of chromosomes (e.g., GGA 2, 4, 5, 8, 9, Z), with genetic variation impacting the overall trait (Feng et al. 1997; Tuiskula-Haavisto et al. 2002; Ankra-Badu and Aggrey 2005; Rubin et al. 2010). When comparing the 3K array results of UCD-058 and UCD-082, a total of 2100 SNPs (78.4%) were similar and 385 SNPs (14.4%) differed. Interestingly, a number of shared SNPs are located within or close to the proposed egg production genes previously identified by the studies indicated above (data review not shown, but available on request). A representative example of such data includes several GGA Z SNPs found in or around the growth hormone receptor and interferon genes, both of which have been shown to associate with egg production (Feng et al. 1997; Ankra-Badu and Aggrey 2005).

UCD MHC-Congenic Chicken Lines
Ten MHC-congenic inbred lines were analyzed for their SNP genotype features by within- (over time) and among-line comparisons as well as a comparison with the parental background genotype, UCD-003 (Table 2). In order to create the MHC congenics, all the lines were first backcrossed to UCD-003 for 5 generations with accompanying serological selection for B haplotype (see Tables 1, 2, and 3) prior to closing the lines with inter se crossing (Abplanalp 1992; Abplanalp et al. 1992). Recently, Fulton et al. (2006) used microsatellites (LEI0258 and MCW0371) to distinguish between the various MHC B haplotypes using the UCD lines and confirmed their B haplotypes.
Genome-Wide Variation (non-GGA 16) within and among MHC-Congenic Lines and UCD-003
Although the MHC congenics have been bred and selected for their homozygous MHC B haplotype for ~30 years, these lines are not pure congenics, meaning the MHC B haplotype is not the only region of variation. Because these lines were closed after 5 generations of backcross mating, 1.6% of DNA from the original haplotype source background is estimated to remain, with approximately 98.4% of the genome as UCD-003 (Abplanalp 1992).
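The expected background fraction follows from halving the donor contribution at each cross to the recurrent parent. A minimal sketch is given below; the function name is illustrative, and counting the initial cross to UCD-003 as the first generation (so that the initial cross plus 5 backcrosses gives 6 generations) is an assumption made here to reproduce the quoted 98.4% and 1.6% values.

```python
def expected_background_fraction(generations: int) -> float:
    """Expected recurrent-parent (UCD-003) genome fraction.

    `generations` counts every successive cross to the recurrent parent,
    including the initial cross, so the donor share is (1/2) ** generations.
    Selection on the introgressed region and linkage drag are ignored.
    """
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return 1.0 - 0.5 ** generations


for g in (5, 6):
    background = expected_background_fraction(g)
    print(f"{g} generations: {background:.1%} UCD-003, {1 - background:.1%} donor")
```

Under this counting, 6 generations leave an expected donor share of 1/64 (about 1.6%), and 5 generations leave 1/32 (about 3.1%).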
Overall, the 3K array results affirmed that a large portion of each MHC-congenic genome is essentially UCD-003 (average: 99.55%), with an average difference among all lines and UCD-003 of only 9.95 SNPs (Table 2). Such a high degree of similarity (99.55%) provides evidence that the breeding schemes used to maintain these genetic lines have been successful (i.e., no pedigree errors). The observed differences between the MHC congenics and UCD-003 are compiled in Table 2. Points of interest are that UCD-386.003 is segregating only at one SNP on GGA 4 and, likewise, UCD-312.003 segregates only at loci on GGA 16. Both of these results could be due to the small numbers available for reproduction, minimal variation compared with UCD-003 in the source line, and/or repeated backcrossing (to UCD-003) in order to maintain the line. Line UCD-380.003 shows segregation at 8 SNPs (across 4 chromosomes) when compared with UCD-003, the source of both the MHC B haplotype (B17) and the inbred background. Because UCD-003 was ~99.9% inbred when the MHC congenics were established in 1975, a possible genetic mechanism for such variation is spontaneous mutation (estimated rate in chicken = 0.0017; Kuhnlein et al. 1989); additionally, the number of birds in each MHC-congenic line is kept to a minimum, providing an opportunity to maintain the mutation in subsequent generations (see heterozygous loci, Table 2). After the lines were closed, birds were selected for breeding based on their performance (e.g., fertility and health) relative to their sibs, thereby possibly maintaining heterozygosity. We therefore assessed the level of heterozygosity within and among MHC-congenic lines (Table 2). Birds from both 1986 and 2003 or 2006 were analyzed to determine the genomic variation at the various time points.
One could expect a small portion of the genome to be heterozygous in 1986 and only a few SNPs to be segregating approximately 20 years later, except for cases of mutation and/or heterozygous advantage. Overall, our results suggest that the breeding schemes and selection for the specific MHC haplotype have been successful, as the 3K array analysis shows an increase in genetic homogeneity within these congenic lines over additional generations (Table 2). A representative example is UCD-336.003, wherein birds from 1986 exhibited 0.91% heterozygosity and birds from 2006 exhibited 0.07% heterozygosity compared with UCD-003.

GGA 16 Variation among the UCD MHC Congenics and UCD-003
Currently, the chicken genome browsers display only ~443 kb of MHC sequence data, and although only 6 SNPs from the 3K array were specific to GGA 16 (Table 3), differences were found among lines. Serological (Briles et al. 1950) and microsatellite (Fulton et al. 2006) tests confirm that UCD-003 and UCD-380.003 share the same MHC B haplotype, B17. Although 5 GGA 16 SNPs were identical, one differed (rs15788248, coordinate 58,575 bp), perhaps due to a spontaneous mutation as the locus was genotyped as heterozygous.

Table 3 notes.
b The GGA 16 coordinates are arbitrary and error prone as its assembly is poor due to the chromosomal size and the various repetitive regions throughout the microchromosome (e.g., NOR ribosomal RNA repeats, MHC tandem repeats, and the relatedness of the MHC genes/gene families) (Delany et al. 2009; ICGSC 2004).
c UCD-003 was used as the control for comparison as all MHC congenics were backcrossed onto the UCD-003 background.
d UCD MHC-congenic lines: for MHC B haplotype and immune response to MDV and RSV, see Table 1.

Robb et al. Genomic Characteristics in Specialized Chicken Genetic Resources
Notably, UCD-003 shows the same GGA 16 SNP pattern as UCD-386.003, and the latter is an MHC recombinant between B15 and B21 (BG/BF loci). MHC haplotypes B15 and B21 are maintained in UCD lines 254.003 and 330.003, respectively. SNP analysis of these 2 lines indicates that a recombination event occurred between SNPs rs15788248 and rs15026773 and between rs14096713 and rs14096690, thereby resulting in the GGA 16 SNP genotypic pattern displayed by the MHC recombinant (B15/21R) line UCD-386.003. The MHC haplotype BQ is similar to B21 (maintained in UCD lines 336.003 and 330.003, respectively) (Senseney et al. 2000 and references therein), and here, we show that their GGA 16 SNP genotypic patterns are the same (Table 3). As indicated in Table 1, the BQ haplotype originated from the red jungle fowl UCD-001, which was used in the reference population for mapping (Crittenden et al. 1993; Cheng and Crittenden 1994) and for the sequenced genome (ICGSC 2004). When comparing the reference genome UCD-001 (BQ) to UCD-336.003 (BQ), all GGA 16 SNP genotypes are similar.

MHC Congenics and Disease Resistance
Years of research on the MHC B haplotype have provided evidence of association with genetic susceptibility or resistance to tumor formation for Rous sarcoma virus (RSV) and Marek's disease (MD) (Abplanalp et al. 1985; Bacon 1987; Senseney et al. 2000; Taylor 2004). Such viral infections in the chicken result in decreased growth rate and egg production, processing condemnations, and high mortality (Witter and Schat 2003). We analyzed the SNP pattern of the MD and RSV resistant and susceptible lines in order to consider whether specific SNPs associated with viral resistance profiles. Although there was no specific SNP pattern identified (Tables 1-3), several points can be made. Both UCD-330.003 and -336.003 congenics have been bred to maintain the MHC haplotype B21 (BQ origin for UCD-336.003), which is known to be highly resistant to both MD and RSV (Abplanalp 1979 and references therein; Bacon 1987). Interestingly, of the 10 MHC congenics and UCD-003, only these 2 lines share a unique SNP (rs15026782, coordinate: 171,898) genotype. More specifically, this SNP resides within an exon of BF1, an MHC class I α-chain 1 gene found within the B core antigen processing region (Shiina et al. 2007). Further study is required to determine whether the genetic variant causing strong resistance is linked to this SNP.
Given the multigenic nature of MD resistance/susceptibility, we did not expect to identify a single causative SNP as numerous factors, genome-wide, contribute to the MD profile. However, in order to identify genomic regions contributing to MD resistance/susceptibility, we assessed the genotyping pattern across all MHC-congenic lines. It is of interest to note that there was one shared SNP (GGA 16 rs15026782) among all MD-susceptible lines; however, 3 of the 5 MD-resistant lines show the same genotype (Table 3). Additionally, as SNP rs14096713 is conserved among all 10 MHC-congenic lines and UCD-003, it is probable that no susceptible or resistant markers are linked to this SNP. Although no specific SNP was common among all MD-resistant lines, several lines shared SNP differences relative to UCD-003, which is susceptible to MD. For example, the following MD-resistant lines share common SNPs: UCD-380.003 and UCD-386.003 (SNP on GGA 4), UCD-330.003 and UCD-380.003 (2 SNPs on GGA 7), and UCD-253.003, UCD-330.003, and UCD-331.003 (3 SNPs on GGA 17) (Table 2).

UCD Developmental Mutant-Congenic Chicken Lines
The mutant-congenic lines (Table 1, Figure 1) were analyzed using both 3K and 60K SNP arrays as the advanced technologies developed. The initial map locations for these single-gene mutations identified by the 3K array are listed in Supplementary Table 1. The 60K array analyses (Table 4) significantly improved the resolution and number of markers within the trait-associated region for 6 mutations: co, dp-1, dp-4, Po, ll, and stu. For example, the 3K array mutant analyses identified one marker for both dp-4 and ll, whereas the 60K array analyses associated 25 and 42, respectively. Additionally, the 60K array enabled the identification of chromosomal regions for 2 mutations (dp-3 and eu) for which no chromosome was identified by the 3K array. Because a number of chromosomal regions are still segregating in 2 lines, Ta-2.003 and Cn.003 (5 and 19 regions, respectively; Supplementary Figure 1, Supplementary Table 2), priority candidate genes are not discussed for these 2 mutations. Specific SNPs linked to all 10 mutant congenics are available on request.

Figure 1 (caption, panels B-I). (B) Coloboma.003 mutant (Z-/W) displaying moderate coloboma, cleft palate, severe dwarfism, and extreme visceral (internal organs) exposure. (C) Diplopodia-1.003 mutant (-/-) displaying moderate cleft palate, moderate dwarfism, exposed viscera, and preaxial (thumb) polydactyly (digit(s) duplication) on both wings and feet. (D) Diplopodia-3.003 mutant (-/-) displaying mild cleft palate, moderate dwarfism, exposed viscera, and preaxial polydactyly on both wings and feet. (E) Diplopodia-4.003 mutant (Z-/W) displaying moderate cleft palate, moderate dwarfism, preaxial polydactyly on both wings and feet, and moderately exposed viscera. (F) Eudiplopodia.003 mutant (-/-) displaying bidorsal digit duplication on feet and additional wing "digit" knobs. (G) Polydactyly.003 mutant (+/-) displaying preaxial polydactyly (single duplication) on both feet. (H) Limbless.003 mutant (-/-) displaying an absence of all limbs and a shortened upper beak. (I) Stumpy.003 mutant (-/-) displaying conical leg buds and extreme visceral exposure; this embryo was collected at E7.

Coloboma
The Coloboma (co) trait was originally named for the lack of tissue around the eye; its inheritance was shown to be sex-linked recessive, affecting only females (the heterogametic sex in birds, ZW), and embryonic lethal. The phenotype includes craniofacial defects, bilateral facial coloboma, along with absent or greatly reduced extremities due to disruption in cartilage formation (Abbott et al. 1970). Similarly, human ocular coloboma can occur as a multisystem syndrome involving other eye, craniofacial, and skeletal defects and genitourinary anomalies (Gregory-Evans et al. 2004). In this study, we mapped the coloboma trait to GGA Zp with a CR max of 1.49 Mb. On investigation of the RefSeq genes (5 chicken; 27 nonchicken) in the region, we suggest ADAMTS10 and SLC30A5 as high priority candidates (Table 4).
The specific physiological function of ADAMTS10 (a disintegrin-like and metallopeptidase domain with thrombospondin type 1 motif, 10) remains unknown (Kutz et al. 2008); however, it has been shown to be required for growth, both pre- and postbirth, with functions in the development of the eyes, skin, heart, and skeleton (Dagoneau et al. 2004; Sommerville et al. 2004). Interestingly, several missense mutations found within the metalloprotease domain near the 3' end of this gene have been linked to the human autosomal recessive Weill-Marchesani syndrome (Dagoneau et al. 2004; Kutz et al. 2008). This syndrome is a connective tissue disorder with physical characteristics such as short stature, brachydactyly, eye abnormalities, joint stiffness, and heart defects (Faivre et al. 2003 and references therein).
The SLC30A5 gene encodes the ZNT5 protein, which facilitates zinc efflux from the cytoplasm to Golgi-enriched vesicles (Kambe et al. 2002; Palmiter and Huang 2004). ZNT5-null mice display poor growth, muscle weakness, osteopenia, and male-specific death due to bradyarrhythmia (Inoue et al. 2002). Additional Slc30a5 knockout abnormalities are found in adipocytes, skeletal myocytes, osteoblasts, and cardiomyocytes of the conduction system, suggesting that ZNT5 plays an important role in the development or maintenance of mesenchyme-related cells, as all of these cell types are derived from mesenchymal stem cells (mesodermal origin) (Inoue et al. 2002).
Table 4 footnotes:
b Chromosomal arms (p vs. q) are indicated if the centromere has been positioned in the sequence assembly.
c All sequence positions are based on the May 2006 chicken (Gallus gallus) v2.1 (galGal3) assembly. Coordinates are listed in base pairs, and sizes are listed in megabase pairs.
d The CR min is the minimum chromosomal region, which is still linked to the mutation, where the causative element could reside.
e The CR max is the maximum chromosomal region identified through SNP analysis or by adding the recombination rate (RR) estimated for the genomic region to the CR min identified (see Materials and Methods).
f The number of RefSeq genes (within CR max) was determined by examining the number of chicken RefSeq genes in NCBI, Ensembl, and the UCSC genome browser.
g Priority candidate genes identified for the corresponding mutant-congenic lines are listed below (see details in Results and Discussion: UCD developmental mutant-congenic chicken lines).
h The RR at this position of GGA Z is 2 cM/Mb (Groenen et al. 2009). In order to best estimate the CR max, 500 kb was subtracted from the 5′ sequence coordinate of the CR min.
i The RR at this position of GGA 24 is 8.7 cM/Mb (Delany et al. 2007; average microchromosome RR); 115 kb was subtracted and added to the 5′ and 3′ CR min sequence coordinates, respectively.
j The RR at this position of GGA 5 is 2 cM/Mb (Groenen et al. 2009); 500 kb was subtracted from the 5′ sequence coordinate of the CR min.
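The flank sizes in footnotes h to j are consistent with padding the CR min by roughly 1 cM of genetic distance (500 kb at 2 cM/Mb; about 115 kb at 8.7 cM/Mb). The following Python sketch illustrates that arithmetic only. The 1 cM target is an assumption inferred from the footnote values (the Materials and Methods define the actual rule), and the function names and example coordinates are hypothetical.

```python
def flank_kb(rr_cm_per_mb, target_cm=1.0):
    """Physical distance (kb) that spans target_cm at a given recombination rate.

    target_cm=1.0 is an assumption inferred from the Table 4 footnotes.
    """
    if rr_cm_per_mb <= 0:
        raise ValueError("recombination rate must be positive")
    return target_cm / rr_cm_per_mb * 1000.0


def extend_cr_min(start_bp, end_bp, rr_cm_per_mb, five_prime=True, three_prime=False):
    """Return (start, end) of a CR max obtained by padding the CR min."""
    pad_bp = round(flank_kb(rr_cm_per_mb) * 1000)
    new_start = max(0, start_bp - pad_bp) if five_prime else start_bp
    new_end = end_bp + pad_bp if three_prime else end_bp
    return new_start, new_end


print(flank_kb(2.0))         # rate cited for GGA Z and GGA 5
print(round(flank_kb(8.7)))  # average microchromosome rate cited for GGA 24
# Hypothetical CR min coordinates, for illustration only
print(extend_cr_min(10_000_000, 11_000_000, 2.0))
```

Padding only the 5′ side (footnotes h and j) or both sides (footnote i) is selected with the `five_prime` and `three_prime` flags.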

Diplopodia
[...] showed all mutations to be independent (dp-5 was not assessed) (Landauer 1956; Somes 1990b and references therein; Taylor 1972). Morphological defects of the wings and legs can be designated in the following order of severity: dp-3, dp-1, dp-4, dp-2 (Taylor 1972 and references therein). The physical characteristics of Dp-1.003 mutants are thought to be due to a disruption in the genes involved in mesodermal tissue function; the phenotype includes extra preaxial digits, dwarfism, and a mild cleft palate (Abbott 1959, 1967). For the diplopodia-1 trait, we identified a 0.72 Mb CR max on the q arm of GGA 1 that includes 4 chicken RefSeq and 16 nonchicken RefSeq genes (Table 4). We propose the MRE11A gene as a priority candidate for dp-1 because its disruption results in defective sister chromatid cohesion (Barber et al. 2008). Genes with similar functions (e.g., NIPBL and SMC1L1) have been associated with Cornelia de Lange syndrome (Krantz et al. 2004; Tonkin et al. 2004; Kaur et al. 2005; Musio et al. 2006). This disorder is characterized by facial dysmorphisms, upper limb abnormalities, cognitive retardation, growth delay (Jackson et al. 1993 and references therein), and occasional submucous cleft palate (Kline et al. 2007).
Although the dp-3 mutation is phenotypically similar to dp-1, its CR max was mapped to a 0.95 Mb region on GGA 24 (Table 4), which corroborates the complementation test results as well as the autosomal recessive mode of inheritance. Although the sequence assembly of GGA 24 is incomplete, with numerous gaps and noncontiguous regions, a total of 6 chicken RefSeq genes were identified within the CR max. Of these genes, we identified MLL1 as a priority candidate. MLL1 (previously known as MLL) encodes a DNA-binding protein that methylates histone H3, thereby regulating expression of target genes, notably the HOX genes (Milne et al. 2002). Furthermore, it acts as a maintenance factor for development of multiple tissues during embryogenesis (Yu et al. 1995). Mouse knockouts indicate that Mll1 is necessary for proper mammalian segment identity through positive regulation of Hox gene expression (Yu et al. 1995). The affected anatomical systems observed in the homozygous and heterozygous knockouts include skeletal, craniofacial, and limb/digit/tail abnormalities (MGI: 96995). Interestingly, this gene was associated with the luxoid mutation in the mouse (Pravenec et al. 1997); these polydactylous mice exhibit reduction or absence of tibiae, torsion of the fibulae, alopecia, and semilethality (Strong and Hardy 1956).
The dp-4 mutation differs from dp-1 and dp-3 in exhibiting elongation and duplication of the fibula (Pisenti JM, personal communication), a phenotype reported in human (Karchinov 1973; Jones et al. 1978; Narang et al. 1982). Limb bud gene expression results from dp-1 and dp-4 mutants suggest either that the genes responsible for the mutations modify the interaction of the HOXD genes with their interacting factors (i.e., FGF and BMP gene families) or that the causative gene resides downstream of SHH, thereby altering polarization of the developing embryo (Rodriguez et al. 1996). Our study mapped the diplopodia-4 trait to GGA Zp, with a CR max of 2.67 Mb; unfortunately, none of the RefSeq genes (11 chicken; 30+ nonchicken) (Table 4) are known to interact with or alter expression of SHH, HOXD, or the FGF and BMP gene families as discussed by Rodriguez et al. (1996). We therefore suggest NIPBL (Nipped-B homolog) as a priority candidate gene for dp-4 because it is a key component required for loading the cohesin complex onto chromatids (Jahnke et al. 2008) and plays a role in developmental regulation (Rollins et al. 1999). As discussed above in regard to dp-1, mutations in NIPBL are associated with Cornelia de Lange syndrome in human (Jackson et al. 1993; Tonkin et al. 2004; Kline et al. 2007).

Eudiplopodia
Rosenblatt et al. (1959) first described the eu mutation nearly 50 years ago as an embryonic lethal, autosomal recessive polydactylous mutation. Five to nine additional digits protrude from the leg, positioned anywhere from the hock to the foot, with each supernumerary digit homologous to the normal toe lying ventral to it (Fraser and Abbott 1971). It was hypothesized that additional digits resulted from formation of a secondary apical ectodermal ridge (AER) (Goetinck 1964; Fraser and Abbott 1971). D'Souza et al. (1998) documented a developmental malformation in human that possesses a similar phenotype to the chicken eu mutation. The affected individual exhibited a foot with a total of 9 toes, 2 of which had conical nails positioned dorsal to the normal toes; it was hypothesized that the phenotype was due to cellular mosaicism or damage during formation of the AER on the dorsal surface of the foot as opposed to an inherited genetic mutation (D'Souza et al. 1998). A 1.07 Mb CR max was mapped to GGA 5q for the eu mutation; the region currently includes 7 chicken RefSeq genes and over 30 nonchicken RefSeq genes (Table 4). Of the elements within this region, we suggest MGA as a priority candidate. MGA encodes a protein that contains a T-box DNA-binding motif. T-box proteins are transcription factors that control developmental pathways including specification of limb identity and regulation of limb development (Wilson and Conlon 2002). Expression of MGA is found within the limb bud as well as regions patterned by mesoderm and mesodermal-epithelial interactions; it is hypothesized that MGA participates in regulating mesoderm induction or differentiation (Hurlin et al. 1999).

Polydactyly
Polydactyly is probably the most common and well-known developmental malformation, affecting numerous vertebrates including humans, mice, cats, dogs, and chickens. Polydactyly in chicken is an autosomal dominant mutation resulting in the development of an additional preaxial digit (Table 1) (Warren 1944; Pitel et al. 2000; Dorshorst et al. 2010). Human polydactyly, inherited in both an autosomal dominant and recessive fashion, has an incidence of approximately 1 in 500 births with varying rates between sexes and among races (Woolf and Myrianthopoulos 1973). Species such as the pig and chicken have both dominant and recessive forms of polydactyly (Somes 1990a; Gorbach et al. 2010). Genome-wide linkage studies of polydactyly in both human and mouse implicated both the SHH/Shh and LMBR1/Lmbr1 genes. Herein, we find that the chicken orthologs of these genes are encoded on GGA 2p and lie within the CR identified (CR max: 6.34 Mb; CR min: 6.01 Mb) by the 60K array (Table 4). A sequence element within intron 5 of Lmbr1, called the zone of polarizing activity (ZPA) regulatory sequence (ZRS), drives expression of Shh in the anterior limb bud (Sharpe et al. 1999; Blanc et al. 2002; Lettice et al. 2003; Maas and Fallon 2005; Sagai et al. 2005), and mutations in this element cause the polydactylous phenotype. In chicken, the 794 bp ZRS region is located approximately 328 kb upstream of the SHH gene (GGA2: 8,024,909-8,034,717) (Dorshorst et al. 2010). Recent work by Dorshorst et al. (2010) describes further investigation of the chicken ZRS. The 794 bp ZRS from 5 polydactylous chicken breeds was sequenced and analyzed for variation among the breeds. A single-point mutation (SNP ss161109890) within a transcription factor-binding site involved in limb morphogenesis was found to be conserved in 2 breeds, whereas the other 3 had neither this polymorphism nor any other mutation within the 794 bp ZRS.
This research suggests that the chicken polydactyl phenotype can result from at least 2 causal mutations, which is consistent with observations in human and other vertebrates (Dorshorst et al. 2010 and references therein). Considering the research conducted to date, we suggest ZRS as the causative element for the chicken Po mutation. Sequencing of the Po.003 ZRS will be necessary to identify whether there is a single-point mutation (SNP ss161109890) within the ZRS or if the phenotype is due to another allele.
Notably, when the initial 3K array results (Supplementary Table 1) are compared with those of the 60K array (Table 4), the expanded 60K array decreased the CR min by only 142 kb. During the time between the 2 SNP analyses, the Po.003 genetic line underwent 4 additional years of sib matings. With an expected recombination rate of 1.5-3.5 cM/Mb (Groenen et al. 2009; Elferink et al. 2010) for the 6.34 Mb CR max on GGA 2 (Table 4), one could expect at least 9 recombination events to occur, thereby reducing the CR max, the CR min, and the number of polymorphic SNPs (102) associated with this mutation. Interestingly, Lodder et al. (2009) showed an inversion in a patient with postaxial polysyndactyly. Although inversions are not seen in all polydactylous individuals, there have been documented cases. We are currently investigating the possibility that an inversion is present in the Po.003 line, thereby maintaining the large CR max, as well as the presence of the single-point mutation in the ZRS identified by Dorshorst et al. (2010).
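The genetic length implied by the figures in the preceding paragraph can be checked with a short calculation. The Python sketch below multiplies the 6.34 Mb CR max by the lower and upper recombination rates quoted above and converts the result to the expected number of crossovers per meiosis (1 Morgan = 100 cM). The function and variable names are hypothetical. The number of events accumulated over several generations of sib mating also depends on the number of meioses sampled, which is not given here and is left out of the sketch.

```python
def genetic_length_cm(size_mb, rr_cm_per_mb):
    """Genetic length (cM) of a region from its physical size and recombination rate."""
    return size_mb * rr_cm_per_mb


CR_MAX_MB = 6.34            # Po CR max on GGA 2 (Table 4)
RR_LOW, RR_HIGH = 1.5, 3.5  # cM/Mb range quoted in the text

low_cm = genetic_length_cm(CR_MAX_MB, RR_LOW)
high_cm = genetic_length_cm(CR_MAX_MB, RR_HIGH)

# One Morgan (100 cM) corresponds to one expected crossover per meiosis
low_per_meiosis = low_cm / 100.0
high_per_meiosis = high_cm / 100.0

print(f"Genetic length: {low_cm:.2f}-{high_cm:.2f} cM")
print(f"Expected crossovers per meiosis: {low_per_meiosis:.3f}-{high_per_meiosis:.3f}")
```

At the lower rate, the region spans about 9.5 cM.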

Limbless
Limbless (ll) is an autosomal recessive embryonic lethal mutation that results in absent forelimbs (wings) and hindlimbs (legs), as well as craniofacial defects (Waters and Bywaters 1943; Zwilling 1956a, 1956b, 1956c). Amelia, the complete absence of limbs in human (Michaud et al. 1995; Pierri et al. 2000), is a rare condition with an incidence of 1.5 per 100,000 live births and 7.9 per 10,000 stillbirths (Froster-Iskenius and Baird 1990; Evans et al. 1994; Krahn et al. 2005). This condition can present as an isolated defect or in association with craniofacial, nervous system, pulmonary, skeletal, or urogenital anomalies (Michaud et al. 1995; Pierri et al. 2000; Niemann et al. 2004). Similar to ll, amelia inheritance appears to be autosomal recessive (Michaud et al. 1995). Interestingly, Niemann et al. (2004) mapped the tetra-amelia (absence of all limbs) locus to a region on human chromosome 17q21, which is syntenic to the region of interest for the chicken ll mutation (Table 4).
Both the 3K and 60K array analyses identified GGA 2p as the chromosome harboring the ll mutation (Table 4, Supplementary Table 1). Fourteen chicken RefSeq genes and more than 60 nonchicken RefSeq genes, including an SP (specificity protein) family gene cluster, have been identified within the 2.62 Mb CR max. Of these, we suggest SP8 and SP9 as priority candidate genes, as knockouts of these genes result in severe truncation of the limbs (Bell et al. 2003). The formation of the vertebrate limb is regulated by the AER, with secretion of fibroblast growth factor 10 (FGF-10) initiating the formation of the AER in the limb field. The AER in turn secretes FGF-8, and continuous release of FGF-8 further stimulates the formation of the limb during development (Lewandoski et al. 2000). Bell et al. (2003) showed that SP8 has a role in maintaining FGF-8 expression in mice. Likewise, gene expression studies indicate that Sp9 is also involved in the Fgf-8/-10 pathway as downregulation of Sp9 is positively correlated with FGF-8 expression (Bell et al. 2003). Thus, both SP8 and SP9 mediate the induction and maintenance of FGF-8 expression in the AER, thereby allowing for proper limb outgrowth (Bell et al. 2003; Kawakami et al. 2004).

Stumpy
The stumpy (stu) mutation was first reported in 1966 (Somes 1990b and references therein). Homozygous (stu/stu) embryos typically die at embryonic day (E) 6-7. At this stage, the limb buds in the homozygous mutant are conically shaped stumps, the characteristic from which the mutation gets its name. In addition to a size reduction of the eye and poor vascularization, hemorrhages in the brain, mesonephros, liver, and spleen have also been reported (Somes 1990b). Transplantation studies with mutated limb mesenchyme cells resulted in abnormal phenotypic expression (Somes 1990b and references therein). We have mapped stu to the q arm of GGA 10 (Table 4). Within the 1.80 Mb CR max, 14 chicken RefSeq and 37 nonchicken RefSeq genes are indicated; of these, we suggest MESDC2 (mesoderm development candidate 2) as a priority candidate. Although little is known about MESDC2, it has been shown to modulate WNT (Wingless/Int) signaling through regulation of low-density lipoprotein receptors (Li et al. 2006; Koduri and Blacklow 2007). The WNT proteins are known to play important roles in both embryonic development and adult metabolic homeostasis.

Summary
The MHC-congenic lines present a unique opportunity to directly examine the influence of the chicken MHC in any number of studies, ranging from the mechanisms of immune and disease resistance to mate preference and fertility rates to its effects on animal health and production traits (e.g., growth rate, feed efficiency, egg production, body weight, and embryonic mortality). Overall, SNP analysis of the UCD MHC congenics allowed us to predict inbreeding coefficients, which corroborated those expected, and to identify genomic heterozygosity and SNP variability, thereby allowing us to investigate congenic purity and to propose genetic mechanisms for the variation observed. We also showed that there was an overall increase in genomic homogeneity over time. Additionally, this analysis has enabled us to identify a unique GGA 16 SNP potentially linked to strong MDV-resistant MHC B congenic lines, although no specific GGA 16 or genome profile was consistent among all resistant and/or susceptible (to MDV or RSV) lines.
Naturally occurring mutations in model systems have a rich tradition of providing opportunities to study the molecular and cellular mechanisms that control normal development in vertebrate organisms. SNP analyses were undertaken in order to map the CRs for 10 single-gene developmental mutations in the chicken. This study was successful in that 8 of the 10 mutations were found to be associated with a specific region on a single chromosome thereby allowing us to identify priority candidate genes. The SNP analysis therefore provides an essential step in further promoting these lines as models for vertebrate developmental biology. Fine-mapping analysis coupled with next-generation sequencing technologies (e.g., genomic enrichment sequencing technology and resequencing) will further decrease the CR size for each mutant (Robb et al. 2010; Webb AE, Gitter CL, Kaya M, Cheng HH, Delany ME, in preparation) and ultimately allow for the confirmation or rejection of the proposed candidate genes. The chicken embryo is remarkable for the wealth of approaches available to study gene function, tissue interactions, and developmental pathways. Examples of such approaches include whole-embryo in situ hybridization, tissue transplantation, and other in ovo manipulations using RNA interference and morpholino techniques. Continued research of these chicken mutations and priority candidate genes utilizing such approaches will allow for the further elucidation of mechanisms important to amniote development.