Chromosomes carrying meiotic avoidance loci in three apomictic eudicot Hieracium subgenus Pilosella species share structural features with two monocot apomicts

The LOSS OF APOMEIOSIS (LOA) locus is one of two dominant loci known to control apomixis in the eudicot H. praealtum. LOA stimulates differentiation of somatic aposporous initial (AI) cells after the initiation of meiosis in ovules. AI cells undergo nuclear proliferation close to sexual megaspores, forming unreduced aposporous embryo sacs, and the sexual program ceases. LOA-linked genetic markers were used to isolate 1.2 Mb of LOA-associated DNAs from H. praealtum. Physical mapping defined the genomic region essential for LOA function between two markers, flanking 400 kb of identified sequence and central unknown sequences. Cytogenetic and sequence analyses revealed that the LOA locus is located on a single chromosome near the tip of the long arm and surrounded by extensive, abundant complex repeat and transposon sequences. Chromosomal features and LOA-linked markers are conserved in aposporous H. caespitosum and H. piloselloides but absent in sexual H. pilosella. Their absence in apomictic H. aurantiacum suggests that meiotic avoidance may have evolved independently in aposporous subgenus Hieracium species. The structure of the hemizygous chromosomal region containing the LOA locus in the three Hieracium subgenus Pilosella species resembles that of the hemizygous apospory-specific genomic regions in monocot Pennisetum squamulatum and Cenchrus ciliaris. Analyses of partial DNA sequences at these loci show no obvious conservation, indicating that they are unlikely to share a common ancestral origin. This suggests convergent evolution of repeat-rich hemizygous chromosomal regions containing apospory loci in these monocot and eudicot species, which may be required for function and maintenance of the trait.

[...] a (sporophytic) cell [...] initial (AI) cell formation to [...] Embryo and endosperm formation are both fertilization-independent (autonomous) in apomictic Hieracium subgenus Pilosella, whereas in Pennisetum fertilization is required for endosperm formation.
A single dominant locus termed the Apospory Specific Genomic Region (ASGR) is required for functional apomixis in Pennisetum [...] of conservation of the identified LOA locus and its repeat-associated chromosomal structure in Hieracium subgenus Pilosella [...] further concerning the evolution of apospory in the [...]


Introduction
Asexual seed formation, or apomixis, occurs in both monocot and eudicot plants and has evolved in more than 40 plant genera. Approximately 75% of apomicts exist in three families, the Poaceae, Asteraceae and Rosaceae. Apomixis bypasses meiosis during female gametophyte formation. This contrasts with sexual female gametophyte development, which requires meiosis of a megaspore mother cell (MMC) in the ovule (megasporogenesis) followed by nuclear proliferation of typically one of the meiotic products (megagametogenesis). Female gametophytes formed by the apomictic route of meiotic avoidance, or apomeiosis, are also termed unreduced. Seed development in sexual species requires one sperm cell to fuse with the egg cell in the female gametophyte to initiate embryogenesis and another sperm cell to fuse with the central cell for endosperm formation. Egg cells that differentiate in unreduced gametophytes of apomicts do not require fertilization to develop into an embryo, and endosperm formation may or may not require fertilization. Therefore, seedling progeny derived from apomictic reproduction retain a maternal genotype (Bicknell and Koltunow, 2004; Tucker and Koltunow, 2009).
Apomixis is controlled by only a few dominant genetic loci in the currently studied monocot and eudicot species, but genes controlling these events have not been isolated (Ozias-Akins and van Dijk, 2007).
Apomictic plants of different evolutionary history have developed similar mechanisms of meiotic avoidance in order to form unreduced gametophytes. The two common modes observed are termed diplospory and apospory. Monocots in Poaceae such as Tripsacum and eudicot members of the Asteraceae including Taraxacum and Erigeron undergo diplospory, where the MMC or a cell that has aborted meiosis undergoes nuclear proliferation to form an unreduced embryo sac (Tucker and Koltunow, [...]

[...] markers in H. praealtum. The chromosomal location of the LOA locus was determined by fluorescent in situ hybridization (FISH) using identified BACs and locus-specific sequences as probes. LOA is located on a single chromosome near the distal tip of the long arm and is surrounded by repetitive sequences in H. praealtum and in two other Hieracium subgenus Pilosella species. Structural features of the hemizygous chromosomal region containing the LOA locus in these eudicot Hieracium species resemble those found in two other aposporous monocot species, suggesting that chromosomal structure might be functionally relevant for the induction and/or maintenance of apospory in these plants.

Identification of LOA-associated genomic sequences and specific markers
Four sequence characterized amplified region (SCAR) markers, LOA 300, LOA 267, LOA 275 and LOA 219, were previously found to be located in the central region of the LOA locus in H. praealtum R35 (Catanach et al., 2006; Koltunow et al., 2011b).
These SCAR markers are also present in apomictic H. caespitosum (C36) and H. piloselloides (D36) but they are absent in sexual H. pilosella (P36) and also in two other apomictic H. aurantiacum accessions (A35, A36; Table I; Koltunow et al., 2011b). All of the characterized H. praealtum deletion mutants that have lost LOA function lack these four SCAR markers except mutant 134, which is thought to contain a small deletion or translocation as a result of gamma irradiation (Figure S1; Koltunow et al., 2011b).
These four SCAR markers were used to screen a H. praealtum (R35) BAC library, in order to isolate genomic sequences associated with the LOA locus. Individual BACs containing the SCAR markers were extended by chromosome walking with the aim of obtaining the entire genomic sequence linking the four markers in the LOA locus. In this study, 28 BACs covering 1.2 Mb of sequences were identified and they were assembled into three independent DNA contigs A, B and C that provide partial coverage of sequences spanning the four SCAR markers (Figure 1). LOA 300 and LOA 267 are physically linked in the largest contig A, which comprises ~650 kb, while contig B and C cover approximately 330 and 270 kb of genomic sequence, respectively (Figure 1).

Thirteen new SCAR markers linked to the LOA locus were developed from BAC end sequences, bringing the total number of LOA-linked SCAR markers to 17 (Figure 1).
These LOA SCAR markers are based largely on repetitive sequences (Figure S2A). They detect sequences in H. praealtum (R35) and in the deletion mutant 134 but they are absent in the remaining deletion mutants defective in LOA function (Koltunow et al., 2011b). Therefore, the majority of characterized apospory mutants contain physically large deletions that lack the three identified contig sequences (Figure S1).

[...] Figures 2B to 2D). Furthermore, these probes painted nearly half of the long arm of the elongated chromosome even though the DNA inserts in the three BACs ranged from 110-185 kb (Figures 2B to 2H). This suggested that sequences present within these BACs are repeated over a very large region on the long arm of the chromosome. We estimated that the size of the region hybridized by the LOA267.14 [...]

In order to examine the nature of the sequences present in contigs A and B that might be responsible for the extensive BAC probe hybridization on the long arm of the elongated chromosome, 454 pyrosequencing was used to determine the sequence of a pool of 10 BACs from contig A, and another pool of 4 BACs from contig B (Figure 1).
The Hieracium genome has not been sequenced and the absence of a reference genome coupled with the repetitive sequence nature of the BACs hampered assembly of DNA sequences. A total of 379 non-redundant contigs were assembled from the contig A BAC pool and 241 non-redundant contigs from the contig B BAC pool. This resulted in a total of 620 non-redundant contigs and 760,082 bp of sequence coverage from the LOA locus (Table II).
Analyses using comparative genomics of open-reading frames (ORFs) and ab initio gene prediction programs indicated that the sequenced region contained few ORFs and a large number of partially conserved ORFs, particularly transposon-related proteins.
The sequenced genomic region of the LOA locus has a low GC content (38%), is AT-rich and contains simple repeat and low-complexity DNAs. Transposon prediction programs, RepeatMasker, Censor and TransposonPSI, identified various transposon-related sequences, revealing that class I retrotransposons Ty1-copia and Ty3-gypsy (Kumar and Bennetzen, 1999) were particularly abundant in the LOA locus (Table II). In total, 120 transposon sequences were annotated, covering 14.9% of the 760 kb sequenced region.
DNA clustering analysis identified 899 additional complex repeat fragments for which homologous sequences are present at least twice in the LOA locus and that are not simple or low complex DNAs or transposons. These complex repeat sequences cover 31.1% of the sequenced region (Table II). There does not appear to be a single sequence continuously repeated in the analysed BAC pool sequences. The complex repeats, transposons and simple/low complex sequences comprise 49.1% of the sequence coverage at the LOA locus and we conclude that a combination of these sequences accounts for the observed extensive BAC probe hybridization on the chromosome containing the LOA locus.

To establish the location of contigs A and B within the repetitive region and concurrently gain insight into the size of the intervening sequences (gap) between contigs A and B, we developed specific FISH probes for each contig. Potentially unique LOA-linked sequences derived from the 454 sequence contigs were selected and assessed for their suitability as FISH probes by genomic DNA blot analysis (Figure S2B). A mixture of these probes derived from two regions in contig A (A-mix) and a single region in contig B (B-mix) was used for FISH (Figure 2A). Multicolor FISH hybridization revealed that A-mix and B-mix probes hybridized as distinct but closely localized spots at the distal end of the long chromosome within the repetitive region labeled by the LOA267.14 BAC probe (Figures 2I to 2L and Figure S2C). Analysis of multiple chromosome samples indicated that the genomic region detected by the contig A-mix probes was located towards the distal tip of the chromosome, while the B-mix probes detected adjacent genomic sequences oriented towards the centromere (Figure 2L).

LOA contigs A and
Collectively, these results indicate that the identified contigs A and B associated with the LOA locus are closely located at the distal tip of a single long chromosome in H. praealtum. They are surrounded by a large array of complex repeats and transposons. The localization of the LOA locus linked to a single chromosome is also consistent with the predicted hemizygous genetic nature of the LOA locus (Catanach et al., 2006).

[...] and Pilosella 2 shown in Table I, based on their chloroplast trnT-trnL sequences (Koltunow et al., 2011b). Analyses using the 14 SCAR markers summarized in Table I revealed remarkable marker conservation in the two apomictic species H. caespitosum (C36) and H. piloselloides (D36 and D18). Sequencing of the PCR products obtained from five of the examined markers indicated in Table I confirmed the sequences to be greater than 98% identical to those in R35, supporting conservation of markers. These SCAR markers were absent in two accessions of sexual H. pilosella (P36 and P36(CR)) and in two accessions of apomictic H. aurantiacum (A35 and A36; Table I).

Conservation of SCAR markers and chromosomal features associated with the
FISH analyses using fluorescently labeled BAC LOA267.14 and the contig A and B-specific probes (A- and B-mix; Figure 2A) did not show significant hybridization to P36, A35 and A36 chromosomes (Figure S3; Table III). Therefore, the apomictic accessions of H. aurantiacum (A35 and A36) and sexual H. pilosella (P36) do not contain the same large block of repetitive sequences associated with the LOA locus in their genome. Furthermore, they do not contain sequences that can be reproducibly detected by the A and B contig specific probes in R35 (Figure S3).
By contrast, the LOA-linked repetitive sequences were also observed on the long arm of a single chromosome in H. caespitosum (C36) and in the two H. piloselloides accessions (D36 and D18) when BAC LOA267.14 was used as a FISH probe (Figure 3A).
The LOA-associated repetitive sequences were found on an elongated long chromosome in C36 like that observed in R35, but the chromosomes containing repetitive sequences in D36 and D18 were not significantly elongated relative to the others (Figure 3A). [...] Figure S4). It is also interesting to note that the size of the LOA-associated repeat region on the chromosomes of these plants [...] Figure 3A; white asterisks).
Therefore, hemizygous chromosome constitutions extend beyond that observed for the LOA locus. The non-reductional mitotic nature of apomictic reproduction is likely to maintain both aneuploidy and hemizygotic chromosome constitutions in subgenus Pilosella.

The chromosome containing LOA and surrounding repeats is not enriched for heterochromatin marks
Chromosomes containing extensive regions of repetitive DNA and retrotransposon-like sequences that were observed around the LOA locus (Figure 2 and [...] (Table III). Several antibodies specific for histone H3 methylation associated with euchromatin and transcriptional activation (H3K4me2 and H3K4me3), or heterochromatin formation and gene silencing (H3K9me2, H3K27me2, H3K27me3) were used. No significant differences were observed in global histone H3 methylation patterning on the chromosome containing LOA-associated repeats relative to the other chromosomes (Figure S5). We also investigated DNA methylation using anti-5-methylcytosine (anti-5mC) antibody. Figures 3C to 3E show that the distal half of the long arm of the long chromosome, where the LOA locus is located, did not exhibit increased DNA methylation relative to the other chromosomes. Taken together, the highly repetitive region around the LOA locus is not associated with a significantly higher degree of DNA methylation or unique post-translational histone H3 modification relative to other chromosomes. Thus, the long arm of this chromosome predominantly containing repetitive sequences and transposons is unlikely to be completely transcriptionally silent.
This analysis does not rule out the possibility that more localized DNA and histone modifications are playing a role in regulating the expression of genes at the LOA locus.

Delineation of the genomic region essential for LOA function
Next we examined which parts of the genomic region defined by the three BAC contigs ([...] The LOA-linked SCAR markers are not present in sexual H. pilosella P36 (Table I), thus F1 plants carrying these markers inherit them from the apomictic pollen parent.
We screened the 833 F1 progeny plants with the four SCAR markers LOA 300, LOA 267, [...] plants containing one or more of these markers were identified, indicating low transmission efficiency (15%) of LOA-linked sequences. This supports the previous observations of segregation distortion for transmission of the LOA locus (Catanach et al., 2006). The 125 F1 progeny containing LOA-linked SCAR markers were further investigated for the presence of nine additional SCAR markers at the LOA locus, and they were divided into 11 classes based on the combinations of markers they contained (Table IV). Sixty seven plants representative of these classes were examined cytologically for the presence of aposporous initial (AI) cells and aposporous embryo sacs. Linkage analysis of the SCAR markers in the F1 progeny confirmed the order of markers as shown in Figure 1, which was supported by an odds ratio of 10^9 over any other order of [...] However, we cannot exclude the possibility that recombination and loss of critical LOA sequences has also occurred in the genomic region yet to be identified.
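The transmission figure quoted above can be checked with a few lines of arithmetic. The sketch below recomputes the efficiency from the two counts given in the text (125 marker-positive plants out of 833 F1 progeny) and, purely as an illustration, compares the counts with a 1:1 expectation using a chi-square statistic. The 1:1 expectation is an assumption made here for a single-copy dominant locus, not a value stated in the text, and the function names are hypothetical. The odds ratio for marker order, read here as 10^9, is also converted to a log10 value.

```python
import math


def transmission_efficiency(n_with_marker, n_total):
    """Fraction of progeny that carry at least one LOA-linked marker."""
    return n_with_marker / n_total


def chi_square_one_to_one(n_with_marker, n_total):
    """Chi-square statistic (1 degree of freedom) against a 1:1 expectation.

    ASSUMPTION: the 1:1 expectation is illustrative only; the text reports
    segregation distortion but does not state an expected ratio.
    """
    expected = n_total / 2
    n_without = n_total - n_with_marker
    return ((n_with_marker - expected) ** 2 + (n_without - expected) ** 2) / expected


# Counts taken from the text: 125 of 833 F1 plants carried LOA-linked markers.
efficiency = transmission_efficiency(125, 833)
chi2 = chi_square_one_to_one(125, 833)

# Odds ratio supporting the reported marker order, expressed on a log10 scale.
log10_odds = math.log10(1e9)

print(f"transmission efficiency: {efficiency:.1%}")
print(f"chi-square vs 1:1 (assumed): {chi2:.1f}")
print(f"log10 odds for marker order: {log10_odds:.0f}")
```

A statistic far above the 5% critical value for one degree of freedom (3.84) would be in line with the segregation distortion described in the text, but only under the assumed 1:1 expectation.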
We considered that the aneuploid nature of R35 may have contributed in part to the low transmission of the LOA and associated markers. Therefore, we examined 671 F1 progeny derived from a cross between sexual H. pilosella P36 and apomictic H. caespitosum C36, where both parents are tetraploid. We also found low transmission efficiency (40%) of the LOA-linked markers in this F1 progeny. Surprisingly, transmission of markers between 14-T7 and 21-T7 linked with the AI cell formation phenotype did not occur in this cross as frequently as in the previous P36 x R35 cross (Table S1). Consistent with this, the progeny plants rarely developed AI cells and aposporous embryo sacs. Analyses of the progeny of the P36 x C36 cross support conclusions from the P36 x R35 cross that the genomic region between markers 9-HR and LOA 219 is not essential for the initiation of apomixis (Classes 1 to 7; Table S1).
Given the poor transmission of markers in the P36 x C36 cross, it is not possible to confirm if the order of SCAR markers in H. caespitosum is the same as that in H.
In summary, analyses of the mapping populations indicate that the genomic region between marker 14-T7 on contig A and marker 9-HR on contig B, containing 400 kb of isolated genomic sequences flanking a region yet to be identified, appears to be sufficient for the initiation of apomixis in H. praealtum (Figure 1).

Comparison of partial genomic sequences from eudicot LOA and monocot ASGR loci
The ASGR in both aposporous Pennisetum squamulatum and Cenchrus ciliaris is located on a single chromosome that also contains transposons and repeated sequences (Akiyama et al., 2004; Akiyama et al., 2005). Genomic sequences associated with the Hieracium LOA locus and ASGR of both Pennisetum and Cenchrus were compared using BlastN in order to investigate sequence conservation, and no significant similarity was found except for some simple repeat sequences. Next we compared the LOA and ASGR associated sequences using tBlastX, which enabled a comparison of 6-frame translated sequences, and matches to the putative genes currently identified at the ASGR were not found (Table S2; Conner et al., 2008).
Transposons of the Ty3-gypsy and Ty1-copia type were found at both LOA and ASGR loci with Ty3-gypsy-like retrotransposons being most abundant at the LOA locus, and Ty1-copia-like sequences being most abundant in the ASGR (Table II). Ty1-copia-like retrotransposons are ubiquitous in plants. However, they are considerably diverse at the DNA sequence level, and multiple well-supported phylogenetic lineages have been identified in plants (Voytas et al., 1992; Kumar and Bennetzen, 1999). The retrotransposon sequences associated with the ASGR and the LOA locus did not show significant DNA conservation, indicating that they had originated from different ancestral sequences. Clustering analyses also identified 421 and 404 complex repeat fragments repeated at least twice in the Pennisetum and in the Cenchrus ASGR-associated contigs, covering 16.8 and 28% of the sequenced regions, respectively (Table II). There was no sequence conservation in the complex repeats found at the ASGR and the LOA locus.
Only partial DNA sequences have been obtained from the Hieracium LOA locus and the Pennisetum and Cenchrus ASGR and as the current sequenced regions do not contain many genes, we are unable to make significant conclusions about candidate apomixis genes. However, we can conclude that the repetitive and transposon sequences present on the hemizygous chromosomal region containing these loci appear to have evolved independently. Therefore, this provides an example of convergent evolution of chromosomal structures on hemizygous chromosomal regions containing apospory loci in monocot and eudicot species.

Discussion
The repeat-rich chromosomal structure containing the LOA locus is not conserved in all aposporous Hieracium subgenus Pilosella species
[...] proliferation around meiosis to form aposporous embryo sacs. Interestingly, in both accessions of H. aurantiacum multiple embryo sacs amalgamate during their expansion towards the sexually programmed cells. As a consequence, the sexual pathway ceases and more than one aposporous embryo sac is rarely observed in mature H. aurantiacum ovules (Koltunow et al., 1998, 2000, 2011b).

Roles of the repeat sequences at apospory loci and models for LOA function
Akiyama et al. (2004) [...] elements present at the LOA region must be participating in recombination during meiosis.
Using FISH to track the LOA locus through the events of microsporogenesis may provide further insight into the behaviour of the hemizygous chromosomal region during meiosis.
Although genomic regions containing loci with significant transposons and repeats are thought to be highly heterochromatic, transcription from these chromosomal [...] the LOA locus cannot be excluded. They may directly or indirectly induce gene silencing, influence epigenetic marks and chromatin structure, or alter transcript levels (Grewal and Moazed, 2003; Martienssen et al., 2004; Arteaga-Vazquez and Chandler, 2010).
Isolation of the DNA sequences between the LOA-linked markers 14-T7 and 9-HR in the LOA locus and the identification of genes responsible for apomixis initiation in H. praealtum are our current priorities. The deletion mutants defective in LOA function provide a means to test candidate genes to examine if they restore apospory.

Plant materials
Eight Hieracium subgenus Pilosella accessions were used in this study. These included tetraploid (4X=2n=36) [...] Plant growth conditions and reproductive features of these plants are described in Koltunow et al. (1998; 2011b).

Identification of genomic sequences associated with the LOA locus
The identification of BACs associated with the LOA locus was initiated using four SCAR markers (LOA 300, LOA 267, LOA 275 and LOA 219) that were central to the LOA locus from prior analyses (Catanach et al., 2006; Koltunow et al., 2011b). A H. praealtum (R35) BAC library was screened using a combination of PCR and BAC filter hybridization as follows. First, super pools of BAC DNAs were screened by PCR with SCAR markers and individual BAC DNA pools were identified. Then, BAC filters containing the identified BAC pools were hybridized with DNA probes to identify candidate BAC clones. DNA probes were generated from the SCAR marker sequences and labeled with [α-32P]-dCTP. Individual BAC clone candidates were obtained from the [...] sequences were identical to the original SCAR markers. An overlap of BAC contigs was confirmed by BAC fingerprint patterns using restriction enzyme digestion to classify association with a BAC contig group. BAC end sequences were determined, and Southern blot analysis was performed to assess the copy number and specificity of BAC end DNA sequences as described previously (Okada et al., 2000). New SCAR markers were developed from the BAC end sequences for further BAC library screening and physical mapping analysis. The development of SCAR markers from BAC end sequences is described in Supplementary Methods.

Sequencing of BAC pools and bioinformatic analyses
Partial DNA sequence of BACs associated with the LOA locus was determined by

Identification of repeats and transposons in Hieracium LOA contigs and
[...] RepeatMasker version 3.2.9 (http://www.repeatmasker.org). Resulting output files obtained from the programs were processed using GALAXY (Goecks et al., 2010). In addition, a DNA sequence clustering method was used to identify complex repeat sequences that were not identified by the above mentioned programs. Briefly, contig sequences were compared with each other by BlastN with low complexity filter and cutoff value of 1e-10. Sequence regions with significant similarity to other contigs were identified as candidate repeat clusters. Repeat cluster sequences, primarily containing transposons and simple/low complex repeat sequences annotated by TransposonPSI and RepeatMasker, were removed from further analysis. Then, repeat cluster sequences were used to mask the contig sequences, and masked regions were counted and their sequence lengths were measured using Excel (Microsoft). In total, 899 sequence regions containing 236,401 bp were masked in LOA associated contigs, 421 regions (91,289 bp) in Pennisetum and 404 regions (141,420 bp) in Cenchrus ASGR-associated contigs.
These masked regions have homologous sequences among the contigs, and thus represent repeated sequences that were designated as complex repeats in Table II.
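The counting step described above (number of masked regions and their total length) can be illustrated with a short script. This is a minimal sketch, not the procedure used in the study, which reports using Excel. It assumes, hypothetically, that masked bases are written as runs of the character N; the toy contigs and all function names are invented for illustration. The final lines re-derive the reported 31.1% complex repeat coverage from the two values given in the text (236,401 masked bp out of 760,082 bp).

```python
import re


def masked_runs(sequence, mask_char="N"):
    """Return half-open (start, end) intervals of consecutive mask characters."""
    pattern = re.escape(mask_char) + "+"
    return [(m.start(), m.end()) for m in re.finditer(pattern, sequence)]


def summarize_masking(contigs, mask_char="N"):
    """Count masked regions and masked bases over a dict of contig sequences.

    ASSUMPTION: masked positions are encoded as `mask_char`; genuine
    ambiguous bases written with the same character would also be counted.
    """
    n_regions = 0
    n_bases = 0
    for sequence in contigs.values():
        for start, end in masked_runs(sequence.upper(), mask_char):
            n_regions += 1
            n_bases += end - start
    return n_regions, n_bases


def percent(part, total):
    """Express part as a percentage of total."""
    return 100.0 * part / total


# Toy contigs, invented for illustration only.
toy_contigs = {"contig1": "ACGTNNNNACGTNN", "contig2": "NNNACGT"}
toy_regions, toy_bases = summarize_masking(toy_contigs)

# Values reported in the text for the LOA-associated contigs.
LOA_TOTAL_BP = 760_082
LOA_COMPLEX_REPEAT_BP = 236_401
loa_complex_repeat_pct = percent(LOA_COMPLEX_REPEAT_BP, LOA_TOTAL_BP)

print(toy_regions, toy_bases)
print(f"LOA complex repeat coverage: {loa_complex_repeat_pct:.1f}%")
```

The same `percent` helper could be applied to the Pennisetum and Cenchrus masked lengths once the total sequenced lengths for those contig sets are known; those totals are not given in this section.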
Sequences associated with the LOA locus (620 contigs) and the ASGR (1341 contigs) were compared by BlastN with an e-value cut-off of 1e-10 and tBlastX with a cut-off of 1e-5. Sequence regions with significant similarity between LOA and ASGR were annotated by BlastX with an e-value cut-off of 1e-5 against Arabidopsis thaliana peptide sequences (http://www.arabidopsis.org/) and Oryza sativa peptide sequences (http://rice.plantbiology.msu.edu/). The results are summarized in Table S2.
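Applying the e-value cut-offs named above amounts to filtering a hit table. The sketch below assumes, hypothetically, that the BLAST results were saved in the standard 12-column tabular layout (BLAST+ output format 6); the text does not state which output format was used. The example hits, contig names and the `filter_hits` helper are invented for illustration.

```python
import csv
import io

# Column order of the standard BLAST tabular output (outfmt 6).
BLAST6_FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def filter_hits(handle, max_evalue):
    """Return hits (as dicts) whose e-value is at or below max_evalue."""
    hits = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(BLAST6_FIELDS, row))
        if float(hit["evalue"]) <= max_evalue:
            hits.append(hit)
    return hits


# Two invented hits between hypothetical LOA and ASGR contigs.
example_table = (
    "LOA_c1\tASGR_c7\t91.2\t250\t22\t0\t1\t250\t10\t259\t3e-40\t180\n"
    "LOA_c2\tASGR_c9\t80.0\t60\t12\t0\t5\t64\t100\t159\t2e-07\t45.1\n"
)

strict_hits = filter_hits(io.StringIO(example_table), max_evalue=1e-10)
relaxed_hits = filter_hits(io.StringIO(example_table), max_evalue=1e-5)

print([hit["qseqid"] for hit in strict_hits])
print([hit["qseqid"] for hit in relaxed_hits])
```

With the stricter cut-off only the first invented hit is retained, while the more permissive cut-off keeps both, which shows how the choice of threshold changes what counts as significant similarity.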

Fluorescent in situ Hybridization (FISH) analysis
Vigorously growing root tips were collected from Hieracium plants aseptically grown on 0.5 x MS liquid medium (Koltunow et al., 1998). To accumulate metaphase cells, the root tips were pretreated in iced water for 17 h. The root tips were then fixed in 3:1 ethanol/acetic acid. Chromosome preparations for FISH were performed as described previously (Mukai et al., 1990). The LOA locus specific FISH probes were designed from the BAC end sequences and partial BAC sequences. Specificity of probes was examined by genomic Southern blot analysis and four DNA fragments (A-mix) and two fragments (B-mix) were chosen for FISH ( Figure S2B). Fluorescent probes were made and FISH was carried out essentially as described by Mukai et al. (1990). A detailed protocol is provided in Supplementary Methods.

Analysis of the segregation of LOA-linked markers and AI cell formation in F1 progeny
[...] Lander et al., 1987). Plants with or without LOA markers were phenotyped for AI cell and aposporous embryo sac formation at stages 4 and 10 of capitulum development (Koltunow et al., 1998).

Analysis of DNA and histone methylation by indirect immunofluorescence
The

Supplementary References
Supplementary Tables
Table S1. Analysis of the segregation of LOA-linked markers and AI cell formation in F1 progeny derived from a cross between sexual P36 (female) and apomict C36 (male).
a, Reproductive type, apomictic (Apo) or sexual (Sex). b, Presence (+) or absence (-) of the long chromosome shown in Figure 3. c, For details of the conservation of LOA SCAR markers, see Table I.
[...] aposporous initial cell; dms, degenerating megaspores; fm, functional meiotic megaspore. Bar = 20 µm.