Regulatory Hotspots Are Associated with Plant Gene Expression under Varying Soil Phosphorus Supply in Brassica rapa 1[W][OA]

Gene expression is a quantitative trait that can be mapped genetically in structured populations to identify expression quantitative trait loci (eQTL). Genes and regulatory networks underlying complex traits can subsequently be inferred. Using a recently released genome sequence, we have deﬁned cis- and trans-eQTL and their environmental response to low phosphorus (P) availability within a complex plant genome and found hotspots of trans-eQTL within the genome. Interval mapping, using P supply as a covariate, revealed 18,876 eQTL. trans-eQTL hotspots occurred on chromosomes A06 and A01 within Brassica rapa ; these were enriched with P metabolism-related Gene Ontology terms (A06) as well as chloroplast- and photosynthesis-related terms (A01). We have also attributed heritability components to measures of gene expression across environments, allowing the identiﬁcation of novel gene expression markers and gene expression changes associated with low P availability. Informative gene expression markers were used to map eQTL and P use efﬁciency-related QTL. Genes responsive to P supply had large environmental and heritable variance components. Regulatory loci and genes associated with P use efﬁciency identiﬁed through eQTL analysis are potential targets for further characterization and may have potential for crop improvement.

Expression quantitative trait loci (eQTL) are genetic regions associated with variation in gene expression among individuals (Kliebenstein, 2009). This variation can arise due to sequence polymorphisms in target genes, their cis-regulatory (proximal) or trans-regulatory (distal) regions, leading to phenotypic differences. Identifying variation in gene expression within a segregating mapping population is potentially of immense use, in particular within the plant and crop sciences . First, there is the opportunity to map the chromosomal positions of thousands of genes based on constitutive differences in expression between parents. Gene mapping at this scale is of great value, especially for organisms whose genomes have not been fully sequenced, and can contribute to efforts to integrate physical and genetic maps. Furthermore, in polyploids and/or where a genome contains ancestral duplications, a feature common among plant species, it will become feasible to map gene paralogues given sufficiently robust expression probes. Second, since a single gene could be associated with one or many eQTL, categorizing eQTL into cis-or trans-eQTL effects relative to the physical location of a gene can reveal regulatory networks controlling gene expression in the absence of a priori models. Third, by combining eQTL analysis with trait analysis, putative candidate genes associated with traits can be identified indirectly by correlation analysis. Candidate genes associated with important traits can then be-come targets for traditional genetic characterization .
Several eQTL studies have been reported in plants, including studies on Arabidopsis (Arabidopsis thaliana), barley (Hordeum vulgare), Populus, and maize (Zea mays; for review, see Kliebenstein, 2009;Druka et al., 2010). In studies where a physical map is available, eQTL have been categorized empirically as cis-and/or trans-eQTL based on assumptions about genetic linkage. The most significant eQTL tend to be cis-eQTL, which occur when a sequence polymorphism in a gene or promoter maps to quantitative variation in the gene transcript (Kliebenstein, 2009). This phenomenon, when combined with linkage, results in the standard "cis-diagonal" being seen when gene physical position is plotted against eQTL position (DeCook et al., 2006;Keurentjes et al., 2007;West et al., 2007;Kliebenstein, 2009). In contrast, sequence polymorphisms in regulatory genes such as transcription factors, which do not colocate with the physical position of the gene, can result in allelic variation at trans-eQTL. A cis-eQTL in a regulatory gene, therefore, can generate many trans-eQTL (Kliebenstein, 2009). Several trans-eQTL hotspots have been identified in plants, in which loci are associated with variation in the expression level of many hundreds or even thousands of genes, and these are thought to represent master regulatory loci with potentially pleiotropic effects (Kliebenstein, 2009). Typically, the statistical significance of associations between genetic loci and trans-eQTL are less than cis-eQTL, potentially due to the complex nature of regulatory gene networks.
The identification of cis-and trans-eQTL has already revealed information regarding the complex genetic architecture of variation in plant gene expression. However, relationships between classical phenotypic or trait QTL and variation in gene expression have not been widely reported, since most studies have been conducted under single experimental conditions. Notably, in barley, correlations between loci conferring resistance to rust pathogens and variation in gene expression have also been found using eQTL approaches (Druka et al., 2008;Chen et al., 2010). However, the identification of eQTL under abiotic stress has not yet been reported.
The aim of this study was to identify eQTL under altered soil phosphorus (P) supply. Brassica rapa was chosen since it is a close crop relative of Arabidopsis that retains a diploid genome structure, albeit including multiple paralogues. Recent hybridizations between the diploid A genome of B. rapa (vegetable and oil crops) and the C genome of Brassica oleracea (vegetable crops) have given rise to the widely grown amphidiploid Brassica napus (AC genome; canola/ oilseed rape/colza, rutabaga/swede; Allender and King, 2010). Furthermore, extensive genetic and genomic resources for B. rapa have now been assembled. These include a rapid-cycling mapping population developed from highly inbred lines of rapid cycling (IMB211) and yellow sarson (R500) B. rapa (Iniguez-Luy et al., 2009), more than 2M Brassica GenBank sequences, and a B. rapa genome sequence made available in 2011 alongside other reference Brassica genome and resequencing projects (Multinational Brassica Genome Project; www.brassica.info/resource/sequencing.php), as well as oligonucleotide microarrays (Trick et al., 2009a;Love et al., 2010).
Identifying and characterizing eQTL under altered P supply will increase our understanding of P use efficiency (PUE) in plants. An improved understanding of the genetics of PUE at the individual gene level may provide new opportunities for crop improvement based on candidate gene and marker identification at a scale that is much more rapid than one based on trait QTL approaches alone (Hammond et al., 2009). There are pressing economical and environmental pressures to reduce our reliance on inorganic P fertilizers, including the development of crops that grow well under conditions of low soil P and that utilize P fertilizer inputs most efficiently.

RESULTS AND DISCUSSION
The transcriptional profiling of B. rapa recombinant inbred lines (RILs) grown in different environments has enabled us to characterize the genetic architecture of plant adaptation to low P availability. To our knowledge, this is the first study of its kind in which the heritability (H 2 ) of global gene expression is estimated in response to a known abiotic stress (P availability) and the quantitative expression of genes is mapped across environments to identify eQTL associated with adaptation to P availability. To complete this study, we have utilized the BraIRRI mapping population (Iniguez-Luy et al., 2009). This population was derived from a cross between a highly inbred rapidcycling B. rapa (IMB211, female) and a highly inbred annual yellow sarson (var trilocularis; R500, male), and the progeny selfed for eight generations to generate RILs.

Transcriptional Responses to Low P Availability Indicates Enhanced P Recycling
To determine conserved transcriptional responses to P availability, transcriptional profiles were determined across the BraIRRI mapping population within the treatments. To correct for multiple testing, a Benjamini and Hochberg multiple testing-corrected P , 0.05 (Benjamini and Hochberg, 1995) was used subject to greater than 2-fold difference in expression between low and optimal external P concentration ([P] ext ). Analysis of transcriptional profiles between treatments identified 2,009 probes representing transcripts with significantly greater abundance at low [P] ext (Supplemental Table S2) and 1,217 probes representing transcripts with significantly less transcript abundance at low [P] ext (Supplemental Table S3). Probes with greater abundance included those with annotations to genes whose expression has been shown previously to change in response to low P availability (Hammond et al., 2005;White and Hammond, 2008;Fang et al., 2009;Nilsson et al., 2010). These included probes annotated as having phosphatase activity, such as a transcript with homology to Arabidopsis Purple Acid Phosphatase6 (At1g56360.1), which had a greater than 1,000-fold difference in expression between low and optimal [P] ext . In addition, six other transcripts with homology to Arabidopsis proteins containing phosphatase activity were in the top 50 transcripts with significantly greater abundance at low [P] ext (Supplemental Table S2). Proteins with acid phosphatase activity are known to be induced under P-limiting conditions and act to release P from various inorganic P monoesters to recycle inorganic P to essential metabolic processes (Li et al., 2002;Tran et al., 2010).
Several transcripts with significantly greater abundance at low [P] ext showed homology to proteins involved in the manipulation of membrane lipids. These included two transcripts with homology to glycerophosphoryl diester phosphodiesterases and a transcript with homology to Arabidopsis PLDP2, phospholipase D, which are involved in the catabolism of phospholipids, recycling P to essential metabolic processes and providing diacyglcerol for galactolipid biosynthesis (van der Rest et al., 2002;Li et al., 2006aLi et al., , 2006bTjellströ m et al., 2008). Transcripts with homology to proteins involved in the biosynthesis of galactolipids and sulfolipids also had significantly greater abundance at low [P] ext , including two isoforms of 1,2-diacylglycerol 3-b-galactosyltransferase (MGDG and MGDC) (Supplemental Table S2). Under P-limiting conditions, plants can alter the composition of their membranes, reducing their phospholipid content by replacing phospholipids with galactolipids and sulfolipids (Essigmann et al., 1998;Andersson et al., 2003Andersson et al., , 2005Kobayashi et al., 2009), a process that may be regulated through cross talk between the auxin and cytokinin signaling pathways (Kobayashi et al., 2006), thus decreasing the structural requirement for P.
Transcripts with homology to proteins and transcripts involved in the regulation of plant responses to low P availability also had significantly greater abundance under low P availability (Supplemental Table  S2). These included two transcripts with homology to two Arabidopsis SPX domain-containing proteins (At2g45130 and At5g20150) and a transcript with homology to Arabidopsis IPS1 (At3g09922). Proteins containing SPX domains have been implicated in the regulation of plant responses to low P availability, with the two proteins, SPX1 and SPX3, playing positive roles in plant adaptation to low P availability and acting in the PHR1 signaling cascade (Duan et al., 2008;Wang et al., 2009;Liu et al., 2010). The expression of the transcripts from the IPS1 family is induced rapidly and specifically in response to P starvation (Burleigh and Harrison, 1999;Martín et al., 2000;Hammond et al., 2003;Hou et al., 2005;Shin et al., 2006), and these noncoding transcripts sequester miR399 and serve to attenuate the miR399-mediated transcriptional responses to low P availability (Franco-Zorrilla et al., 2007).
Analysis of Gene Ontology (GO) terms associated with these transcripts was consistent with the role of these genes in plant adaptations to low P availability (Supplemental Tables S4 and S5). Transcripts with greater abundance under low P availability had an enrichment of GO terms attributed to flavonoid biosynthetic process (GO:0009813), sulfur compound biosynthetic process (GO:0044272), phosphoric ester hydrolase activity (GO:0042578), phosphatase activity (GO:0016791|GO:0016302), and response to nutrient levels (GO:0031667). The expression of these transcripts in the BraIRRI mapping population is thus consistent with previous observations of plant transcriptional responses to low P availability.

P-Responsive Transcripts Are Highly Heritable
The H 2 of gene expression was calculated for each probe on the array, using the log 2 -normalized signal values from all 78 genotypes at both low and optimal [P] ext . H 2 was calculated as the genotype variance as a proportion of the genotype, genotype and environment ([P] ext ) interaction, and a residual term using residual maximum likelihood procedures, with the effect of the [P] ext fixed ( Fig. 1A; Supplemental Table  S6). This provides a measure of how much variation in the expression of an individual gene was due to genotype relative to variation from environmental factors, such as experimental design and treatment. The transcript abundance for over 22% of probes had a H 2 of zero, and 90% of the probes had a H 2 of less than 0.5. Since the variance component model used to calculate H 2 apportions the variance between genotype, environment, and their interaction, there is a mathematical interdependence between the H 2 and the variance attributed to the environment component, [P] ext (Fig. 1B). This relationship identifies three groups of probes: (1) probes for which the transcript abundance is nonheritable and nonresponsive to [P] ext ; (2) probes for which the transcript abundance is heritable and that are highly responsive to [P] ext ; and (3) probes for which the transcript abundance is highly heritable but are not responsive to [P] ext (Fig. 1B).
We hypothesized that the annotations for probes in group 2 will be enriched with genes known to respond to low P availability. Enrichment for genes identified as responding to low P availability (Supplemental Table S2) was determined in overlapping groups of 3,000 probes (e.g. 0-3,000, 1,500-4,500, 3,000-6,000, etc.) generated from a list of all probes ordered by H 2 (e.g. probes in the group 0-3,000 had H 2 = 0, probes in group 90,000-91,854 had H 2 . 0.866). There was a significant (P , 0.05 for the x 2 distribution for a twoway contingency table) enrichment of probes identified as responding to low P availability (Supplemental Table S6) in groups of probes with H 2 between 0.23 and 0.95, with an average treatment effect of 0.35 ( Fig. 2A).
GO term analysis of these significant groups also showed enrichment for GO terms associated with the chloroplast, ribosome, and carbohydrate metabolism. Groups of probes with H 2 values less than 0.23 or greater than 0.95 showed no enrichment for any GO terms, with the exception of groups of probes with H 2 = 0, which were enriched for a single GO term associated with the endomembrane system.
Enrichment of genes responsive to low P availability in groups of genes with high H 2 and treatment effects was also confirmed using independent data from transcriptional profiling studies of Arabidopsis responses to low P availability (Fig. 2, B-D; Wu et al., 2003;Misson et al., 2005;Morcuende et al., 2007). This suggests that transcriptional responses to P availability are highly heritable and provide potential targets for breeding crops with improved abilities to grow under low soil P conditions.

Gene Expression Markers Are Highly Heritable But Not Responsive to P Availability
Using the gene expression data obtained for all the individual RILs in the BraIRRI mapping population, it was possible to identify gene expression markers (GEMs) in the genome suitable for mapping eQTL and QTL. In total, 125 robust GEMs were identified for 67 RILs (Fig. 3). The overall map length was 870.8 centimorgan (cM) across chromosomes A01 to A10, with an average distance between markers of 7.0 cM. The selection procedure used to generate the original Figure 2. Frequency distribution of genes differentially expressed between optimal and low P treatments identified in this paper (A), by Misson et al. (2005;, by Morcuende et al. (2007;, or by Wu et al. (2003;D) as a function of H 2 . Horizontal bars represent significance (P , 0.05) thresholds for overrepresentation of P-responsive genes within a bin of 3,000 genes.  2,306 candidate GEMs was expected to enrich for single-feature polymorphisms (group 3 of the marker types described above). In practice, this was borne out, with all of the final selected markers showing more signal variation between parental alleles than due to [P] ext , with none of the GEMs present in transcripts responding to low P availability (Supplemental Tables  S2 and S3), and an average H 2 across the 125 markers  Table S1) across the B. rapa A genome (A) and across the 67 genotypes of the BraIRRI mapping population used to identify eQTL and QTL associated with low and optimal P availability (B). of 0.93 (Fig. 1B). That the markers are largely nonresponsive to P indicates that their use for genetic map construction is justified, with the maps developed then being suitable for examining P-responsive transcripts (group 2 GEMs). The independence of marker types for different parts of the analysis gives confidence that the analysis is not tautological. Thus, the use of these spaced markers for eQTL and QTL analysis appears justified for the reasons described above. The segregation pattern shows a reasonable distribution of recombination break points across the 67 genotypes (Fig. 3B), although some degree of apparent local double recombination can be seen for a number of groups (e.g. lines 42 and 58 on A04; Supplemental Text S1).

The Expression QTL Hotspot on A06 Is Enriched with P-Responsive Transcripts
Using the genetic map based on GEMs (BraIR-RI_04_2010a), data for individual probes across 67 lines and two [P] ext treatments were subjected to interval mapping, using [P] ext as a covariate. A total of 18,874 eQTL were identified across the genome, representing 15,910 unique probes, with log of the odds (LOD) scores ranging between 3.17 and 2,824.18 (Supplemental Table S7). Robust physical genomic locations were identified for 10,545 of the 15,910 probes associated with eQTL, allowing the positions of the genes to be compared with the positions of their associated eQTL and the nature of their regulation to be determined. Highly significant eQTL tend to be cis-eQTL, which occur when a sequence polymorphism in a gene or promoter maps to quantitative variation in the gene transcript (Kliebenstein, 2009). In contrast, sequence polymorphisms in regulatory genes such as transcription factors, which do not colocate with the physical position of the gene, can result in allelic variation at trans-eQTL. The midpoint between markers was used to define boundaries for the colocation of eQTL with their physical positions within the genome. Of the 14,257 eQTL that could be physically mapped, 6,304 colocated to the same chromosome as indicated by the eQTL mapping, with 3,236 colocating to the same physical position as the marker and defined as cis-eQTL (Supplemental Tables S8 and  S9). This proportion of cis-eQTL to trans-eQTL is similar to that reported in the Arabidopsis Bay-0 3 Sha RIL population (West et al., 2007) but lower than that observed in the Arabidopsis Landsberg erecta 3 Cape Verde Island RIL population (Keurentjes et al., 2007). The median LOD statistic associated with the 3,236 cis-eQTL was 8.4 and that for the 11,021 trans-eQTL was 4.0. This is consistent with previous eQTL studies, which have reported higher LOD scores associated with cis-eQTL compared with trans-eQTL (Keurentjes et al., 2007;West et al., 2007). The additive effect of the parental alleles was consistent between treatments, with 47.5% and 46.7% of eQTL resulting from a negative effect of the IMB211 allele under optimal [P] ext and low [P] ext , respectively, and 52.3% and 53.3% of eQTL resulting from a positive effect of the IMB211 allele under optimal [P] ext and low [P] ext , respectively.
Using physical positions for the probes within the B. rapa genome and the number of eQTL associated with a specific marker above an empirical threshold, trans-eQTL hotspots were identified (Fig. 4). These are regions of the genome in which loci are associated with variation in the expression level of many hundreds or even thousands of genes. They are thought to represent master regulatory loci, with one or a few regulatory genes at this physical position controlling the expression of many genes associated with that position in the genome. We observed an average of 152.2 eQTL per marker, which is consistent with the expected number of eQTL per marker of 151.0, based on 125 markers and 18,876 eQTL. The 99.9% upper and lower confidence intervals of the Poisson distribution, which assumes that GEMs and eQTL are distributed equally, are 113 and 194. Thus, any marker associated with more than 194 eQTL is considered to be a regulatory hotspot, which is likely to contain many putative trans-eQTL. There are 27 markers with more than 194 eQTL (Fig. 4). These hotspots might represent gene-dense regions within the genome, and markers with less than 113 eQTL associated with them may represent regions of the genome with few genes, such as centromeres (Keurentjes et al., 2007). The most notable trans-eQTL hotspot occurs on chromosome A06, where marker 2 (B_1046058; 7.02 cM) is associated with 1,311 eQTL. Subsequent analysis revealed significant enrichment for GO terms associated with protein modification and P metabolism. Putative regulatory hotspots were also identified on chromosome A01 at marker 15 (B_1083235; 106.9 cM), which is associated with 785 eQTL, and is enriched with chloroplast and photosynthesis-related GO terms, on chromosome A04 at marker 5, which is enriched for His-related GO terms, and on chromosome A09 at marker 12 (B_1049533; 97.7 cM), which is enriched for cytoskeleton and nucleus-related GO terms.
Due to the low marker density, it is not possible to make direct comparisons to eQTL and QTL identified previously in Arabidopsis, but reanalyses of these data with a higher density map and improved genomic data will facilitate this and allow comparative genomics analyses of the eQTL and overlap of homologous genes between hotspots. Approaches that distinguish between paralogues within the genome, such as the use of high-density oligonucleotide arrays with exonspecific probes or RNAseq, will also increase the accuracy and depth of these analyses in the future (Trick et al., 2009b;Love et al., 2010).

Trait QTL Colocalize with eQTL
Transgressive segregation of trait values beyond the parental values was observed for all traits measured in the BraIRRI mapping population (Fig. 5). Shoot P concentration (shoot [P]) varied 1.97-fold at optimal P availability and 4.17-fold at low P availability, with population mean shoot [P] of 0.72% for lines grown at optimal P availability and 0.30% for lines grown at low P availability. Measures of PUE also varied among the 67 RILs grown at optimal and low P availability (Fig.  5). Genetic loci associated with the responsiveness to [P] ext were subsequently mapped using these data (Table I). Of the 13 trait QTL mapped, six colocalize with eQTL, including QTL associated with P utilization efficiency (g dry matter g 21 tissue P) at marker 2 (B_1046058; 7.02 cM) on A06, QTL associated with physiological PUE (g 2 dry matter g 21 tissue P) at optimal [P] ext at marker 7 (B_1022937; 48.85 cM) on A01, and marker 5 (B_1057766; 35.87 cM) on A06. A QTL associated with agronomic PUE (g dry matter g 21 fertilizer P applied) also colocalized with marker 5 on A06.
QTL associated with dry weight at low and optimal [P] ext and physiological PUE at optimal and low [P] ext colocalize with QTL associated with physiological PUE at high and low [P] ext , P uptake efficiency (g tissue P g 21 fertilizer P applied), and shoot dry weight identified on chromosome 1 in B. oleracea (Hammond et al., 2009). Alignment with QTL identified in Arabidopsis for shoot [P] reveals some colocalization between loci. QTL mapped to the top of A06 in this study colocalize with QTL for shoot [P] identified previously (Bentsink et al., 2003;Loudet et al., 2003) on the top of Arabidopsis chromosome 1. This suggests that loci for shoot [P] may be conserved in the Brassicaceae, but further work, including identification of the genes responsible for these QTL, is required to confirm this. Interestingly, several of the QTL associated with dry weight at optimal [P] ext on A01 and A06 and physiological PUE at optimal [P] ext , QTL associated with P utilization efficiency, and agronomic PUE on A06 all colocalize with eQTL hotspots. Further research is required to confirm the genetic nature of any links between these QTL.

CONCLUSION
Using high-density microarrays, we have identified robust GEMs within the BraIRRI mapping population. Combined with the recently released B. rapa genome sequence, we have defined cis-and trans-eQTL and their environmental responses to low P availability within a complex plant genome. Putative hotspots of trans-eQTL within the genome were also identified. Genes responsive to P supply had large environmental and heritable variance components. This suggests that transcriptional responses to P availability are highly heritable and provide potential targets for breeding Table I. Significant (P , 0.05) QTL associated with shoot P and measures of PUE in B. rapa Shoot dry matter, shoot [P], and measures of PUE were determined in 67 RILs of the BraIRRI mapping population. Plants were grown under CE conditions in compost containing 9 mg L 21 (low) or 30 mg L 21 (optimal) Olsen extractable P. Dry weight and shoot [P] were determined after 18 d of growth, and measures of PUE were calculated according to Hammond et al. (2009). QTL positions were estimated in QTL Cartographer 2.0 using the zmapQTL model 6 composite interval mapping option. DM, Dry matter; P, shoot [P]; P f , fertilizer P applied. crops with improved abilities to grow under low soil P conditions. Further analyses are required to identify the major regulators underlying trans-eQTL hotspots and the biochemical, morphological, and physiological processes they regulate in the plant adaptations to low soil P availability.

Plant Material and Growth Conditions
Seeds of 78 informative lines from the BraIRRI mapping population of Brassica rapa (2n = 2x = 10; A genome) and the two parent lines were selected for study. The establishment of the BraIRRI population is described by Iniguez-Luy et al. (2009). Briefly, the population was derived from a cross between a highly inbred rapid-cycling B. rapa (IMB211, female), derived from the WI Fast Plant population, produced from 10 generations of selection for early flowering and rapid generation time (Williams and Hill, 1986), and a highly inbred annual yellow sarson (var trilocularis; R500, male). An initial F1 plant was self-pollinated, and 160 RILs were obtained by single seed descent to the S4 generation (Iniguez-Luy et al., 2009) and subsequently to S8 (C. Hopkins, unpublished data, Rothamsted Research, UK).
Plants were grown from seed in individual pots (9 cm 3 9 cm 3 8 cm; Desch Plantpak) containing a moistened peat-based compost under controlled-environment conditions. The potting mix consisted of 25% sand and 75% (v/v) compost (Shamrock medium grade sphagnum peat; Scotts UK). Unfertilized, the potting mix had an Olsen extractable P (Olsen et al., 1954) of 8 mg L 21 . Two [P] ext treatments of 9 mg L 21 (low) or 30 mg L 21 (optimal) Olsen extractable P were produced by incorporating 0.075 and 0.45 g of sieved (mesh, 500 mm) single superphosphate (7% P) per liter of compost, and used to induce a growth response based on previous studies with Brassica (Hammond et al., 2009). Both compost treatments contained the following nutrients in sufficient amounts to prevent deficiencies: NH 4 NO 3 (0.4 g L 21 ), KNO 3 (0.75 g L 21 ), ground limestone (2.25 g L 21 ), magnesium limestone (2.25 g L 21 ), and a fritted trace elements mixture containing boron, copper, iron, manganese, molybdenum, and zinc (WM255; Fargro Ltd.) at 0.4 g L 21 . Potting mixes were made in bulk using a paddle mixer (model 156; St. Moritz). The controlled environment (CE) was a walk-in high-specification growth room (Weiss-Gallenkamp) set to a 16-h photoperiod using metal halide lamps, giving a photon flux density between 400 and 700 nm (photosynthetically active radiation) of 250 mmol photons m 22 s 21 at plant height. The day/night settings for temperature and relative humidity were 18°C/15°C and 76%/71%.
The experimental design was based on an a-design (John and Williams, 1995). The experiment was done in three runs, with all 78 RILs plus parents grown in each run. Each run comprised 12 blocks. Each block contained two subblocks: one for low [P] ext and one for optimal [P] ext treatments. Each subblock was placed on an independent tray, with capillary matting, and irrigated through automated drippers with deionized water. Within a block, the same set of lines was sown in the same (randomized) order in each subblock. Each subblock contained up to 42 pots, comprising six individual plants per RIL for six or seven RILs.
All plants were harvested 21 d after sowing. For each line at each treatment, fully expanded leaves from three plants (six-leaf stage, not flowering) were sampled, snap frozen in liquid nitrogen, freeze dried, ground, and stored at 280°C. For the remaining three plants, leaf, stem, and cotyledon fresh weight and dry weight were determined. Leaf samples were digested by the addition of 2 mL of nitric acid to 0.3 g of dried, ground material and processed in a closed vessel acid digestion microwave (MARSXpress; CEM). Digested samples were diluted with 23 mL of deionized water and analyzed using inductively coupled plasma emission spectrometry (JY Ultima 2; Jobin Yvon) to determine shoot [P].

RNA Extraction and Array Processing
RNA was extracted from leaf samples from one experimental run (run 2; 78 lines at low and optimal [P] ext , with one line duplicated [i.e. 158 samples]) and from leaf samples from the parent lines at low and optimal P in all three experimental runs (12 samples) using a modified TRIzol extraction method (Hammond et al., 2006). Extracted total RNA was purified using the RNA Cleanup protocol for RNeasy columns with on-column DNase digestion to remove residual genomic DNA (Qiagen). Samples of total RNA were checked for integrity and quality using an Agilent Bioanalyzer (Agilent Technologies). RNA samples were labeled with the QuickAmp Labeling Kit (Agilent Technologies) and hybridized to Agilent Brassica 95k 60-mer arrays (Trick et al., 2009a) for 17 h at 65°C at 10 rpm. The arrays were washed, then scanned on an Agilent G2565CA scanner, according to the manufacturer's instructions. Data files were generated using Agilent Feature Extraction Software (version 10.7.3.1). Array files have been submitted to the Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/projects/geo/; accession no. GSE27052).

Initial Data Analysis
Data files (n = 170) were imported into GeneSpring GX (version 11.0.2; Agilent Technologies). Signal values within arrays were normalized to the 50th percentile. Normalized signal values for individual probes were standardized to the median signal value for the probe across all arrays. The log 2normalized signal values were exported as a data matrix with 170 columns (unique RNA samples) and 91,854 (92k) rows (array probes; data available from Gene Expression Omnibus accession no. GSE27052). GO enrichment was performed using the GO category analysis tool in Gene Spring GX with a P value, adjusted for multiple testing using the method of Benjamini and Hochberg (1995), of 0.05. GO terms were assigned to individual probes based on their homology to Arabidopsis (Arabidopsis thaliana) gene sequences.
Custom GenStat scripts (VSN International; available on request from the corresponding author) were used to batch process the log 2 -normalized signal values in groups of up to 1,500 genes. Genotypic and environmental variance components were estimated for each probe, where environment represents low and optimal [P] ext and genotype represents the effect of line within the population. A variance components model was used to allocate sources of variation, with environment defined as a fixed factor and [genotype + (genotype.environment) + residual] defining the random term. The model was fitted using residual maximum likelihood procedures (Patterson and Thompson, 1971;Robinson, 1987). H 2 was calculated using the method of Cullis et al. (2006) as H 2 = 1 -mean(pev)/sG, where pev is the vector prediction error variance for the line effects and sG is the line variance component. This can be interpreted as broad-sense mean line H 2 , which is approximately equal to H 2 = sG/(sG + sGE + s/2), where sGE is the genotype 3 environment variance component and s is the residual variance (Cullis et al., 2006).

Development of GEMs
To enable mapping of eQTL and trait QTL, a robust set of 125 GEMs were selected for a subset of 67 genotypes (BraIRRI_04 subpopulation). First, the mean log 2 -normalized expression for each of the 92k transcripts was calculated for each of the two parents at both [P] ext treatments (i.e. IMB211 [n = 6] and R500 [n = 6]). Transcripts were then ranked following a two-tailed Student's t test in ascending order of significance. Based on a parental segregation threshold of P , 0.001, a subset of 2,306 transcripts was selected for exploration as suitable GEMs. For GEMs mapping, a reiterative procedure was followed in JoinMap4 (Kyazma; van Ooijen, 2006) using a combination of maximum likelihood mapping (MLM) and regression mapping (RM) approaches. A putative marker data set was identified through selection from the original 2,306 transcripts, minimizing missing data. Selecting for good separation of IMB211 and R500 allele scores within the 78 lines produced a data set of 838 putative GEM loci, and these were combined with 224 RFLP and simple sequence repeat markers from the S4 BraIRRI map (Iniguez-Luy et al., 2009) for initial group determination before the S4 BraIRRI markers were removed before map construction. Grouping used the "independence LOD" option ranging from LOD 2 to 15 in two LOD steps, with the appropriate level for defining a linkage group being determined by visual inspection of the groupings produced. Generally, the final LOD stringency selected was between LOD = 3 and LOD = 5, with the exception of A02 and A03, as described below. Results from MLM and RM (maps 1 and 2 only, with a "jump" value of 3) were used to create initial linkage group maps, using default settings unless otherwise stated. The comparison of results from MLM and RM iterations was used to test the robustness of observed GEM order. The final set of 125 markers was generated through a reiterative process of removing GEMs through examining "fit and stress" parameters for MLM and also through visual inspection of graphical genotypes for MLM and RM for each map iteration (for detailed discussion, see JoinMap4 manual). GEM markers containing clear miscalls were removed, focusing on double recombination events in small genetic distances (1-5 cM). From the initial "grouping" derived from 838 GEM markers, a "minimal" map was created by continuing to remove markers to try to produce a framework at approximately 10 cM resolution for optimal QTL analysis. For both BraIRRI and GEM data, A02 and A03 were difficult to separate at the "group" stage, so very high stringency was applied, suggesting a genuine (but unknown) genetic effect. The resulting GEMs-based map ( Fig. 3; Supplemental Table S1; Supplemental Text S1; BraIRRI_04_2010a), containing 125 GEMs mapped across 67 individuals from the BraIRRI mapping population, was subsequently used for QTL and eQTL analyses.
Identifying eQTL and Trait QTL eQTL associated with individual probes were identified using a two-stage process. GenStat procedure QIBDPROBABILITIES (Boer and Thissen, 2009) was used to calculate a set of genetic predictors at the marker positions on the GEMs genetic map (BraIRRI_04_2010a) for 67 RILs (BraIRRI_04 subpopulation). Initially, a parallel regression model was used to identify possible QTL effects across 1,500 genes simultaneously, by testing for a combined QTL + (QTL 3 environment) effect, where QTL indicates a genetic predictor used as a covariate. This model was fitted for each genetic predictor in turn. For probes with a significant result in the first step, simple interval mapping, using genetic predictors within the mixed model framework, was performed using QTL procedures implemented in GenStat (Boer et al., 2007). Briefly, a mixed model was fitted with fixed environment effects and QTL 3 environment interaction for each genetic predictor in turn and random terms comprising genotype, genotype 3 environment, and residual terms. A compound symmetry structure was used to account for genetic covariance across environments.
We aligned individual probes on the 95k oligonucleotide array to prepublication concatenated scaffolds of the B. rapa Chiifu-401 genome sequence (version 1.0, 255.9 Mb, representing 90% of the assembled sequences), kindly provided by Xiaowu Wang (Institute of Vegetables and Flowers-Chinese Academy of Agricultural Sciences). Probes that aligned with three or fewer locations with a match of 98% or greater within the genome sequence were then assigned to specific chromosome sequence coordinates. Therefore, we were able to distinguish cis-and trans-eQTL. cis-eQTL were defined as those whose physical position was in the same region as the marker associated with it. The region for defining cis-eQTL was taken as the midpoint between the marker associated the eQTL and the adjacent markers or the end of the chromosome for markers at the ends of chromosomes. eQTL that fell outside of this region were defined as trans-eQTL. eQTL hotspots were identified when the number of eQTL associated with a specific marker exceeded an empirical threshold. Here, the threshold was defined as the upper 99% confidence interval for the Poisson distribution, assuming an equal distribution of GEMs and eQTL.
Trait QTL associated with shoot [P], dry weight, and fresh weight at optimal and low [P] ext and measures of PUE (Hammond et al., 2009) were identified using the same 67 RILs (BraIRRI_04 subpopulation) and 125 GEMs (BraIRRI_04_2010a map). Marker mean and QTL positions were estimated in QTL Cartographer 2.0 (S. Wang, C.J. Basten, andZ.-B. Zeng, 2001-2004, Windows QTL Cartographer 2.0, Department of Statistics, North Carolina State University, Raleigh) using the zmapQTL model 6 composite interval mapping option. Five background cofactors were determined by forward stepwise regression and a 10-cM window, with genome scanning at 2 cM. This procedure estimated the LOD score and additive effect every 2 cM along each chromosome. Significant LOD thresholds were determined empirically based on 1,000 permutations and ranged from 1.54 to 2.81. The additive effect of each QTL was reported relative to the contribution of alleles from the female IMB211 parent.

Supplemental Data
The following materials are available in the online version of this article.
Supplemental Table S2. Transcripts with significantly greater abundance at low [P] ext .
Supplemental Table S3. Transcripts with significantly less abundance at low [P] ext .
Supplemental Table S4. Enrichment of GO terms in transcripts with significantly greater abundance at low [P] ext .
Supplemental Table S5. Enrichment of GO terms in transcripts with significantly less abundance at low [P] ext .
Supplemental Table S6. Variance components and H 2 of Brassica transcripts.
Supplemental Table S7. Expression QTL associated with low and optimal P availability.
Supplemental Table S8. cis-eQTL associated with low and optimal P availability.
Supplemental Table S9. trans-eQTL associated with low and optimal P availability.
Supplemental Text S1. Further information on the generation of the BraIRRI_04_2010a map.