Genetic control of a transition from black to straw-white seed hull in rice domestication.

The genetic mechanism involved in a transition from the black-colored seed hull of the ancestral wild rice (Oryza rufipogon and Oryza nivara) to the straw-white seed hull of cultivated rice (Oryza sativa) during grain ripening remains unknown. We report that the black hull of O. rufipogon was controlled by the Black hull4 (Bh4) gene, which was fine-mapped to an 8.8-kb region on rice chromosome 4 using a cross between O. rufipogon W1943 (black hull) and O. sativa indica cv Guangluai 4 (straw-white hull). Bh4 encodes an amino acid transporter. A 22-bp deletion within exon 3 of the bh4 variant disrupted the Bh4 function, leading to the straw-white hull in cultivated rice. Transgenic study indicated that Bh4 could restore the black pigment on hulls in cv Guangluai 4 and Kasalath. Bh4 sequence alignment of all taxa with the outgroup Oryza barthii showed that the wild rice maintained comparable levels of nucleotide diversity that were about 70 times higher than those in the cultivated rice. The results from the maximum likelihood Hudson-Kreitman-Aguade test suggested that the significant reduction in nucleotide diversity in rice cultivars could be caused by artificial selection. We propose that the straw-white hull was selected as an important visual phenotype of nonshattered grains during rice domestication.

Cultivated rice (Oryza sativa) is considered to have been domesticated from the wild rice species Oryza rufipogon and Oryza nivara (Khush, 1997; Cheng et al., 2003; Kovach et al., 2007; Sang and Ge, 2007). The wild rice species exhibit a seed-shattering habit, prostrate growth, and a black-colored seed hull. Over the long course of domestication, a large array of morphological traits has changed as a result of artificial selection, including the loss of seed shattering, changes in grain coloration, enlargement of seed size, and the change from a prostrate to an erect growth habit. A number of genes are known to control seed-shattering habit (Konishi et al., 2006; Li et al., 2006), red grain pericarp (Sweeney et al., 2006), grain discoloration (Yu et al., 2008), grain size (Shomura et al., 2008), shoot gravitropism (Li et al., 2007), and prostrate growth (Jin et al., 2008; Tan et al., 2008). Examination of these genes will help us understand the forces driving the evolution of phenotypes during domestication.
In cultivated rice, the immature seed hull is green because of chlorophyll and then turns the so-called straw-white color (golden or yellowish) after full ripening. However, the seed hull of the wild ancestor of cultivated rice is black at maturity. The black seed hull is ubiquitous among the ancestors of cultivated rice and seems always to be associated with the seed-shattering habit in wild rice. The seed-shattering habit is believed to be a key functional trait for the survival of wild rice, and the black hull is thus reasonably thought to be a natural color for shattered wild rice seeds on dark mud. In contrast, the nonshattering seed of cultivated rice is always associated with a straw-white hull, which is accompanied by a withering of stems and leaves to a similar color at the end of the growing season.
However, the genetic basis for the transition from the black hull of wild rice to the straw-white hull of cultivated rice remains unknown. Black hull coloration has been reported to be controlled by two or three complementary genes (Chao, 1928; Nagao and Takahashi, 1954; Kuriyama and Kudo, 1967). Maekawa (1984) reported that three complementary genes, symbolized as Bh-a, Bh-b, and Bh-c, control black hull color. The Ph gene for phenol reaction was also found to be responsible for black hull color (Kuriyama and Kudo, 1967). The Ph gene (Phr1), corresponding to Bh-c, has been cloned and characterized. Phr1 is located on the long arm of chromosome 4 and encodes a polyphenol oxidase. The frequency of Ph is high in the indica type and low in the japonica type (Yu et al., 2008). In another study, using an F2 population developed from a cross between the weedy strain SS18-2 and the breeding line EM93-1, two loci, qHC4 and qHC7, that explained a major and a minor proportion, respectively, of the phenotypic variation in black pigmentation on the hull were identified (Gu et al., 2005). But no candidate gene has been cloned for the transition from the black hull of the ancestral wild rice to the straw-white seed hull of cultivated rice, and the functional variations resulting from this transition also remain to be investigated. In this study, we cloned the Bh4 gene on chromosome 4, a member of an amino acid transporter family that conditions black hull. A 22-bp deletion within Bh4 caused a frame shift and truncated the BH4 protein, leading to the straw-white hull phenotype in cultivated rice. Sequence analysis of O. sativa varieties and wild rice accessions revealed that Bh4 has been a target of strong selection.

Characterization of an Introgression Line with Black Hull
To characterize the gene responsible for black seed hull coloration and the phenotypic variations resulting from the transition from black to straw-white hull, we constructed a set of chromosome segment substitution lines from backcross progeny derived from a cross between the cultivated rice variety O. sativa indica cv Guangluai 4 (straw-white hull) as the recurrent parent and the wild rice accession O. rufipogon Griff W1943 (black hull) as the donor parent. One chromosome segment substitution line was found to carry the entire wild rice chromosome 4 (designated as SL4). The mature seeds of SL4 had a black hull and a long awn (Fig. 1A), and SL4 also exhibited a later heading date and strong seed shattering. Therefore, we designated the black hull locus on chromosome 4 as Black hull4 (Bh4). To separate Bh4 from the Shattering4 (Sh4) locus, which controls seed shattering in wild rice (Li et al., 2006), we developed a near isogenic line (NIL8) that contained a short segment from wild rice in the Guangluai 4 background (Supplemental Fig. S1). NIL8 showed black seed hull but did not shatter after ripening (Supplemental Fig. S2). NIL8 was awnless and had the same heading date as Guangluai 4.
All seed hulls of the NIL8 panicles remained green from the 1st to the 14th d after heading. Black spots appeared on the hulls of the first developing grains on the 15th d. Most hulls of the panicle turned black on the 19th or 20th d. The hulls became completely black on about day 25 after flowering (Fig. 1B). We also observed that the grain hulls did not turn black if the flowers were not pollinated.
Figure 1. Phenotypes of mature rice grains of SL4 and Guangluai 4 and the NIL8 seed hull color changes. A, Mature grains of the introgression line SL4 and the recipient parent indica Guangluai 4 (G4). The SL4 seed hulls are black, and the Guangluai 4 seed hulls are straw white (yellowish). B, The course of the seed hull color change on maturing NIL8 panicles. Days after heading are indicated. From the 15th d after heading, black spots appear on the hulls covering developing seeds.

Cloning and Confirmation of the Bh4 Gene
An F2 population was derived from the cross between SL4 and Guangluai 4, and a total of 300 individual F2 plants were genotyped using 28 insertion or deletion (InDel) markers along chromosome 4. All seeds of these 300 plants were harvested to produce the F3 population. From the segregation of the F3 population, the genotype of each F2 individual at the Bh4 locus was determined. By combining the genotypes along the entire chromosome 4 with the Bh4 genotype, Bh4 was initially mapped to the region between markers M1 and M2 and cosegregated with the simple sequence repeat (SSR) marker RM3524 (Fig. 2A; Supplemental Fig. S3). Subsequently, a larger F2 population was generated, and a total of 3,276 plants were genotyped at the seedling stage using the flanking markers M1 and M2. The genotype of the Bh4 locus of the individual recombinants with B (Guangluai 4 homozygous) and H (heterozygous) was determined in this population, and the seeds of individual recombinants with A (SL4 homozygous) and H (heterozygous) between the two markers were collected and sown to determine the genotype of Bh4. By progressively examining SSR and single nucleotide polymorphism (SNP) markers between M1 and M2, we finally delimited the gene responsible for black hull in wild rice to an 8.8-kb region between the markers M5 and M7 (Fig. 2B).
A bacterial artificial chromosome (BAC) library of O. rufipogon W1943 was screened by pool-PCR to identify the BAC clones that contained the 8.8-kb target region. Two overlapping BAC clones were selected, and one of them (ORW1943-Ba0077G13) was fully sequenced (Fig. 2C). Only one open reading frame (ORF) was identified in the 8.8-kb DNA fragment using FGENESH gene prediction analysis (http://linux1.softberry.com/berry.phtml; Fig. 2D). Two BACs corresponding to the Bh4 locus from cultivated rice indica Guangluai 4 (OSIGBa0097P08) and japonica Nipponbare (OSJNBa0072F16) were also identified and sequenced. In the Nipponbare annotations at the Rice Annotation Project Database (http://rapdb.dna.affrc.go.jp/viewer/gbrowse/build4/), there was only one hypothetical gene within this region (Os04g0460000). To test whether the predicted gene was the Bh4 gene, a complementation test was conducted. An 11,708-bp genomic DNA fragment containing the entire predicted Bh4 coding region from W1943, including the 5,962-bp upstream sequence and the 3,282-bp downstream sequence, was cloned into the HindIII-PstI site of the binary vector pCAMBIA1301 to generate the transformation plasmid pC13Bh4-W (Fig. 2D). The binary plasmids were introduced into the Agrobacterium tumefaciens EHA105 strain by electroporation, and the indica cv Kasalath was transformed using Agrobacterium-mediated transformation. The vector pCAMBIA1301 was used as a control. Eight independent transgenic lines for pC13Bh4-W showed complementation of the black hull phenotype in the mature seed hull of cv Kasalath (Fig. 2E). These results confirmed that Os04g0460000 in O. rufipogon was the Bh4 gene controlling the synthesis of black hull pigment.

Sequence Comparison of Bh4 in Wild Rice and Cultivated Rice
To investigate sequence variation between wild rice and cultivated rice, we compared the Bh4 sequences of japonica Nipponbare and indica Guangluai 4 with that of W1943. There were sequence differences in the promoter region. We designed a new marker, M10 (Supplemental Table S1), close to the start codon to test the recombinant individuals and eliminated polymorphisms in the promoter as the source of the hull color change. We therefore focused on the mutations that could affect the protein sequence. The Bh4 coding sequence of Guangluai 4 was identical to that of Nipponbare. Both varieties have a straw-white seed hull. Sequence comparison of the Bh4 locus between W1943 and Guangluai 4 revealed a 22-bp deletion and a SNP in the third exon in Guangluai 4 (Fig. 3A). The deletion induced a frame shift in the sequence, resulting in a premature stop codon. The stop codon truncated the protein by 150 amino acids. These results showed that the different alleles of Bh4 were consistent with the observed phenotypic differences between Guangluai 4 and W1943, suggesting that the 22-bp deletion was the only apparent reason for the loss of the black hull phenotype in Guangluai 4 (Fig. 3A).
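The effect of a 22-bp deletion on the reading frame can be illustrated with a minimal Python sketch. Because 22 is not a multiple of 3, every codon downstream of the deletion is read in a shifted frame, and a stop codon typically appears soon afterward. The sequence below is an invented toy example, not the Bh4 sequence, and the helper names translate and delete are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate from the first base and stop at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def delete(seq, start, length):
    """Return seq with `length` bases removed, beginning at index `start`."""
    return seq[:start] + seq[start + length:]


# Invented toy coding sequence (NOT Bh4): 15 codons followed by a stop codon.
wild_type = "ATGGCTGCT" + "GGT" * 7 + "CCAGCTGTGAGGTCATAA"
mutant = delete(wild_type, 9, 22)  # 22 is not a multiple of 3

print(translate(wild_type))  # MAAGGGGGGGPAVRS
print(translate(mutant))     # MAAQL
```

In this toy case the shifted frame reaches a TGA stop codon two codons after the deletion site, so the mutant product is much shorter than the wild-type product.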

Overexpression Analysis of Bh4 in Guangluai 4
To confirm the accuracy of the different gene models, we cloned the Bh4 full-length cDNAs from W1943, Guangluai 4, and Nipponbare on the basis of the Bh4 gene sequence derived from the W1943 gene model. The Bh4 sequences could be amplified successfully from the three varieties (Fig. 3B). Sequencing of these cDNAs confirmed the gene model and the splice sites predicted from W1943. We then constructed an overexpression plasmid using the full-length cDNA of Bh4 from W1943. Transformation of the Bh4 overexpression plasmid succeeded in restoring the black hull phenotype in Guangluai 4 (Fig. 3C).
In addition, comparison of the Bh4 cDNA sequence with its genomic DNA sequence in Guangluai 4 revealed one difference in the coding region of the third exon. This site coincided exactly with the SNP in the third exon between the Guangluai 4 and W1943 genomic sequences: it is an adenine in W1943 and a guanine in Guangluai 4 (Fig. 3A). In the Guangluai 4 transcript, however, the guanine was changed to adenine, the same base as in the wild-type W1943 allele. We also detected this transcriptional modification in Nipponbare. This modification, which reverses the genomic CAT-to-CGT (His-to-Arg) substitution at the transcript level, might be part of a protection mechanism by which the plant adapts to random mutation in the evolutionary process.

Bh4 Encodes an Amino Acid Transporter Protein and Is Specifically Expressed in the Seed Hull
Sequence analysis of the reverse transcription (RT)-PCR products and RACE-PCR indicated that the Bh4 cDNA was 1,411 bp long, with an ORF of 1,191 bp, a 65-bp 5′ untranslated region, and a 155-bp 3′ untranslated region. BLAST analysis revealed that rice Bh4 encoded an amino acid transporter. Phylogenetic comparison of the BH4 protein and other amino acid transporters showed that BH4 was closely related to AtANT1, an aromatic and neutral amino acid transporter in Arabidopsis (Arabidopsis thaliana; Chen et al., 2001; Fig. 4A; Supplemental Fig. S4). Given that aromatic amino acids are precursors for the synthesis of pigments in plants, BH4 is most likely to transport aromatic amino acids, such as Phe, Tyr, and Trp (Grotewold, 2006). We then assayed the concentration of free amino acids in straw-white hull and black hull. The concentration of most free amino acids and the total concentration of free amino acids decreased in black hull, but the concentration of Tyr increased 3- to 6-fold (Supplemental Table S2). On the basis of this analysis, we propose that BH4 is most likely a functional Tyr transporter.
Using the TMHMM program (http://www.cbs.dtu.dk/services/TMHMM/), BH4 was predicted to have 10 transmembrane motifs, which are characteristic of transporter proteins in plants (Fig. 4B). Because amino acids are transported not only across the plasma membrane but essentially in and out of all cellular compartments, the cellular localization of the transporter was analyzed. N-terminal P35S-GFP-BH4 fusions were transiently expressed in Arabidopsis protoplasts and observed by fluorescence microscopy. The results indicated that BH4 was located not only at the plasma membrane but also in other cellular compartments (Fig. 4C).
To profile Bh4 transcription, we used RT-PCR to amplify mRNA from root, leaf, stem, hull, and the corresponding dehulled seed at about 18 d after heading. Total RNA was isolated from Guangluai 4 and NIL8 plants. RT-PCR showed no expression of Bh4 in root, leaf, stem, or dehulled seed, as expected for a gene associated with the seed hull phenotype. Because the promoter of the Bh4 gene had been eliminated as the source of polymorphism based on the recombination data, we anticipated that similar expression levels of Bh4 would be detected in straw-white and black hull seeds. Our results confirmed this expectation (Fig. 5A), and the RNA transcripts from Guangluai 4 contained the 22-bp deletion predicted by the sequence information. We also carried out quantitative real-time RT-PCR and found that the expression level of Bh4 reached its highest point when black spots appeared on the hull (Fig. 5B).

Effect of Expression of Bh4 in Transgenic Rice
To identify functional variations resulting from the black-to-straw-white hull transition, we compared grain number per panicle, fertility rate, thousand grain weight, tiller number per plant, amylose content, gel consistency, gelatinization temperature, and grain chalkiness degree between NIL8 and Guangluai 4. Only thousand grain weight, tiller number, and amylose content exhibited significant changes (Supplemental Fig. S5). Consequently, we further compared these three traits in Bh4 transgenic and control lines of Kasalath to determine whether the changes were caused by Bh4. There were no significant differences in these traits between transgenic and control lines (Supplemental Fig. S6). This indicated that Bh4 was not associated with yield or with any of the grain quality traits examined.

Multiple Independent Mutations of Bh4 in Rice Cultivars
To investigate the occurrence of Bh4 deficiency among rice cultivars, we first screened for the 22-bp deletion among 16 japonica and 11 indica varieties using the molecular marker M22 (Supplemental Table S1). All cultivars examined had a straw-white seed hull. Eighteen of them had the 22-bp deletion at the site identified in Guangluai 4; the remainder had no such deletion. We then sequenced the Bh4 alleles of the 27 cultivars. Among the nine varieties without the 22-bp deletion, eight had a 1-bp deletion in the second exon, which also resulted in a frame shift (producing a truncated protein of 213 amino acids). One indica variety, Kasalath, had a single base change from C to A (TCG to TAG), leading to a premature stop codon in the third exon (producing a truncated protein of 359 amino acids; Table I). Taken together, at least three mutations in the Bh4 coding region of cultivated rice were associated with the straw-white hull color, suggesting multiple origins of the phenotype.
Table I footnotes: a, Accession numbers preceded by RA refer to Garris et al. (2005); GLA4, MH63, Nipponbare, Kasalath, Lansheng, Suyunuo, Lijiang, W1943, and HP series were collected by other authors; all other accessions were obtained from the Genetic Resources Center of the International Rice Research Institute at Los Banos, Philippines. b, Individual accessions are abbreviated by the first three letters of the taxon name followed by the code of the country of origin. c, The cultivar subgroup classification followed Garris et al. (2005). NA indicates that the classification is not available for an accession. d, Y indicates straw-white (yellowish) hull phenotype and B indicates black hull phenotype. e, DEL represents a 1-bp deletion. f, DEL represents a 22-bp deletion.
We further surveyed the distribution of the three mutations in a large panel of 433 rice landraces with straw-white hull collected in China. Among them, 94.9% (411) contained a 22-bp deletion, 3.7% (16) contained a 1-bp deletion, and the remaining 1.4% (six) had none of the three mutations. Among the six varieties, three of them had the same ORF as W1943, and another three accessions had the same ORF as O. sativa japonica cv Lijiang with black hull color. These results indicated that some other genes may be involved in the synthesis of the rice black hull pigment.
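As a quick arithmetic check of the landrace survey, the short Python sketch below recomputes the reported percentages from the counts given in the text (411, 16, and six of 433 landraces). The dictionary keys are descriptive labels chosen here for illustration.

```python
# Counts reported in the text for 433 straw-white landraces from China.
counts = {
    "22-bp deletion": 411,
    "1-bp deletion": 16,
    "none of the three mutations": 6,
}
total = sum(counts.values())

# Percentage of the panel carrying each class, rounded to one decimal place.
percentages = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, pct in percentages.items():
    print(f"{name}: {counts[name]}/{total} = {pct}%")
```

The three classes account for all 433 landraces, and the rounded values agree with the 94.9%, 3.7%, and 1.4% stated above.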
Remarkably, the C-to-A transversion, which caused a premature stop codon, was not found in other varieties except for Kasalath (Table I).

Phylogenetic Analysis of Bh4
We obtained Bh4 sequences from 28 cultivated rice accessions (27 with straw-white hull and one with black hull) and 24 wild rice accessions with black hull (Table I). The length of the aligned sequences for each taxon varied from 2,442 to 2,458 bp (Table II). For all four taxa, the sequences included both coding and noncoding regions from ATG to TGA. These sequences were aligned with an outgroup sequence of Oryza barthii (Table I).
Phylogenetic analysis of the Bh4 sequences showed that all of the cultivars were grouped together with 98% bootstrap support except for indica var Kasalath (code ind-IND2 in Fig. 6), which was nested within a wild rice group from Thailand. Within the cultivar group, indica and japonica cultivars are intermixed. Apart from Guangluai 4 and Minghui 63, which have a SNP in the second intron, the cultivars in this group differ only by the 22-bp or the 1-bp deletion (Table I). These results suggested that those cultivars might have originated from a common progenitor and that, over the long course of domestication, different mutations have been selected in the Bh4 gene. Outside of this clade, four wild rice accessions (one O. nivara accession and three O. rufipogon accessions) formed a sister group to the cultivated alleles. All of the accessions in this group came from India except W1943, and the bootstrap support reached 83%. This result suggested that the bh4 mutation might have originated in India.

Neutrality Tests
At the taxon level, polymorphisms in wild rice (Watterson's estimator of θ per base pair calculated based on silent sites [θsil] = 0.00787 for O. rufipogon and θsil = 0.00876 for O. nivara) were significantly higher than those in cultivated rice (θsil = 0.00021 for indica excluding Kasalath and θsil = 0 for japonica). With respect to both the entire region and silent sites, the wild rice maintained comparable levels of nucleotide diversity, which were about 70 times higher than those in the cultivated rice (Table II). A neutrality test was conducted to determine whether the reduction in nucleotide diversity in rice cultivars could be caused by artificial selection. The maximum likelihood Hudson-Kreitman-Aguade (MLHKA) test, performed in reference to seven neutral genes (Zhu et al., 2007), indicated that the test of neutrality is borderline significant in indica varieties (P < 0.025) but not significant for O. nivara and O. rufipogon. Similar results were obtained for cultivated rice and wild species. The test of neutrality was significant in cultivated rice (P < 0.005) but not significant for the wild species (Table III).
Unlike the MLHKA test, Tajima's D did not detect significant selection in any of the four taxa (Tajima, 1989; Table III). However, those nonsignificant statistics are likely to be an artifact of population demographics (Tajima, 1989; Nielsen, 2001) and do not necessarily indicate the absence of selection. The previous analysis of the rice domestication gene Sh4 also showed that the multilocus HKA test is more powerful in detecting artificial selection than the single-locus Tajima's D when polymorphisms are much reduced (Zhang et al., 2009).

DISCUSSION
Straw-white panicles filled with ripened grains are a characteristic feature of mature cultivated rice plants.
In this study, we cloned the Bh4 gene, which encodes an amino acid transporter protein, and found that a 22-bp deletion within the third exon caused the loss of function of Bh4, resulting in the transition from black hull in wild rice to straw-white hull in cultivated rice. The 22-bp deletion accounted for 94.9% of the straw-white hull color in the screened cultivated rice varieties. Transgenic study confirmed that the Bh4 gene controlled the synthesis of black hull pigment during seed maturation.
Although the prevalent mutation of Bh4 was the 22-bp deletion within the third exon, there were other mutations in this gene that led to the loss of black hull pigment. Different mutations therefore occurred in the Bh4 gene during rice domestication, suggesting that this phenotype of cultivated rice might have multiple origins.
Why did modern rice cultivars discard the black hull trait? Did Bh4 undergo artificial or natural selection? The MLHKA test suggested that Bh4 was subject to artificial selection in cultivated rice. Today, the nonshattering seed of cultivated rice is always associated with a straw-white hull, which is accompanied by a withering of stems and leaves near harvest. To identify functional variations resulting from the transition from black to straw-white hull, we compared a number of yield- and grain quality-related traits between NIL8 and cv Guangluai 4 as well as in transgenic lines. Since no significant functional variations were found to be associated with the transition from black to straw-white seed hull, we propose that the straw-white hull of nonshattering grains was the favored visible phenotype and was therefore artificially selected in cultivated rice.
Previous studies reported that Phr1 participates in the formation of black hull (Kuriyama and Kudo, 1967) and that Phr1 in japonica is a nonfunctional allele (Yu et al., 2008). We transferred the Bh4 gene into the two subspecies of cultivated rice, indica (Guangluai 4, Kasalath) and japonica (Nipponbare). We found that the black hull phenotype could be rescued only in the indica varieties, which suggested that some other genes in this pathway might have mutated in japonica varieties during domestication. These results are consistent with previous genetic analyses. As with Rc (Sweeney et al., 2006) and Phr1 (Yu et al., 2008), multiple mutations occurred in Bh4. All three of these genes are related to color changes in grains and seeds.
We tested whether birds played a vital role in the selection of straw-white seed hull in rice. We proposed that if the loss of function of Sh4 (Li et al., 2006) was selected and fixed first, nonshattered black-hull seeds against the background of yellowish stems and leaves at maturity would be easily targeted by birds. In wild rice, black-hull seeds fall off easily from plants at maturation, and the black hull color can protect them from being targeted by birds. However, nonshattered black-hull seeds on panicles were presumably easily targeted by birds against the background of withering stems and leaves at maturity. When a black-hull seed mutated to straw white, it would have been protected from bird predation, and the allele frequency would have increased over generations of cultivation. However, we found that birds ate straw-white hull and black hull rice grains almost equally while they were ripening in the field. It is possible that the birds around the rice field have adapted to eating straw-white hull grains. More rigorous field experiments may have to be designed in order to replicate the process of rice domestication thousands of years ago. If this hypothesis holds, it might also explain why the majority of cereal crops have straw-white hulls while their wild relatives have dark-colored hulls.
We observed that the seeds began to turn black about 15 d after pollination and that seeds did not turn black if they were not pollinated. From the temporal and spatial expression of Bh4, the phenotype is in accordance with the expression pattern of Bh4. We think that there must be something transported from grain to hull for the black hull formation. Further study will be required to investigate which substance leads to black hull at maturity.

Plant Materials
In the cloning of Bh4 and expression analysis, SL4 and NIL8 were used. SL4 is an introgression line developed by introgressing chromosomal segments from an accession of common wild rice (Oryza rufipogon) W1943 into an indica cultivar (Oryza sativa), Guangluai 4, based on four generations of backcrossing and four generations of self-fertilization. All the genetic background of SL4 comes from Guangluai 4 except for the entire chromosome 4. NIL8 is derived from the cross between SL4 and Guangluai 4 (Supplemental Fig. S1).
In phylogenetic analysis and neutrality test, we sampled 28 accessions of cultivated rice, including 11 accessions of O. sativa indica and 17 accessions of O. sativa japonica. This sample represents all five recognized subgroups within O. sativa (Garris et al., 2005). A total of 24 accessions of the wild rice progenitors were sampled, including 11 accessions of O. rufipogon, 12 accessions of Oryza nivara, and one accession of Oryza barthii (Table I). These collections covered a wide distributional range of the wild rice from southeastern Asia to India, including all potential areas for rice domestication. O. barthii was used as an outgroup for phylogenetic analyses.

Measurement of Grain Quality Traits
Mature rice grains were milled after being harvested, air dried, and stored at room temperature for 3 months. Amylose content, gel consistency, and gelatinization temperature were measured according to methods reported by Tan et al. (1999). Gelatinization temperature was evaluated as the alkali spreading value.
For chalkiness degree analysis, we adopted an automated image-recognition system. Briefly, rice seeds were first dehulled and the seed coat was removed. The intact polished seeds were then imaged with a scanner, and the chalky region was recognized automatically by software on the basis of the color difference between chalky and translucent endosperm. The chalkiness degree was calculated by dividing the chalky area by the area of the intact seed. All of this work was done automatically by computer. Each sample was measured four times, and each time we analyzed 50 seeds and obtained the average chalkiness degree.

DNA Extraction and Molecular Marker Analysis
DNA was extracted from fresh leaves according to the cetyl-trimethylammonium bromide method (Murray and Thompson, 1980) with minor modifications. The molecular markers analyzed in the target region contained InDels, SNPs, cleaved-amplified polymorphic sequences, and one SSR marker, RM3524; they were all newly designed in this study except RM3524. For InDels and SSR, PCR products were loaded on agarose gels to assay polymorphism. For SNPs, PCR fragments were directly sequenced in both directions by the ABI 3730 Sequence instrument after purification by agarose gels. For cleaved-amplified polymorphic sequences, PCR fragments were digested with specific restriction endonuclease enzyme and then loaded on agarose gels for polymorphism. The information for newly developed markers is given in Supplemental Table S1.

Positive BAC Clone Screening
The BAC library comprises 80 384-well microtiter plates. The coverage of the library is about eight times the rice genome. To get the specific clone, three PCR rounds were adopted. In the first round, the 80 384-well microtiter plates were 80 superpools, and DNA was extracted from each superpool by the alkaline lysis method for PCR screening. Each superpool was screened with the two markers flanking the target region. The superpools that were positively screened by two markers were selected for the next PCR screening round. Each superpool was divided onto four 96-well plates, which formed four subpools. The subpools that were also positively screened by two markers were selected for the third PCR screening round. In the third round, each positive pool was divided into eight-row pools and 12-column pools. The target clone was specifically detected after three screening rounds.
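The three screening rounds described above can be sketched as a small Python simulation. This is a minimal illustration under stated assumptions, not the authors' protocol: the mapping of a 384-well plate onto four 96-well subplates by interleaved quadrants is assumed here, set membership stands in for a positive PCR with both flanking markers, and the function names to_subpool and screen are hypothetical.

```python
from itertools import product

N_PLATES = 80  # 384-well plates (16 rows x 24 columns) in the library


def to_subpool(row, col):
    """Map a 384-well position to (subpool, row96, col96).

    Assumption: the four 96-well subplates are the interleaved quadrants
    of the 384-well plate.
    """
    return (row % 2) * 2 + (col % 2), row // 2, col // 2


def screen(positives):
    """Simulate the three PCR rounds for one marker.

    positives: set of (plate, row, col) wells that hold a target clone.
    Returns (candidate wells, number of PCR reactions).
    """
    reactions = N_PLATES  # round 1: one PCR per superpool (plate)
    candidates = []
    for plate in sorted({p for p, _, _ in positives}):
        reactions += 4  # round 2: four 96-well subpools per positive plate
        wells = [to_subpool(r, c) for p, r, c in positives if p == plate]
        for sub in sorted({s for s, _, _ in wells}):
            reactions += 8 + 12  # round 3: eight row pools, 12 column pools
            rows = sorted({r for s, r, _ in wells if s == sub})
            cols = sorted({c for s, _, c in wells if s == sub})
            # A clone sits where a positive row pool meets a positive column pool.
            candidates += [(plate, sub, r, c) for r, c in product(rows, cols)]
    return candidates, reactions


hits, n_pcr = screen({(12, 5, 10)})
print(hits, n_pcr)  # [(12, 2, 2, 5)] 104
```

For a single positive clone, the pooled design locates the well with 104 reactions per marker instead of testing all 80 × 384 = 30,720 wells individually. If two clones share a 96-well subplate, the row and column intersections become ambiguous and each candidate would need to be confirmed individually.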

Subcellular Localization of BH4
Transient transformation of the protoplasts with polyethylene glycol was performed according to the protocol of Negrutiu et al. (1992). For confocal laser scanning microscopy, protoplasts were incubated overnight at 24°C in the dark after transformation and observed with an Olympus microscope.

Quantitative Real-Time PCR
Total RNA was extracted from rice tissues using TRIZOL reagent (Invitrogen) as described by the supplier. Two micrograms of RNA was reverse transcribed with oligo(dT)18 primer using the PrimeScript RT reagent kit (Takara). For quantitative real-time RT-PCR, first-strand cDNAs were used as templates in real-time PCR using the SYBR Green PCR Master Mix (Takara) according to the manufacturer's instructions. The amplification of the target genes was analyzed using the ABI Prism 7500 Sequence Detection System and Software (PE Applied Biosystems). UBQ5 (AK061988) and eEF-1α (AK061464; Jain et al., 2006) were used as controls to normalize all data.
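The text does not state how the two reference genes were combined for normalization. One common approach, shown here purely as an assumption, is the 2^(-ΔCt) calculation with the mean Ct of the reference genes and an assumed amplification efficiency of 100%. The Ct values and the function name relative_expression are hypothetical.

```python
from statistics import mean


def relative_expression(ct_target, ct_references):
    """Relative expression as 2^(-dCt), assuming 100% PCR efficiency.

    dCt = Ct(target) - mean Ct(reference genes). Averaging Ct values is
    equivalent to taking the geometric mean of the reference quantities.
    """
    delta_ct = ct_target - mean(ct_references)
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values for a target gene and two reference genes.
example = relative_expression(24.0, [20.0, 22.0])
print(example)  # 0.125
```

Under these assumptions a target that crosses the threshold three cycles later than the averaged references is expressed at one-eighth of their level.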

Bh4 Allele Sequencing
A 2,465-bp region of the Bh4 gene was amplified using PCR primers Bh4aF (5′-CTCCAAGATGACCCTGCATT-3′) and Bh4eR (5′-AGGCCACCACACTAATCGAC-3′). This region includes the entire coding and noncoding sequences of Bh4 from ATG to TGA. The PCR and internal sequencing primers were designed based on W1943 genome sequences. The Bh4 region was amplified with Takara LA Taq DNA polymerase with GC buffer I (Takara). The PCR products were purified with a purification kit from Tiangen. All PCR products were sequenced directly on both strands with the primers Bh4a, Bh4b, Bh4c, Bh4d, and Bh4e (Supplemental Table S1).

Phylogenetic Analyses of DNA Sequences
A phylogenetic tree of the sequenced accessions (excluding heterozygote accessions) was reconstructed by the neighbor-joining (NJ) method (Saitou and Nei, 1987) based on the two-parameter distances of Kimura (1980). MEGA version 4.0 (Tamura et al., 2007) was used to perform the phylogenetic reconstruction. Bootstrap values were estimated (with 1,000 replicates) to assess the relative support for each branch. All positions containing alignment gaps were eliminated in pairwise sequence comparisons in NJ analyses. The NJ tree is shown rooted by the midpoint to improve clarity.
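For reference, the Kimura two-parameter distance between two aligned sequences is d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences. The Python sketch below is a minimal illustration of that formula with pairwise removal of gapped positions; it is not the MEGA implementation, and the function name k2p_distance and the toy sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}


def k2p_distance(seq1, seq2):
    """Kimura (1980) two-parameter distance between two aligned sequences.

    Positions where either sequence has a gap or ambiguous base are skipped.
    """
    pairs = [(a, b) for a, b in zip(seq1, seq2) if a in "ACGT" and b in "ACGT"]
    n = len(pairs)
    # Transition: purine<->purine or pyrimidine<->pyrimidine change.
    transitions = sum(
        1 for a, b in pairs if a != b and (a in PURINES) == (b in PURINES)
    )
    # Transversion: purine<->pyrimidine change.
    transversions = sum(
        1 for a, b in pairs if a != b and (a in PURINES) != (b in PURINES)
    )
    p, q = transitions / n, transversions / n
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example: one A<->G transition in 10 compared sites (P = 0.1, Q = 0).
d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
print(round(d, 4))  # 0.1116
```

The distance is undefined when the sequences are so divergent that the logarithm's argument is not positive; the sketch does not handle that case.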

Neutrality Tests
Sequence alignments were performed using ClustalX (Thompson et al., 1997) and were refined manually. Genetic variation was estimated as the average pairwise difference per base pair between sequences (π; Nei and Li, 1979) and as Watterson's estimate (θ; Watterson, 1975) based on the number of segregating sites, using DnaSP version 4.10 (Rozas et al., 2003). Under the standard neutral model for an autosomal gene from a random-mating population of constant size, both π and θ are expected to equal 4Neμ, where Ne represents the effective population size and μ represents the mutation rate per generation per site (Watterson, 1975).
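The two diversity estimators can be illustrated with a minimal Python sketch on an invented, gap-free toy alignment. This is not how DnaSP is implemented (for example, it does not restrict the calculation to silent sites or handle gaps); the function names and the toy data are hypothetical.

```python
from itertools import combinations


def segregating_sites(seqs):
    """Number of alignment columns with more than one distinct base."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def nucleotide_diversity(seqs):
    """pi: average pairwise differences per site (Nei and Li, 1979)."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / len(pairs) / len(seqs[0])


def watterson_theta(seqs):
    """theta per site from the number of segregating sites (Watterson, 1975)."""
    n = len(seqs)
    a1 = sum(1 / i for i in range(1, n))  # harmonic number for n sequences
    return segregating_sites(seqs) / a1 / len(seqs[0])


# Invented toy alignment: four sequences of 10 sites, two segregating sites.
toy = ["AAAAAAAAAA", "AAAAAAAAAT", "AAAAAAAAAT", "CAAAAAAAAA"]
print(segregating_sites(toy))               # 2
print(round(nucleotide_diversity(toy), 4))  # 0.1167
print(round(watterson_theta(toy), 4))       # 0.1091
```

Under neutrality the two values estimate the same quantity, so a large gap between them is one signal that a locus or population departs from the standard neutral model.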
Selection on Bh4 was tested using two methods. Departure from neutrality was tested at segregating nucleotide sites with Tajima's D using the program DnaSP (Rozas et al., 2003). We also performed the MLHKA test (Wright and Charlesworth, 2004). This maximum likelihood ratio test assesses departure from neutrality at a focal locus compared with neutral standards. Seven genes (Adh1, GBSSII, Ks1, Lhs1, Os0053, SSII1, and TFIIA; Zhu et al., 2007) were used as reference sequences, and O. barthii was used as the outgroup for the MLHKA test. The test was performed using the program MLHKA provided by S.I. Wright (http://labs.eeb.utoronto.ca/wright/Stephen_I._Wright/MLHKA/). The MLHKA tests were run with a Markov chain length of 100,000 and different random number seeds for each gene in each accession, and at least three independent runs per model were performed to assess the maximum likelihood values.
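Tajima's D compares the mean number of pairwise differences with the number of segregating sites, scaled by its expected standard deviation (Tajima, 1989). The Python sketch below is a minimal illustration on an invented, gap-free toy alignment, not the DnaSP implementation; the function name tajimas_d is hypothetical. It returns None when there are no segregating sites, in which case D is undefined.

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Tajima's D for a gap-free alignment; None if no site is segregating."""
    n = len(seqs)
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s == 0:
        return None
    pairs = list(combinations(seqs, 2))
    # k: mean number of pairwise differences (per sequence, not per site).
    k = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (k - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Invented toy alignment: four sequences of 10 sites, two segregating sites.
toy_alignment = ["AAAAAAAAAA", "AAAAAAAAAT", "AAAAAAAAAT", "CAAAAAAAAA"]
print(round(tajimas_d(toy_alignment), 3))  # 0.592
```

With very few segregating sites, as in the cultivated taxa here, the statistic has little power, which is in line with the nonsignificant Tajima's D values reported alongside the significant MLHKA results.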

Supplemental Data
The following materials are available in the online version of this article.
Supplemental Figure S1. The size of the W1943 segment in NIL8.
Supplemental Figure S2. Plant and panicle architecture of NIL8 and Guangluai 4.
Supplemental Figure S3. Some of the F2 individuals used for Bh4 primary mapping.
Supplemental Figure S4. Multiple sequence alignment of the rice BH4 protein and the amino acid transporter proteins from Arabidopsis and tomato (Solanum lycopersicum).
Supplemental Table S1. Primers used in this study.
Supplemental Table S2. Concentration of free amino acids in seed hull.