• Background Perusing the literature on nuclear ‘genome size’ shows that the term is not stabilized, but applied with different meanings. It is used for the DNA content of the complete chromosome complement (with chromosome number n), for which others use ‘C-value’, but also for the DNA content of the monoploid chromosome set only (with chromosome number x). Reconsideration of the terminology is required.
• Aim Our purpose is to discuss the currently unstable usage of the terms ‘genome size’ and ‘C-value’, and to propose a new unified terminology which can describe nuclear DNA contents with ease and without ambiguity.
• Proposals We argue that there is a need to maintain the term genome size in a broad sense as a covering term, because it is widely understood, short and phonetically pleasing. Proposals are made for a unified and consensual terminology. In this, ‘genome size’ should mean the DNA content based on chromosome number x or n, and should be used mainly in a general sense. The necessary distinction between the kinds of genome size is made by the adjective ‘monoploid’ and the neologism ‘holoploid’. ‘Holoploid genome size’ is a shortcut for the DNA content of the whole chromosome complement characteristic of the individual (and by generalization of the population, species, etc.) irrespective of the degree of generative polyploidy, aneuploidies, etc. This term was lacking in the terminology and is indispensable for reasons of linguistic consistency. The abbreviated terms for monoploid and holoploid genome size are, respectively, Cx-value and C-value. Quantitative data on genome size should always indicate the C-level by a numerical prefix, such as 1C, 1Cx, 2C, etc. The proposed conventions cover general fundamental aspects relating to genome size in plants and animals, but do not treat in detail cytogenetic particularities (e.g. haploids, hybrids, etc.), which will need minor extensions of the present scheme in a future paper.
Discussions at the 2003 Plant Genome Size Workshop, held at the Royal Botanic Gardens, Kew, included a review of the modern usage of several terms commonly used to describe nuclear DNA contents. The expression ‘genome size’ is often used for the DNA content of the monoploid genome or chromosome set, whereas ‘DNA C-value’ stands for the DNA content of the whole chromosome complement or karyotype irrespective of the degree of generative polyploidy of the organism. For example, Bennett et al. (1998) and Johnston et al. (2005) espoused this traditional usage. However, ‘genome size’ and ‘DNA C-value’ are often also used synonymously. Obermayer and Greilhuber (1999) and Leitch et al. (2005) are examples of this second usage. The restricted traditional use of ‘genome size’ (Bennett et al., 1998), if followed consistently, would largely eliminate from the discourse this established term, which is convenient, comprehensible and phonetically pleasing. In many cases, e.g. when the degree of generative polyploidy of a plant is unknown, a genome size in the restricted sense could not be given (Bennett et al., 1998). Moreover, comparative genomics recently confirmed that possibly all plants, and probably most organisms, have experienced one or more polyploidization events in their ancestry (Wendel, 2000). If so, any narrow insistence now regarding the term ‘genome size’ would be altogether unfounded. Thus, a reconsideration of the terminology is clearly required (N.B. a glossary of terms used in this paper is given in Appendix 1). The purpose of this paper is to discuss the currently unstable usage of the terms ‘genome size’ and ‘C-value’, and to propose a new unified terminology that can describe nuclear DNA contents with ease, but without ambiguity.
GENOME AND GENOME SIZE
The term ‘genome’ was coined by Winkler (1920, p. 165; see Appendix 2). From a literal interpretation of his writing, we conclude that Winkler intended that polyploid organisms have more than one genome. Winkler's definition of ‘genome’ has been formulated more tersely by Rieger et al. (1991): ‘in eukaryotes, the basic (monoploid) chromosome set, consisting of a species specific number of linkage groups and the genes contained therein’. So, seen from the perspective of historical priority of the term ‘genome’ and its meaning, Bennett et al. (1998) were correct in using the term ‘genome size’ (first used by Hinegardner, 1976; see below) for the DNA content of the monoploid chromosome set only. However, everyday usage of the term ‘genome’ is no longer restricted to the narrow definitions that Winkler (1920), Rieger et al. (1991) and Bennett et al. (1998) indicated. Today when we speak of the ‘wheat genome’, we may think not only of one of its monoploid genomes A, B or D, but rather of the whole complement of the 2n = 42 or n = 21 chromosomes of Triticum aestivum. Similarly, when speaking about the ‘Plant Genome Size Workshop 2003’, we would not imagine it concerned only monoploid genomes. These examples alone show that ‘genome’ and ‘genome size’ can be used in either a more inclusive or a less inclusive sense. Indeed, a genome can be generatively polyploid or monoploid, reduced or non-reduced, replicated or non-replicated, but in each case the same term ‘genome’ remains appropriate. In scientific terminology, priority is not a sacred cow. Rather, convenience and consensus determine which meanings persist over time, and how the usage of terms evolves.
The ambiguity of the term ‘genome size’ becomes even more apparent when its historical roots are examined. It was apparently used first by Hinegardner (1976) in the title of his paper ‘Evolution of genome size’, where it was probably intended to denote the mass or quantity of DNA in a non-replicated haploid genome (e.g. in fish sperm nuclei). Yet throughout the text ‘DNA content’ was used instead of ‘genome size’ and no explicit definition was given for the latter term. Thus, we note that ‘genome size’ was used by Hinegardner (1976) without an explicit connotation of monoploidy. Cavalier-Smith (1985, p. 1), who refers to Hinegardner (1976), treated genome size and C-value as synonyms, and so did Singh in his textbook (2003, p. 44). Gregory and Hebert (1999) interpreted ‘basal genome size’ and ‘C-value’ of an organism as equivalent and defined these as ‘the content of DNA (measured by weight or number of base pairs) in a single copy of the entire sequence of DNA found within a nucleus of that organism’. This definition changes the meaning of ‘genome’ to the chromosome complement with the number n (see also Gregory et al., 2000; Gregory, 2002). However, note that the expression ‘basal genome size’ could lead to confusion with the DNA content of the genome with the ‘chromosome base number’ x. Ambiguous use of the term ‘genome’ (relating to the meiotically reduced chromosome number n or monoploid chromosome base number x) is another source of potential error and misunderstanding.
Swift (1950a) introduced the ‘C’ terminology (using ‘class C’, ‘C value’ and ‘C amount’), but it is not clear from the text which word at the time ‘C’ actually was intended to symbolize or abbreviate—class, category, content or constant. Swift's (1950b; apparently as manuscript antecedent to 1950a) ‘class I’, being usually the most frequent type in a tissue, corresponds to 2C, ‘class II’ to 4C, but sperm nuclei are not assigned to a class. The term ‘C-value’ was stated to mean ‘constant’ by Bennett and Smith (1976) based on a personal communication by H. Swift to M. D. Bennett, and was defined as ‘DNA content of the unreplicated haploid chromosome complement’ (Bennett and Smith, 1976). Later, Marie and Brown (1993) ascribed to Bennett and Smith (1976) a change in the meaning of the symbol ‘C’ to ‘complement quantity’, but this modification is not found in that paper. Recently, Gregory (2002) seemed to be certain that ‘C’ was derived from ‘class’ and not from ‘constant’ or ‘characteristic’, but this was only concluded indirectly from the text of Swift (1950a). Here we are not dependent on recent opinion, as in 1975 Swift (pers. comm.) stated unambiguously that ‘the letter C stood for nothing more glamorous than constant, i.e. the amount of DNA that was characteristic of a particular genotype’ (see Bennett and Leitch, 2005).
In contrast to ‘genome size’, the ‘C’ symbol (or ‘C-value’) is not self-explanatory. It is an abbreviation that is understood to refer to a nuclear DNA amount only in a narrow scientific community (and not always perfectly correctly even there). Indeed, it may cause confusion in other biological disciplines, where it may not even refer to genetics. Thus, elsewhere ‘C’ may be used as a symbol for cytidine, colchicine, arm-binding frequency in meiosis, carbon, coulomb, speed of light, and degrees Celsius.
Difficulties arising from vague C-value and genome size terminology are best demonstrated for odd-numbered polyploid plant species. Brandizzi and Caiola (1998), who estimated the 2C nuclear DNA content of saffron (Crocus sativus, 2n = 3x = 24), calculated its genome size as one third of the 2C nuclear DNA content. Although they determined an average genome size of one basic chromosome set (x), they designated the results as ‘C’. Lysák et al. (1999) determined 2C DNA contents in a series of diploid and triploid (2n = 33) Musa species and clones. In an attempt to compare genome sizes of basic chromosome complements of triploids with those of diploids, they divided the 2C values of triploids by three and designated the results as 1C-values with the remark ‘one copy of nuclear genome’. However, Arumuganathan and Earle (1991) divided the 2C DNA content determined in a triploid Musa plant by two and presented the result as the DNA amount of the unreplicated haploid genome (1C). Although each paper derives C-values, the three attach different meanings to the term.
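The size of the discrepancy can be illustrated with a minimal sketch. The 2C value used here is hypothetical and is not taken from any of the cited papers, the function names are illustrative only, and the labels 1C and 1Cx follow the definitions proposed later in this paper.

```python
def one_c(two_c_pg: float) -> float:
    """Holoploid 1C-value: half of the non-replicated, non-reduced (2C) content."""
    return two_c_pg / 2


def one_cx(two_c_pg: float, ploidy: int) -> float:
    """Monoploid 1Cx-value: the 2C content divided by the ploidy level p (2n = p x)."""
    if ploidy < 1:
        raise ValueError("ploidy must be a positive integer")
    return two_c_pg / ploidy


# Hypothetical triploid (2n = 3x) with an assumed 2C value of 1.8 pg.
two_c = 1.8
by_two = one_c(two_c)        # division by two, as in Arumuganathan and Earle (1991)
by_three = one_cx(two_c, 3)  # division by three, as in Brandizzi and Caiola (1998)

print(f"2C / 2 = {by_two:.2f} pg (1C)")
print(f"2C / 3 = {by_three:.2f} pg (1Cx)")
```

For a triploid the two results differ by a factor of 1.5, so an unlabelled ‘C-value’ cannot be interpreted without knowing which division was applied.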
Several proposals are made below, which are intended to maintain the relaxed usage of the term ‘genome size’, which is practical in combinations such as ‘genome size variation’, ‘genome size research’, ‘genome size measurements’, ‘genome size workshop’ and ‘genome size database’, and to establish a clear terminology for quantitative data on genome size which leave no room for ambiguities. However, before doing so it is necessary to outline briefly the complexity of the matter we are dealing with.
DNA COPY NUMBER STATUS AND THE C-TERMINOLOGY
Basically four different kinds of nuclear DNA copy number status can be distinguished:
Replication-division levels of the mitotic nuclear cycle, related to its G1, S and G2 phase.
Alternation of nuclear phases (not to be confused with alternation of generations) associated with meiotic reduction and fertilization (in angiosperms including double fertilization followed by nuclear phase change in endosperm).
Generative ploidy levels, which means the presence of one, two or more monoploid genomes (each one with the chromosome number x; or x1, x2, etc., if the respective chromosome base numbers differ) in the reduced (haplophasic) genome (with chromosome number n), which may characterize single individuals, populations or whole taxa.
Somatic polyploidy, caused by endocycles (or more rarely mitotic disturbances) in somatic tissues.
The nuclear replication status (G1 for non-replicated, S for replicating, G2 for replicated) leads to DNA content changes expressed in terms of ‘C’. For instance, 1C can be the DNA content of a young pollen cell nucleus just after meiosis, 1·5 C the content of a generative cell nucleus in S-phase, 2C the content of a telophase root tip nucleus, etc. However, 1C can also be the DNA content of a diploid telophase nucleus divided by 2.
The nuclear phase status is by convention indicated using the letter ‘n’. The designations ‘(meiotically) reduced’ and ‘non-reduced’, or ‘haplophasic’ and ‘diplophasic’, are preferable to ‘haploid’ and ‘diploid’, respectively, because their meaning is unambiguous. n indicates the meiotically reduced chromosome number, 2n the non-reduced number, and 3n, 5n, etc. the endospermic chromosome number (in angiosperms only). The DNA content is also indicated on the basis of C, 1C usually being the lowest level recognized. For example, the triploid primary endosperm nucleus (3n = 63) of Triticum aestivum has in prophase a DNA content of 6C, while the pentaploid primary endosperm nucleus of Gagea lutea (5n = 180) has in prophase a DNA content of 10C.
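The relation between nuclear phase, replication status and C-level described above can be sketched as follows. This is an illustrative helper only: the function name and parameters are assumptions, and nuclei in S phase or with endocycles are deliberately left out.

```python
def c_level(phase_multiple: int, replicated: bool) -> int:
    """Return the C-level of a nucleus that has not undergone endocycles.

    phase_multiple: chromosome number as a multiple of n
        (1 = reduced, 2 = non-reduced, 3 or 5 = endospermic).
    replicated: False for non-replicated (G1, telophase) nuclei,
        True for replicated (G2, prophase) nuclei.
    """
    if phase_multiple < 1:
        raise ValueError("phase_multiple must be at least 1")
    return phase_multiple * (2 if replicated else 1)


examples = {
    "young pollen nucleus after meiosis (n, non-replicated)": c_level(1, False),
    "root tip telophase nucleus (2n, non-replicated)": c_level(2, False),
    "Triticum aestivum primary endosperm in prophase (3n)": c_level(3, True),
    "Gagea lutea primary endosperm in prophase (5n)": c_level(5, True),
}
for nucleus, level in examples.items():
    print(f"{nucleus}: {level}C")
```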
The degree of generative polyploidy is indicated using the letter ‘x’. For instance, a diploid embryophyte taxon has as sporophyte 2n = 2x, a tetraploid 2n = 4x, a pentaploid 2n = 5x. Greilhuber (1979, p. 273) indicated the DNA content of the non-replicated monoploid genome (‘basic DNA content’) by 1Cx (originally with the x as subscript) and used it as an abbreviation for ‘1C value (x-level)’. A tetraploid sporophyte (2n = 4x) with a certain 2C-value has a (mean) 1Cx-value that is a quarter of the 2C-value. For perfect semiotic consistency, it would be tempting to replace ‘C’ by ‘Cn’ (n indicating the chromosome number n), but C both has historical priority (Swift, 1950a) and is firmly established, so that the use of ‘Cn’ (although correct) is neither necessary nor recommended.
In the case of somatic polyploidization, the degree of polyploidy and the DNA amount of a nucleus can be given quantitatively based on ‘C’. For example, 1C is the DNA amount of the reduced non-replicated nucleus with n chromosomes, as could be found in the embryo sac in an antipodal nucleus, which can later become endopolyploid. Or, a DNA content of 12C would be found in an already endopolyploid triplophasic endosperm nucleus after two rounds of replication without mitotic division. The degree of somatic polyploidy should be given as C-values and not as ploidy levels (n or x indicate chromosome numbers and do not indicate whether the chromosomes are replicated or non-replicated). For example, an endopolyploid root cell nucleus in Arabidopsis thaliana (1C = 0·16 pg; Bennett et al., 2003) with a DNA content (not genome size!) of 1·28 pg is at the 8C level. Here, tetraploid (4n) or octoploid (8n) would be inappropriate designations, because these terms relate to chromosome number and not to replication status.
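The arithmetic behind the 1Cx-value can be written out as a short worked equation. The symbol p for the generative ploidy level of the sporophyte (2n = p x) is introduced here only for illustration and is not part of the proposed terminology.

```latex
% p: generative ploidy level of the sporophyte, i.e. 2n = p x (illustrative symbol)
\begin{align*}
  \mathrm{1C}  &= \frac{\mathrm{2C}}{2}, &
  \mathrm{1Cx} &= \frac{\mathrm{2C}}{p} = \frac{2}{p}\,\mathrm{1C}, \\[4pt]
  p = 2:\quad \mathrm{1Cx} &= \mathrm{1C}, &
  p = 4:\quad \mathrm{1Cx} &= \frac{\mathrm{2C}}{4} = \frac{\mathrm{1C}}{2}.
\end{align*}
```

For allopolyploids the result is a mean over the constituent monoploid genomes, which need not be of equal size.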
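A measured DNA content can be converted to a C-level with a few lines of code. The sketch below uses the Arabidopsis thaliana values quoted above; the function names are hypothetical, and rounding to the nearest integer assumes that the nucleus is not in S phase.

```python
import math


def c_level_from_content(dna_content_pg: float, one_c_pg: float) -> int:
    """Express a measured nuclear DNA content as an integer multiple of the 1C-value."""
    if one_c_pg <= 0:
        raise ValueError("the 1C-value must be positive")
    return round(dna_content_pg / one_c_pg)


def replication_rounds(c_level: int, start_level: int) -> int:
    """Rounds of replication without division needed to go from start_level to c_level."""
    rounds = math.log2(c_level / start_level)
    if rounds < 0 or not rounds.is_integer():
        raise ValueError("c_level must be start_level times a power of two")
    return int(rounds)


# Arabidopsis thaliana root cell nucleus: 1C = 0.16 pg, measured content 1.28 pg.
level = c_level_from_content(1.28, 0.16)
print(f"{level}C")

# Triplophasic endosperm nucleus: from 3C to 12C.
print(replication_rounds(12, 3))
```

Reporting the result as a C-level keeps it independent of chromosome number, which is the reason given above for preferring ‘8C’ to ‘octoploid’.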
PROPOSED GENERAL SOLUTIONS
We have shown that restricting the definition of the term ‘genome size’ to the DNA content of the monoploid genome only (although justifiable by strict historical priority) is now not very practical. Rather, it seems linguistically and scientifically acceptable to use the terms ‘genome’ and ‘genome size’ in the wider sense as covering terms, including both the whole chromosome complement (with chromosome number n) and its DNA content, and the monoploid genome (with chromosome number x) and its DNA content in polyploids. So, ‘genome’ and ‘genome size’ can be used ad libitum in the inclusive and in the restricted sense, i.e. they can be related to n or x chromosomes. This usage will be helpfully applicable in titles, introductory and concluding phrases, etc., but will need further specification in a scientific text.
To refer to the whole chromosome complement with chromosome number n irrespective of the degree of generative polyploidy, we propose the term ‘holoploid genome’ (from Greek holos, complete). Moreover, to indicate its size we suggest the full term ‘holoploid genome size’ and the abbreviating term ‘C-value’. ‘Holoploid genome size’ is required as a shortcut for ‘DNA content of the whole complement of chromosomes characteristic for the organism, irrespective of the degree of generative polyploidy, aneuploidies, etc.’. The term ‘holoploid genome size’ is indispensable, because ‘genome size’ (broad sense) alone is ambiguous, and the attributes monoploid and polyploid are mutually exclusive. An organism always has a holoploid genome, but it does not always have a polyploid or a monoploid genome.
For the (averaged) DNA content of the monoploid genome(s) in polyploids and non-polyploids the full term ‘monoploid genome size’ and the abbreviating term ‘Cx-value’ are suggested. (N.B. The latter was first used by Greilhuber, 1979.) In generatively non-polyploid organisms C-value and Cx-value are congruent.
In connection with quantitative data these terms should always be given with the prefix-number indicating the DNA level: 1C, 2C, 4C, 1Cx, 2Cx, etc.
Unless stated otherwise in a text, the 1C-value should be understood in the original sense (Bennett and Smith, 1976) as the DNA content of the unreplicated reduced chromosome complement. It follows arithmetically that it must be equal to half of the unreplicated non-reduced (zygotic, diplophasic) complement, which has a 2C content. This applies to both even- and odd-numbered polyploids. Thus, the 1C-value of a triploid sporophyte (2n = 3x) is half of the 2C-value as measured, for instance, in a root tip telophase nucleus of that plant. This also makes biological sense insofar as the arithmetical mean of the four meiotic products conforms to this value.
Future use of the terms to describe DNA amounts would be rapidly stabilized if the editors of scientific journals and referees could ensure that these simple rules are strictly observed. Thus, phrases such as ‘the haploid genome size of A. thaliana is …’ or ‘the haploid genome size of Capsella bursa-pastoris is …’ are prone to cause misunderstanding. Instead, the correct phrases would be ‘the 1C-value of A. thaliana is …’, ‘the 1Cx-value of Capsella bursa-pastoris is …’, or the ‘1C-value of Capsella bursa-pastoris is …’. (Note: A. thaliana has 2n = 2x = 10, C. bursa-pastoris has 2n = 4x = 32.)
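One way to apply these rules when tabulating data is to derive every reported value from the measured 2C content and the ploidy level, and to attach the C-level prefix automatically. The following is only a sketch: the class and field names are hypothetical, the A. thaliana 2C value is twice the 1C-value of 0.16 pg quoted earlier, and the Capsella bursa-pastoris 2C value is an invented placeholder, not a measurement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomeSizeRecord:
    taxon: str
    two_c_pg: float  # 2C DNA content in picograms
    ploidy: int      # p in 2n = p x

    @property
    def one_c_pg(self) -> float:
        """Holoploid genome size (1C-value)."""
        return self.two_c_pg / 2

    @property
    def one_cx_pg(self) -> float:
        """Mean monoploid genome size (1Cx-value)."""
        return self.two_c_pg / self.ploidy

    def report(self) -> str:
        """Return a statement that always names the C-level explicitly."""
        return (f"{self.taxon}: 1C-value = {self.one_c_pg:.2f} pg, "
                f"1Cx-value = {self.one_cx_pg:.2f} pg")


arabidopsis = GenomeSizeRecord("A. thaliana", two_c_pg=0.32, ploidy=2)
# Placeholder 2C value, for illustration only.
capsella = GenomeSizeRecord("C. bursa-pastoris", two_c_pg=0.80, ploidy=4)

for record in (arabidopsis, capsella):
    print(record.report())
```

In the diploid the two values coincide, whereas in the tetraploid the 1C-value is twice the 1Cx-value, which is why ‘haploid genome size’ alone is ambiguous for C. bursa-pastoris.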
Table 1 gives a summary of these basic proposals and shows the formal consistency of the new terms with current usage of existing terms. The conventions suggested here cover general fundamental aspects relating to genome size in plants and animals, but do not treat in detail cytogenetic particularities, such as haploidy in sporophytes, hybridization at the same and different levels of polyploidy, individual Cx-values of allopolyploids and hybrids, anorthoploidy and permanent anorthoploidy, gametophytes and endosperm under conditions of apomixis, sex chromosomes, B-chromosomes and aneuploids, and chromatin diminution. The formalization of these situations needs minor extensions of the present scheme for using C-terminology, to be treated in a future paper (Greilhuber et al., unpubl.).
| Chromosome number designation | x | n |
| Covering term for genomic DNA content | Genome size | Genome size |
| Kinds of genome size | Monoploid genome size | Holoploid genome size |
| Short terms quantified | 1Cx, 2Cx, etc. | 1C, 2C, etc. |
APPENDIX 1
As already noted in the Introduction, a glossary of terms used in the present text is given in Appendix 1.
The numbered items defined below are given in a logical sequence that keeps related terms together, which an alphabetical order would disrupt. However, a preceding index in alphabetical order, giving the number of each item defined, allows these terms to be located quickly.
Index in alphabetical order of terms defined:
Alternation of generations (13)
Alternation of nuclear phases (10)
Chromosome complement (1)
Chromosome set (2)
Genome (nuclear) (17)
Genome size (22)
Holoploid genome (19)
Monoploid genome (18)
Nuclear DNA amount (21)
Nuclear DNA content (21)
Polyploid genome (20)
Definitions of terms in a logical order
(See index above for terms in alphabetical order)
1. Chromosome complement (Darlington, 1932): the endowment of an organism with chromosomes as typically found after fertilization (in number 2n) or after meiosis (in number n).
2. Chromosome set (Dyer et al., 1970): the chromosomes of a monoploid genome, their number being indicated by x.
3. x: symbol for the chromosome number of the monoploid genome and for the chromosome base number in a generatively polyploid series of related organisms.
4. n: symbol for the meiotically reduced (haplophasic) chromosome number of any organism, generatively polyploid or not.
5. 2n: symbol for the non-reduced (diplophasic, zygotic) chromosome number.
6. Haploid: (1) the lowest recognized level of generative polyploidy in haplophase, where n = x (e.g. a ‘haploid’ moss); (2) the meiotically reduced (haplophasic) chromosome number n.
7. Diploid: (1) level of generative polyploidy in haplophase, where n = 2x (e.g. a ‘diploid’ moss); (2) the lowest recognized level of generative polyploidy in diplophase, where 2n = 2x (e.g. a ‘diploid’ grass); (3) the non-reduced (zygotic, diplophasic) chromosome number 2n.
8. Polyploid: (1) level of generative polyploidy in haplophase, where n is a multiple of x; (2) level of generative polyploidy in diplophase, where 2n represents multiples of x higher than 2x; (3) shortcut for somatically polyploid, endopolyploid or endoreduplicated.
9. Endopolyploid: status of nuclei that have undergone endocycles of replication.
10. Alternation of nuclear phases: alternation of n and 2n by meiotic reduction and fertilization.
11. Reduced: in nuclear phase with chromosome number n (haplophase).
12. Non-reduced: in nuclear phase with chromosome number 2n (diplophase).
13. Alternation of generations (‘primary a. of g.’): alternation of gametophyte and sporophyte(s), usually but not necessarily connected with alternation of nuclear phases. (Exceptions, e.g. in apomicts, in which non-reduced embryo-sacs alternate with non-reduced sporophytes.)
14. Sporophytic: belonging to the sporophyte in plants, which is in general but not necessarily non-reduced (diplophasic).
15. Gametophytic: belonging to the gametophyte in plants, which is in general, but not necessarily reduced (haplophasic).
16. Endospermic: belonging to the endosperm in angiosperms, which has variable initial chromosome numbers, given as multiples of n, dependent on the embryo-sac type and variations of the fertilization process.
17. Genome (nuclear): covering term including the chromosome complement and its DNA characteristic for an organism and, in polyploid organisms, also a monoploid chromosome set of the complement and its DNA.
18. Monoploid genome: one chromosome set of an organism and its DNA having the chromosome base number x.
19. Holoploid genome: the whole chromosome complement (with chromosome number n) and its DNA characteristic for the organism, irrespective of the degree of generative polyploidy, aneuploidies, etc.
20. Polyploid genome: a generatively polyploid chromosome complement of an organism and its DNA with the chromosome number n being multiples of x, or being derived from such a multiple.
21. Nuclear DNA content or amount: the amount of DNA in any given cell nucleus irrespective of the state of replication, degree of endopolyploidy, etc.
22. Genome size: covering term for the amount of DNA in the holoploid genome of an organism and also in the monoploid constituent genomes in polyploids.
23. C-value: DNA content of a holoploid genome with chromosome number n; abbreviation for holoploid genome size.
24. 1C-value: DNA content of one non-replicated holoploid genome with the chromosome number n. Also the half of a non-replicated holoploid non-reduced genome with the chromosome number 2n.
25. Cx-value: DNA content of a monoploid genome with chromosome base number x; abbreviation for monoploid genome size.
26. 1Cx-value: DNA content of one non-replicated monoploid genome with chromosome number x.
APPENDIX 2
Winkler (1920, p. 165) wrote:
‘Ich schlage vor, für den haploiden Chromosomensatz, der im Verein mit dem zugehörigen Protoplasma die materielle Grundlage der systematischen Einheit darstellt, den Ausdruck: das G e n o m zu verwenden und Kerne, Zellen und Organismen, in denen ein gleichartiges Genom mehr als einmal in jedem Kern vorhanden ist, homogenomatisch zu nennen, solche dagegen, die verschiedenartige Genome im Kern führen, heterogenomatisch …’.
We have made the following translation to English:
‘I suggest to use for the haploid chromosome set, which together with the appertaining cytoplasm constitutes the basis of the taxonomic unit, the term genome, and to name those nuclei, cells and organisms which contain a genome of the same kind more than once per nucleus homogenomatic, and those which contain different genomes in the nucleus heterogenomatic …’.
J.G. thanks the Austrian Science Fund for support by project P14607-B03, and Eva M. Temsch for technical support.
¹Institute of Botany and Botanical Garden of the University of Vienna, Austria, ²Institute of Experimental Botany, Olomouc, Czech Republic and ³Royal Botanic Gardens, Kew, Richmond, Surrey, UK