Abstract

The program “Nb_HetEx” estimates the effective number of breeders (Nb) that produced the sampled progeny based on genotype counts contained in that sample. When the number of breeders is very small, there is an excess of heterozygotes in their progeny: the smaller the number of breeders, the larger the heterozygote excess. The Nb_HetEx program also estimates Ne through the temporal method.

Effective population size per generation (Ne) and effective number of breeders (Nb) are important quantities in population and conservation genetics, and are especially significant in monitoring endangered populations and species. However, they are difficult to estimate in natural populations, for which demographic characteristics are often unknown. To date, 4 indirect methods have been suggested for estimating contemporary Ne or Nb from genetic data: the temporal (Waples 1989), linkage disequilibrium (Waples and Do 2008), allele rarefaction (Hedgecock et al. 2007), and heterozygote-excess (Pudovkin et al. 1996) methods. All these methods have rather low precision, but they are often the only means available for inferring Ne or Nb. As they have different advantages and disadvantages, it seems advisable to use them in conjunction and to integrate the estimates thereby obtained.

The heterozygote-excess method is based on the observation that when the number of breeders is very small, an excess of heterozygotes is expected in the progeny (Robertson 1965; Rasmussen 1979; Falconer 1989). This method has enough power and precision to detect a finite Nb when the number of breeders is 30 or less, the progeny sample numbers 200 or more, and the cumulative number of independent alleles exceeds 80. The method is applicable to situations in which the progeny generation is produced by only a small number of parents (as might be the case in highly fecund marine species like mussels, oysters, or some fish), or when trying to infer the parental numbers of a shoal of fish or of a cohort of recruits in marine invertebrate settlements. Though the estimator is most applicable to the situation of random union of gametes (e.g., as might occur in some marine invertebrates or fish), it also works for other mating systems (monogamous or polygamous pairings and polygyny) when the effective number of breeders is small (Nb ≤ 20), as shown by Luikart and Cornuet (1999), Balloux (2004), and Pudovkin et al. (in preparation).

Program Description

The program Nb_HetEx estimates the effective number of breeders (Nb) that produced the sampled progeny based on genotype numbers contained in the processed sample: 

Nb = 1/(2D) + 1/[2(D + 1)],
where D is the heterozygote excess (dij) averaged over all the alleles and loci: 
[Equation: definition of D in terms of dij and wj]
Here wj = sqrt(nj) × (kj − 1)/kj, nj is the number of individuals surveyed for the j-th locus, kj is the number of alleles at the j-th locus, and dij is the heterozygote excess for the i-th allele at the j-th locus: dij = (Hobij − Hexpij)/Hexpij. Hobij and Hexpij are the observed and expected heterozygosities, respectively, for the i-th allele at the j-th locus. The loci are assumed to be unlinked. At multiallelic loci, for each allele i we count the number of genotypes that contain 1 i allele and 1 non-i allele, thus reducing the data to the diallelic case. Correspondingly, the expected heterozygosity for the i-th allele is 2 × pij × (1 − pij) × 2nj/(2nj − 1), where pij is the frequency of the i-th allele at the j-th locus in the sample. The rationale for the estimation method is given by Pudovkin et al. (1996). Statistical properties of the estimator are considered in Luikart and Cornuet (1999) and in more detail in Pudovkin et al. (in preparation).
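The per-allele quantities defined above can be sketched for a single locus as follows. This is only an illustration, not the code of Nb_HetEx: the function names are hypothetical, the form Nb = 1/(2D) + 1/[2(D + 1)] is assumed from Pudovkin et al. (1996), returning infinity for D ≤ 0 is an illustrative choice, and the weighted averaging of dij over loci is deliberately left out.

```python
from collections import Counter


def allele_excess(genotypes):
    """Return {allele: d_i} for one locus.

    genotypes: list of (allele, allele) pairs, one pair per individual.
    """
    n = len(genotypes)
    counts = Counter(a for pair in genotypes for a in pair)
    excess = {}
    for allele, count in counts.items():
        p = count / (2 * n)
        if p == 1.0:
            continue  # monomorphic locus: expected heterozygosity is zero
        # Observed: fraction of genotypes with exactly one copy of this allele.
        h_obs = sum((a == allele) != (b == allele) for a, b in genotypes) / n
        # Expected, including the small-sample factor 2n/(2n - 1).
        h_exp = 2 * p * (1 - p) * 2 * n / (2 * n - 1)
        excess[allele] = (h_obs - h_exp) / h_exp
    return excess


def nb_from_d(d_mean):
    """Nb from the averaged excess D (assumed form: 1/(2D) + 1/(2(D + 1)))."""
    if d_mean <= 0:
        # Assumption for this sketch: no excess is treated as "Nb not finite".
        return float("inf")
    return 1 / (2 * d_mean) + 1 / (2 * (d_mean + 1))
```

For instance, under this assumed form an averaged excess of D = 0.05 corresponds to Nb of about 10.5, whereas D = 0.01 corresponds to Nb of about 50.5, which illustrates why the method loses resolution as the number of breeders grows.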

The Nb_HetEx program also estimates Ne through the temporal method (the F-estimator module). We implemented the “Pollak estimator” version of this method as given by Waples (1989). For this module, 2 temporal samples of genotypes and the number of generations between the samples must be provided.

The program also generates a bootstrap estimate of the probability (P) of obtaining a sample whose heterozygote excess deviates from Hardy–Weinberg expectation as much as, or more than, that of the original sample. A somewhat different procedure generates bootstrapped confidence intervals (CIs) for the observed D and Nb(D) for P = 60%, 75%, 90%, 95%, and 99%. Bootstrapping is done by resampling individuals from the original sample into replicate samples. Individuals are drawn from the original sample with replacement, and each replicate sample has the same size as the original sample. After a replicate is formed, all the parameters are calculated as for the original sample. The limits of the 95% CI are the 2.5% and 97.5% percentiles of the sorted set of replicate D and Nb values. The other CIs (60%, 75%, etc.) are obtained similarly.
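The resampling scheme described above (individuals drawn with replacement, replicates of the original size, percentile limits) can be sketched generically. The function below is an illustration under stated assumptions, not the program's code: the name `bootstrap_ci`, the default of 1000 replicates, and the nearest-rank percentile rule are all choices made for this example, and `statistic` stands for any function that maps a list of individuals to a number such as D.

```python
import random


def bootstrap_ci(individuals, statistic, n_reps=1000, level=0.95, seed=None):
    """Percentile bootstrap CI for statistic(individuals)."""
    rng = random.Random(seed)
    n = len(individuals)
    values = []
    for _ in range(n_reps):
        # Draw n individuals with replacement to form one replicate sample.
        replicate = [individuals[rng.randrange(n)] for _ in range(n)]
        values.append(statistic(replicate))
    values.sort()
    tail = (1 - level) / 2
    # Nearest-rank percentiles of the sorted replicate values (an assumption).
    lower = values[round(tail * (n_reps - 1))]
    upper = values[round((1 - tail) * (n_reps - 1))]
    return lower, upper
```

Calling the function with `level=0.60`, `0.75`, `0.90`, or `0.99` gives the other intervals in the same way; resampling whole individuals keeps each multilocus genotype intact within a replicate.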

Although the coverage probability of the bootstrap-generated CIs has not been assessed, the authors believed it useful to provide users with this widely used option. If the parametric CIs agree well with the bootstrap-generated ones, there is more confidence in the estimates; considerable disagreement indicates that the results should be treated with more caution.

Input and Output

The data to be processed should be in a text file. They should either conform to the GenePop (Raymond and Rousset 1995) format (a line with multilocus genotypes for each individual) or be grouped as a genotype frequency summary. Examples of both formats are given in the Help file. The sample data, which consist of genotype frequencies at 4 microsatellite loci in samples of adult and juvenile European flat oysters, are taken from the study reported by Hedgecock et al. (2007). It is possible to process samples from several populations in a single run of the program.
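For readers unfamiliar with the GenePop layout, a minimal reader is sketched below. It assumes the usual conventions of that format: a title line, locus names (one per line or comma-separated), a `Pop` line before each sample, and one line per individual of the form `name , 0101 0203` with 2- or 3-digit allele codes and 0 for missing data. The function name is hypothetical, and the exact variant accepted by Nb_HetEx is the one documented in its Help file, not necessarily what this sketch accepts.

```python
def parse_genepop(text):
    """Return (title, locus_names, populations) from GenePop-style text.

    Each population is a list of (name, genotypes); a genotype is a pair of
    integer allele codes, or None when either allele is coded as missing.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    title, body = lines[0], lines[1:]
    loci, pops = [], []
    for ln in body:
        if ln.lower() == "pop":
            pops.append([])  # start a new population sample
        elif not pops:
            # Before the first "Pop": locus names.
            loci.extend(s.strip() for s in ln.split(",") if s.strip())
        else:
            name, _, geno = ln.partition(",")
            calls = []
            for code in geno.split():
                half = len(code) // 2
                a, b = int(code[:half]), int(code[half:])
                calls.append(None if a == 0 or b == 0 else (a, b))
            pops[-1].append((name.strip(), calls))
    return title, loci, pops
```

Because each `Pop` line opens a new list, a file with several samples yields one population per sample, which matches the possibility of processing several populations in a single run.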

The output (result) file contains allele frequencies for the processed sample, expected and observed heterozygosities at each locus, and 50% and 95% asymptotic CIs. Estimates of Nb(D) averaged over all the loci and their CIs (for different P) are given, as well as the asymptotic standard error (SE) of D: SE(D)² = [ΣΣ wj × dij²/Σ wj − (ΣΣ wj × dij/Σ wj)²] × (Σ wj)²/[(Σ wj)² − Σ wj²]/Σ(kj − 1), where wj = sqrt(nj) × (kj − 1)/kj. Bootstrap-generated CIs and the median Nb are given below the parametric estimates. As stated above, the bootstrap-generated estimates are considered only as complementary to the parametric ones.

At the bottom of the file a summary of the results is given, which is especially convenient if several samples were processed at once.

The file generated by the F-estimator module also contains allele frequencies in the 2 temporal samples, expected and observed heterozygosities, locus-by-locus estimates of fj, Ne(F), and 50% and 95% CIs. For the details of the temporal method (obtaining the values of fj, averaging fj over the loci to obtain F, etc.), consult the paper by Waples (1989).

Availability and Usage

The program runs under Windows XP and requires no special installation. The executable and auxiliary files should be placed in a separate folder on the hard disk. Note: this folder should not be on a shared disk, because the Help file (fe.chm) would then not open properly. Clicking the executable file (fe.exe) opens a panel containing buttons for navigating and operating the program. A Help file available from this panel gives detailed instructions on the use of the program. The program is available either from ftp://ftp.dvo.ru/pub/Personal/NB-Estimator or on request by e-mail to axanka@iacp.dvo.ru.

Funding

Supported in part by the Russian Science Support Foundation.

Acknowledgments

We would like to thank Dennis Hedgecock, Robin Waples, and Dmitri Zaykin for helpful comments.

References

Balloux F. Heterozygote excess in small populations and the heterozygote-excess effective size, Evolution, 2004, vol. 58 (pg. 1891-1900)
Falconer DC. Introduction to quantitative genetics, 1989 3rd ed. New York: Longman Scientific & Technical with Wiley & Sons, Inc
Hedgecock D, Launey S, Pudovkin AI, Naciri Y, Lapègue S, Bonhomme F. Small effective number of parents (Nb) inferred from a naturally spawned cohort of juvenile European flat oyster Ostrea edulis, Mar Biol., 2007, vol. 150 (pg. 1173-1282)
Luikart G, Cornuet J. Estimating the effective number of breeders from heterozygote excess in progeny, Genetics, 1999, vol. 151 (pg. 1211-1216)
Pudovkin AI, Zaykin D, Hedgecock D. On the potential for estimating effective number of breeders from heterozygote-excess in progeny, Genetics, 1996, vol. 144 (pg. 383-387)
Rasmussen DI. Sibling clusters and gene frequencies, Am Nat., 1979, vol. 113 (pg. 948-951)
Robertson A. The interpretation of genotypic ratios in domestic animal populations, Anim Prod., 1965, vol. 7 (pg. 319-324)
Raymond M, Rousset F. GENEPOP (version 1.2): population genetics software for exact tests and ecumenism, J Hered., 1995, vol. 86 (pg. 248-249)
Waples RS. A generalized approach for estimating effective population size from temporal changes in allele frequency, Genetics, 1989, vol. 121 (pg. 379-391)
Waples RS, Do S. LDNE: a program for estimating effective population size from data on linkage disequilibrium, Mol Ecol Res., 2008, vol. 8 (pg. 753-756)

Author notes

Corresponding Editor: Robin Waples