Ethnically Tibetan women in Nepal with low hemoglobin concentration have better reproductive outcomes

Elevated maternal hemoglobin concentration associated strongly with lower lifetime reproductive success among ethnically Tibetan women at 3000-4100m in Nepal. These findings are consistent with the hypothesis that unelevated hemoglobin concentration is an adaptation shaped by natural selection resulting in low hemoglobin levels among Tibetans compared with visitors and Andean highlanders.


INTRODUCTION
Adaptations to the environment arise from evolution by natural selection on heritable phenotypes. These adaptations can be difficult to distinguish among many phenotypes distinctive of populations living in specific environments. Difficulties include establishing the heritable basis of a trait that may also have developmental and environmental influences and linking phenotypes with reproductive success. Social, economic, and public health features may contribute to variation in reproductive success, thereby making it challenging to isolate contributions by heritable biological traits. Detecting natural selection on heritable phenotypes by relating them to reproductive success is essential for understanding how adaptations become established in human populations.
High-altitude populations inform understandings of the adaptive process because virtually every physiological system responds to the severe and unavoidable stress of low partial pressure of oxygen (hypoxia) [1]. Two well documented and distinctive heritable phenotypes characterize high-altitude Tibetans.
The first trait is the percent of oxygen saturation of hemoglobin. It falls with increasing altitude, although there is a wide range of oxygen saturation at any altitude. The trait has significant heritability (h²) [2, 3] and a major gene for percent of oxygen saturation has been inferred among Tibetans [4]. Candidate gene studies reported associations of saturation with oxygen homeostasis loci [5][6][7][8]. Relatively high saturation may benefit residents because it increases the oxygen content of arterial blood and somewhat lowers the physiological stress of high-altitude hypoxia. Tibetan women estimated to have the genotypes for higher oxygen saturation had more surviving children at 3900-4200 m [4], suggesting that natural selection favors higher oxygen saturation among Tibetans.
The second trait is hemoglobin concentration. The pathways for the well-known increase in hemoglobin concentration within days of acute exposure to high altitudes have been described [9]. However, this response varies significantly among individuals and among populations. Some highlanders, including those from the Andes, show the same high hemoglobin concentration phenotype as acutely exposed lowlanders. In contrast, Tibetan highlanders have relatively low hemoglobin concentration at similar altitudes [10]. Tibetans' lower average hemoglobin concentration relates to higher physical work capacity [11] and probably lowers the risk of thrombosis, chronic mountain sickness or pre-eclampsia.
Polymorphisms in two main genes of the oxygen homeostasis pathway contribute to variation in hemoglobin concentration in Tibetans. The EGLN1 locus codes for an oxygen sensor and the EPAS1 locus codes for the alpha subunit of the hypoxia-inducible factor 2 (HIF2) transcription factor that induces dozens of target genes contributing to the homeostatic responses to hypoxic stress. Genomic studies report signals of natural selection at both EGLN1 and EPAS1. EGLN1 variants are associated with hemoglobin concentration in some studies [12, 13, males only], but not others [14,15]. EPAS1 variants are associated with hemoglobin concentration in four studies of Tibetans from different geographic areas [15][16][17][18]. Furthermore, the EPAS1 alleles associating with lower hemoglobin concentration have elevated frequencies among Tibetans [6][19][20][21][22][23][24][25]. The 'Tibetan' alleles at EPAS1 protect against excessively high hemoglobin concentration, chronic mountain sickness, and low birthweight [26,27]. These findings suggest that Tibetans' relatively low hemoglobin concentrations are heritable adaptations reflecting a distinctive gene pool shaped by natural selection.
Therefore, the present study aimed to detect and account for nonheritable sociocultural factors affecting reproductive success and to test the hypotheses that an elevated percent of oxygen saturation of hemoglobin or a low hemoglobin concentration can be associated with higher reproductive success (pregnancies, live births, or children surviving to 15 years of age) among highland Tibetan women. The results provided little support for the elevated-oxygen-saturation hypothesis and strong support for the unelevated-hemoglobin-concentration hypothesis. This study links a distinctive oxygen homeostasis phenotype, hemoglobin concentration, to reproductive success in a contemporary Tibetan sample living under the stress of high-altitude hypoxia.

Study populations
The study communities in Nepal lie on the southern aspects of the Tibetan Plateau. Although they are citizens of Nepal, local people self-identify as ethnically Tibetan: they speak Tibetan dialects, practice forms of religion and social organization akin to those across the Tibetan Plateau, and retain the characteristic agro-pastoral and trading mode of subsistence common among highland Tibetans [28]. Fieldwork took place during the summer of 2012 in (1) the Nubri and Tsum Valleys in Gorkha District and (2) the ethnic Tibetan areas of Upper Mustang and Baragaon in Mustang District (Fig. 1). All regions lie along the border with the Tibet Autonomous Region, China. Nubri and Tsum are contiguous valleys where villages range in altitude from 2090 to 3787 m (6900-12 500'). The inhabitants descend from people who migrated from various nearby areas, including the Tibetan Plateau, beginning in the 14th century or earlier. Nepal incorporated Nubri during the 1850s [29]. The Tibetan area of Upper Mustang lies along the international border while Baragaon is situated farther south.
Villages range in altitude from 2800 to 4200 m (9240-13 860'). Upper Mustang includes the Kingdom of Lo, home of a local hereditary leader whose lineage in the region dates to the 14th century. Upon its founding in the mid-18th century, the nation-state of Nepal incorporated the Kingdom of Lo [30,31].
The residents of these areas make a living with traditional agriculture (barley, wheat, buckwheat, potatoes, maize) and animal husbandry (yaks and yak-cow crossbreeds, sheep and goats, horses). They also engage in trans-Himalayan trade (timber, medicinal plants), seasonal migration for commodity trade in Indian and Nepali cities, government services, and tourism.

Study samples
Institutional review boards at Case Western Reserve University, Dartmouth College, the Nepal Health Research Council, Oxford Tropical Research Centre, and Washington University approved the study protocol. Participants provided informed consent.
The reproductive history sample included women who had completed or nearly completed childbearing because they were 40 years of age and older (by Tibetan reckoning, which corresponds to 39 in the western system). All were native to and born at or above 3000 m and had experienced marriage or pregnancy. About 1020 women provided interviews. The final reproductive history sample included 1006 women from 987 households.
The household survey sample included all households in the villages where reproductive history collection took place and provided information on education and wealth. About 1487 household surveys enumerated 8187 people in 63 villages.

Sample ascertainment
The household survey identified 332 women of appropriate age who were not included in the reproductive survey. 43 of those in Gorkha District and 57 women in Mustang did not meet study selection criteria. Some potentially eligible women were temporarily away from home. Supplementary Table S1 provides details on exclusions. The sample includes over 85% of age-appropriate women in these areas and, therefore, is unlikely to be biased.

Surveys
Research teams of six Nepali research assistants and two of the authors collected data in each study area. Authors GC and SC have been conducting long-term fieldwork in Gorkha and Mustang Districts, respectively. The research assistants were in their twenties, had a secondary education or more, were fluent in Nepali and the local Tibetan dialect and were born in a study village or nearby. The Gorkha District team included two men; the Mustang team included one.
Upon arriving in a village, the team explained to local officials and leaders the purpose of the study and described the data collection process. All parties readily endorsed the research project and provided a list of village households. Teams of two interviewers visited each household in Gorkha District. Similar teams visited households in some villages of Mustang. Other villages made data collection sites available at a temple, school, or community center. The authors rotated among the interview teams to maintain uniform data collection. A global positioning system measured longitude, latitude, and altitude upon arrival in a village (Garmin eTrex HC series, Garmin International Inc., Olathe, KS). Barometric pressure, temperature and relative humidity were recorded every morning between 6 and 7 am. The median altitude of residence was 3632 m. Supplementary Table S2 lists the residential altitudes for the women in the sample.
The authors trained the interviewers in the protection of human subjects and the collection of reproductive histories, household surveys, and genealogies. Two days of training before data collection included reviewing the informed consent and data collection documents. We discussed the principles of voluntary informed consent, the meaning, and justification of each reproductive history or household survey question, plausible answers and possible misunderstandings. Training also included mock interviews with one another and demonstration interviews for the whole team and with volunteers who were not part of the study. When teams adjusted interview protocols, they were encouraged to add explanatory notes to interview sheets. Training for collecting the biological measurements described below followed the same sequence.
The Tibetan calendar is in daily use in the study communities and enabled researchers, in conversation with study participants, to determine accurate ages and the timing of reproductive history events. This 12-year, repeating cycle of named animal years links inexactly to specific years in the western calendar. A child is considered to be one year old during the year of birth; he or she turns two at the next Tibetan New Year (February or March). For example, a woman with the animal year 'tiger' and born in the year 1950 was 63 years of age by Tibetan reckoning in the year 2012 and 62 years of age by western calculation. Interviewers carried a conversion table relating Tibetan ages to western calendar years and used it for age determination and dating events. If a woman reported a chronological age inconsistent with her animal year of birth, then we assumed the latter provided the correct age. In this article, we have converted Tibetan age into western age.
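The age conversion described above can be sketched in a few lines. This is a minimal illustration, assuming that Tibetan age equals the difference in calendar years plus one and that western age is taken as Tibetan age minus one, as in the worked example in the text. The function names are hypothetical, and the sketch ignores births that fall between 1 January and the Tibetan New Year.

```python
def tibetan_age(birth_year: int, current_year: int) -> int:
    """Age by Tibetan reckoning: one in the year of birth, plus one per later year."""
    if current_year < birth_year:
        raise ValueError("current_year precedes birth_year")
    return current_year - birth_year + 1


def western_age(tibetan: int) -> int:
    """Approximate western age as used in the text: Tibetan age minus one."""
    return tibetan - 1


# Worked example from the text: born in 1950, interviewed in 2012
age_t = tibetan_age(1950, 2012)
age_w = western_age(age_t)
print(age_t, age_w)  # 63 62
```

The same rule reproduces the sample inclusion threshold: a Tibetan age of 40 corresponds to a western age of 39.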
A pregnancy history comprised the core of the reproductive survey. It followed procedures that have been field tested in numerous settings [32]. Questions began with marital history details, then the first pregnancy, the animal year of birth, the outcome (live birth, stillbirth or miscarriage), sex, name, currently alive or dead, age at death and cause of death, if applicable. The reproductive history concluded with information about contraceptive use and, in Mustang District, menopausal status and age at menopause. Women openly discussed issues such as marriage and divorce, miscarriages, stillbirths, infant mortality, and contraception. Other women and sometimes husbands were present during interviews, occasionally clarifying responses or adding details.
Interviewers cross-checked responses for internal consistency and probed as needed for clarifying information. They read the pregnancy history back to the woman for confirmation and asked specifically about pregnancies before the first and after the last on the list to fill in possible omissions. A report of two children born in the same year, such as twins, prompted a follow-up. When a pregnancy history had a gap of three or more years between pregnancies, the interviewer probed for an explanation. About 236 women recalled another pregnancy when asked about specific gaps. The pregnancies added after probing amount to 6.1% of all pregnancies reported. There were no reports of induced or spontaneous abortions. Women probably reported recognized spontaneous abortions as miscarriages. Two women spoke to our research assistants 'off the record' about having had an induced abortion. They were uncomfortable reporting this for the record, so reported a miscarriage. The resulting misclassification of 2 out of 129 reported miscarriages may have introduced a small measurement error. The extent of induced abortion underreporting is unknown although it is likely to be very infrequent owing to the limited available health care. In our experience, local Tibetan doctors in the study areas did not make traditional Tibetan abortifacients [33], and women did not mention them. We did not ask about breastfeeding because it was the only option at the time these women had infants.
The household survey recorded the de jure population, that is, all individuals born into or married into each household and not officially separated. It included the name, animal year of birth, age, relationship to the household head, education and current whereabouts. The surveyors also asked about land and animal ownership, wage labor, remittances and other sources of household income. In Gorkha District, the main income source was the sale of Ophiocordyceps sinensis (a wild fungus in high demand for use in traditional Chinese medicine); in Mustang it was primarily remittances coming from family members living abroad, and from petty seasonal trade in urban Nepal and India. Household economic status was measured using a relative wealth approach. Knowledgeable insiders ranked all village households from one (wealthy) to five (poor) by household assets and other factors [34].
The interviewers returned the completed forms at the end of each day of data collection. The forms were checked for consistency of overlapping information and completeness and then photographed for backup.

Physiological measurements
Pulse, oxygen saturation of hemoglobin and hemoglobin concentration were measured noninvasively (Masimo Pronto-7®, Masimo Corporation, Irvine, CA). The device is accurate to ± 0.99 gm/dL compared to a laboratory reference device (http://masimo.com/pronto-7/index.htm accessed August 1, 2014). Women washed their hands and then sat still with one arm comfortably resting on a flat surface at about heart height, hand resting upon a reusable hand warmer to ensure adequate perfusion. A sensor placed on the forefinger obtained stable readings after a few seconds and then saved a single value for each trait. A reading was not obtained for 47 women, usually because enlarged or misshapen arthritic knuckles prevented the sensor from enclosing the finger. The women without physiological measurements had similar ages, ages at marriage, first and last pregnancy, numbers of pregnancies and live births as the women with those measurements. Therefore, we infer that there was no ascertainment bias in these measurements.
Re-measuring 51 women several months after the interview assessed the reliability and repeatability of the physiological measures. Supplementary Table S3 presents the average pulse, oxygen saturation, and hemoglobin concentration at the first and second measurements for 51 women. Intra-class correlation coefficients (ICC) evaluated the reliability of the repeated measurements [35]. The average difference in hemoglobin concentration evaluated the agreement between the measurements made at different times [36]. Based on the outcomes, the physiological measurements had good to excellent reliability and agreement [37].
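A repeatability check of this kind can be sketched as follows. The text does not state which form of the ICC was used, so this example assumes a one-way random-effects ICC for two measurements per woman, with the mean of the paired differences as a simple agreement summary. The function names and the readings are hypothetical and are not study data.

```python
from statistics import mean


def icc_oneway(pairs):
    """One-way random-effects ICC(1,1) for k = 2 measurements per subject."""
    n, k = len(pairs), 2
    grand = mean(v for pair in pairs for v in pair)
    subject_means = [mean(pair) for pair in pairs]
    # Between-subject and within-subject mean squares
    msb = k * sum((m - grand) ** 2 for m in subject_means) / (n - 1)
    msw = sum(
        (v - m) ** 2 for pair, m in zip(pairs, subject_means) for v in pair
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


def mean_difference(pairs):
    """Average of (second - first), a simple agreement summary."""
    return mean(second - first for first, second in pairs)


# Hypothetical hemoglobin readings in gm/dL: (first visit, second visit)
readings = [(13.2, 13.5), (14.8, 14.6), (12.9, 13.1), (15.4, 15.0), (13.9, 14.1)]
print(round(icc_oneway(readings), 3), round(mean_difference(readings), 3))
```

An ICC near 1 indicates that most of the variation lies between women rather than between repeated readings on the same woman.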

Sample characteristics
Three directly obtained dependent variables measure reproductive success and offspring survival: numbers of pregnancies, live births and children surviving to the age of 15 years. We chose 15 years because that has been used to indicate reproductive age [38]. Reproductive success varied widely at all residential altitudes (Fig. 2). An average of 5.6 pregnancies resulted in 5.4 live births and four children surviving to the age of 15 (Table 1). Sixty-five percent experienced the death of at least one child before 15 years of age. Numbers of pregnancies and live births correlated very highly (r = 0.97) and in turn correlated highly with the number of children surviving to 15 years of age (r = 0.72 and 0.74, respectively, all P < 0.1E-6). Right censoring occurred in the measure of offspring survival because some women had children born within the past 15 years; the study terminated before these children reached 15 years of age. Therefore, offspring survival analysis included only the 774 women who reported one or more live births born 15 years or more before the study.
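The handling of right censoring described above amounts to a simple eligibility rule, sketched here. The example assumes that eligibility is judged by calendar year of birth relative to the 2012 study year. The function names and the birth years are hypothetical.

```python
STUDY_YEAR = 2012       # year of data collection, from the text
SURVIVAL_AGE = 15       # age threshold used for offspring survival


def uncensored_births(birth_years):
    """Keep births early enough that the child could have reached 15 by the study."""
    cutoff = STUDY_YEAR - SURVIVAL_AGE
    return [y for y in birth_years if y <= cutoff]


def eligible_for_survival_analysis(birth_years):
    """A woman is eligible if at least one live birth is uncensored."""
    return len(uncensored_births(birth_years)) > 0


# Hypothetical live-birth years for three women
women = {"A": [1975, 1980, 1999], "B": [2001, 2005], "C": [1997]}
eligible = [name for name, years in women.items() if eligible_for_survival_analysis(years)]
print(eligible)  # ['A', 'C']
```

Woman B is dropped because none of her children could have reached 15 years of age by 2012, so her survival count would be censored.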
The women averaged 55 years of age (Table 1). Reproductive careers began and ended at average ages of 24 and 37 years with the first and last pregnancies/births. Two percent (n = 20) of ever-married women reported no pregnancies. Just under five percent had twin pregnancies, and more than 10% experienced a stillbirth or miscarriage (Supplementary Table S5).
The analysis considered 20 independent variables as possible contributors to reproductive success. These variables measured social, cultural, economic, and public health factors' influence on reproductive success independent of the physiological variables of interest. Supplementary Table S4 lists the variables, and Supplementary Table S5 provides descriptive statistics.
The independent variables comprise four groups: direct determinants of exposure to intercourse, direct determinants of susceptibility to conception and successful gestation, indirect determinants of fertility and physiological phenotypes. Direct determinants are variables 'that must always be operating at some level if reproduction is to occur at all.' [39 p. 68]. Indirect determinants influence fertility by operating through direct determinants [39][40][41].
On direct determinants of exposure to intercourse, 95% of the women had been married at least once (Table 1). 60% had a single marriage throughout the ages of 25-40 years (categorized as continuously married) and probably had the highest potential exposure to intercourse (Table 1 and Supplementary Table S5).
In regard to other indicators of direct determinants, most women had not used contraception (Table 1) [28]. Older women reported that contraception had not been available locally during their childbearing years. Contraception became available in Mustang District during the early to mid-1990s, with the extension of safe motherhood government programs. In the late 1990s, the establishment of several non-governmental organization (NGO)-funded clinics added sources of care. Modern contraception became locally available in Nubri and Tsum sub-districts of Gorkha District in 2009 with the founding of NGO-sponsored health posts. This history explains why 38% of Mustang women had used contraception as compared with only 16% of Gorkha District women.
As for indirect determinants of reproductive success, roughly 5% had never married, 7% had a cross-cousin marriage, and roughly 11% had a polyandrous marriage (Table 1).
Physiological phenotypes included pulse, the percent of oxygen saturation of hemoglobin, and hemoglobin concentration (Table 1). No hemoglobin concentration exceeded 19 gm/dL, the threshold for pathologically elevated hemoglobin concentration that is common in Andean high-altitude populations [42]. 11% had hemoglobin concentration below the 12.3 gm/dL used as a cut-off for anemia at low altitudes [43]. Neither hemoglobin concentration nor pulse correlated with altitude. Percent of oxygen saturation of hemoglobin correlated negatively with altitude. The range of variation at each altitude widened to include women with lower saturation at the higher altitudes (Fig. 3). Higher oxygen saturation, a sign of less physiological stress, correlated with lower hemoglobin concentration and lower pulse. These correlations led us to include pulse in the independent variable lists although we had no a priori hypotheses associating pulse with reproductive success.

Statistical analyses
The analyses excluded the 20 women who had been married yet had never been pregnant; we assumed that 2% of the sample had primary infertility for reasons unrelated to altitude adaptation because the percent is similar to that reported for other samples. For each count outcome (the numbers of pregnancies, live births, and children surviving to 15 years of age), Poisson regression considered the candidate factors listed in Equation (1) below and their possible two-way interactions. To overcome over-parameterization, we included only interactions for which the individual variable itself, $x_i$, was selected by an initial stepwise selection of 20 variables. These candidate influential factors were connected as a linear predictor inside the parametric model fitting:
$$a \cdot x' + b \cdot (\text{two-way interaction terms}) \qquad (1)$$
where $a \cdot x' = a_1 x_1 + \dots + a_{20} x_{20}$; the $a_i$ were estimated and could differ in analyses with different dependent variables; any $a_i$ not selected in the initial model selection was set to zero; $b = (a_{21}, \dots)$ were the coefficients of interaction terms, also estimated from the data; and the candidate factors were $x$ = (age, age at first pregnancy and birth, age at last pregnancy and birth, altitude of residence, household relative wealth rank, hemoglobin concentration, percent of oxygen saturation of hemoglobin, pulse, living in natal village, continuously married throughout the ages of 25-40 years, schooling of woman and her husband, household of residence after marriage, marital status, miscarriages, stillbirths, district of residence, sub-district of residence, twin pregnancies, type of marriage, contraceptive use).
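The structure of the linear predictor, with main effects plus two-way interactions restricted to the variables retained by the initial selection, can be illustrated with a short sketch. It is not the fitting procedure used in the study. The variable names, coefficient values, intercept and log link are all assumptions made for illustration.

```python
import math
from itertools import combinations


def linear_predictor(x, main_coefs, interaction_coefs):
    """Sum of main effects plus two-way interactions among selected variables.

    x: dict of variable name -> value
    main_coefs: dict of name -> coefficient; unselected variables are absent (zero)
    interaction_coefs: dict of (name_i, name_j) -> coefficient, names in sorted order
    """
    eta = sum(coef * x[name] for name, coef in main_coefs.items())
    # Interactions are considered only between variables that were selected
    for pair in combinations(sorted(main_coefs), 2):
        eta += interaction_coefs.get(pair, 0.0) * x[pair[0]] * x[pair[1]]
    return eta


def poisson_mean(intercept, eta):
    """Expected count under a log link (the usual Poisson link; an assumption here)."""
    return math.exp(intercept + eta)


# Hypothetical values, not estimates from the study
x = {"age_first_pregnancy": 24, "wealth_rank": 3}
main = {"age_first_pregnancy": -0.03, "wealth_rank": -0.08}
inter = {("age_first_pregnancy", "wealth_rank"): 0.001}
eta = linear_predictor(x, main, inter)
print(round(eta, 3), round(poisson_mean(2.6, eta), 2))
```

Leaving an unselected variable out of `main_coefs` is equivalent to setting its coefficient to zero, and it also removes every interaction involving that variable.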
To cross validate findings from our parametric model fitting by a Poisson regression analysis, we also conducted a non-parametric procedure using the Classification and Regression Tree (CART). Comparing the cross validated deviances of 'prune' and 'shrink' trees determined the number of nodes and selected the best CART model. The significant covariates chosen by the best Poisson models were consistent with those found by the corresponding tree models, which confirms the choice of the final Poisson model. The results of the Poisson analysis are presented and examined in detail below while the results of the CART analysis appear in Supplementary Online Figures S1-S3.
The Poisson analyses detected very large effects of variables such as marital status and the relative wealth rank of a woman's household. For example, women from the wealthiest households averaged 6.5 pregnancies as compared with 4.4 for women from the poorest households. However, neither hemoglobin concentration nor percent of oxygen saturation correlated with ages at first or last pregnancy; mean values did not vary across marital status, marriage type or relative wealth rank categories (data not shown).
Women with more or fewer pregnancies had more or fewer opportunities for successful or unsuccessful outcomes regardless of physiological characteristics. Therefore, we addressed the influence of the physiological variables from two additional perspectives by choosing a measure of reproductive success independent of the numbers. We calculated the proportions of successful pregnancies (the number of live births divided by the number of pregnancies) and surviving children (number of children surviving to 15 divided by the number of live births). Binomial regression used both the numerators and denominators as outcomes; that model accommodated all the women in the sample. Also, a linear model in logit (log-odds) scale for the proportions themselves accommodated the subgroups of women who had some pregnancy loss or loss of children. The linear predictors described above in Equation (1) were considered in both the binomial regression and the regression in logit scale to find the important factors contributing to the outcomes in proportions.
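The two proportions and their logit transform can be computed as in the sketch below. The function names and the example counts are hypothetical and are not study data.

```python
import math


def proportion(successes, trials):
    """Proportion of successful outcomes, e.g. live births / pregnancies."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    return successes / trials


def logit(p):
    """Log-odds of p; defined only for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("logit is undefined at 0 and 1")
    return math.log(p / (1.0 - p))


# Hypothetical woman: 6 pregnancies, 5 live births, 4 children surviving to 15
p_birth = proportion(5, 6)
p_survival = proportion(4, 5)
print(round(logit(p_birth), 3), round(logit(p_survival), 3))  # 1.609 1.386
```

A proportion of exactly 1 (no loss at all) has no finite logit, which may be one reason the logit-scale model covered only the women with some loss, whereas the binomial model, which works on the counts, covered the whole sample.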
The model fit of the Poisson regression was examined using residual deviance plots and the Quasi R² and adjusted Quasi R². These are analogous to the R² or adjusted R² typically provided with a linear regression fit. However, the Quasi R² values do not indicate the proportion of variability in the outcome variable explained by the independent variables (Supplemental Table 6). The fit of the binomial regression and the linear regression on the logit scale was examined using a likelihood ratio test (LRT), and the p-value of the test is provided to show the goodness-of-fit.
Descriptive tables and text report means and standard deviations or percentages. We adopted a critical value of P < 0.05 to indicate statistical significance and P < 0.01 to indicate strongly significant.

RESULTS
At the time of data collection in 2012, the women who comprise our sample reported 5667 pregnancies starting in 1943 and 3986 children who had survived to 15 years of age or older. We consider the influence of social, economic and public health factors as well as physiological characteristics to account for the variation in reproductive success (Tables 2-8).

Indirect influences on fertility and offspring survival
Indirect influences included relative wealth rank and residence district or sub-district. Relative wealth ranks 1 (highest) to 5 (lowest) showed a gradient of effect sizes from the most to the fewest pregnancies and live births (Tables 2, 3 and 8). Residence in Mustang District was associated with fewer live births and pregnancies. Women in the Baragaon sub-district of Mustang had more children surviving to age 15 (Table 6), likely reflecting the longer history of nearby medical care, including childhood vaccinations.

Direct influences on fertility and offspring survival
Exposure to the risk of intercourse
We used marital status and marital type to assess the risk of intercourse. Never-married women had the lowest reproductive success: an average of 1.9 ± 1.43 (n = 55) pregnancies compared with 6.0 ± 2.75 (n = 646) among those married at the time of the survey. However, never-married women had the same probability of a successful pregnancy outcome and child survival to 15 years of age as those of other marital statuses. Cross-cousin marriage (to the son of mother's brother or father's sister) to one husband associated with a lower probability that a pregnancy became a live birth (Table 5). The effect was smaller among women who were continuously married to cross-cousins.

Risk of conception
Contraceptive use did not associate with pregnancies or live births. It related to more children surviving to the age of 15 and with a higher probability that a live birth survived to 15 (Tables 6 and 7). 87% of the contraceptive users wanted to stop pregnancy, explaining that they had enough children (many also mentioned the cost of raising children and concerns for their own health) while only 7% reported using contraception to space births.

Successful gestation
Twin pregnancies, miscarriages, and stillbirths related to more pregnancies. The lack of association with live births or surviving children suggests that the additional pregnancies simply replaced the pre-natal losses (Tables 2, 3 and 6). Shorter intervals between pregnancies partly accounted for the direct relationship. If a first and second pregnancy both ended in live births, the modal interval between pregnancies/births was two years. However, when a first pregnancy ended in a miscarriage or stillbirth, the modal interval between the first and second pregnancy was one year. The intervals between births 2 and 3 or 3 and 4 showed the same acceleration of pregnancies following a prenatal loss.

Life history
Age at first and last pregnancy both related strongly to reproductive success. Older age associated with a lower probability that a live birth survived to 15 years of age (Table 7). A later first birth also conferred a strong reproductive disadvantage measured as a lower probability that a pregnancy progressed to a live birth, fewer pregnancies, live births, and surviving children (Tables 2-6).
On the other hand, a later last birth conferred a reproductive advantage measured as more live births and more children surviving to adulthood (Tables 3 and 6). Higher rates at which a pregnancy became a live birth or a live birth survived to 15 years partly explained this advantage (Tables 5 and 8). Later last births also partly accounted for the association of miscarriages and stillbirths with more pregnancies. Among those reporting zero, one or two miscarriages, last pregnancies occurred at 37 ± 6, 38 ± 5.9 and 40 ± 5.2 years of age, respectively (one-way ANOVA, F (2, 977), P = 0.001). The combination of shorter time between pregnancies and later end of reproduction resulted in more pregnancies overall.

Physiological phenotypes
We hypothesized that higher percent of oxygen saturation of hemoglobin and lower hemoglobin concentration associated with reproductive success. Percent of oxygen saturation did not associate with these outcome measures. Higher hemoglobin concentration associated strongly with a lower probability a pregnancy progressed to a live birth (Table 4).
Taking a different analytical approach by comparing women from the extremes of reproductive failure and success provides additional support for the regression analysis finding that hemoglobin concentration influenced reproductive success. Six women reported that none of their pregnancies became live births, 208 reported pregnancy loss and 720 reported that all their pregnancies became live births. Percent of oxygen saturation and pulse did not vary among those three groups. However, hemoglobin concentration was lower with greater reproductive success: from an average of 14.7 ± 1.5, to 14.2 ± 1.3, and to 13.7 ± 1.5 gm/dl among women with no, intermediate and complete success in converting pregnancies to live births (one-way ANOVA F(2, 860) = 6.153, P < 0.01).
Continuing with the approach of comparing extremes, 26 women reported that none of their live births survived to 15 years, 539 reported some loss and 342 reported that all their live births survived to 15 years. Pulse did not differ among the three groups. However, the average percent of oxygen saturation was 1.7% higher in the group with complete child survival, and the average hemoglobin concentration was 0.9 gm/dl lower as compared to the group reporting no offspring survival. The average oxygen saturation increased with better reproductive success: from 84.4 ± 5.9, to 87.2 ± 4.6, and to 87.9 ± 4.3% (one-way ANOVA F(2, 860) = 7.228, P < 0.001). The average hemoglobin concentration decreased with reproductive success: from 14.7 ± 1.6, to 14.0 ± 1.4, and to 13.6 ± 1.6 gm/dl (one-way ANOVA F(2, 860) = 6.153, P < 0.01). The 26 women with no surviving children were another tiny fraction of the sample, yet their characteristics contribute to understanding those among whom selection may be strongest.
Importantly, the two sub-groups with the most successful reproductive outcomes identify the hemoglobin concentration associated with most favorable reproductive outcomes. It was 13.6-13.7 gm/dl, slightly lower than the 13.8 ± 1.5 gm/dl sample average.

CONCLUSIONS AND IMPLICATIONS
Women with average hemoglobin concentration had a higher probability that pregnancies progressed to live births compared with those having elevated hemoglobin concentration. Furthermore, women whose pregnancies all ended in live births or whose children all survived had hemoglobin concentrations near the sample mean. The consistent pattern supports the hypothesis that natural selection is acting against elevated hemoglobin concentration, or something closely correlated with it, among high-altitude resident Tibetan women.
As for the other physiological measurements, higher pulse related only marginally to a higher probability of child survival. Women with live births but no children surviving to 15 years of age had significantly lower oxygen saturation than others. That result was broadly consistent with a different study of the reproductive success of younger (20-59 years of age) Tibetan women at 3800-4200 m, which found that women with inferred genotypes for lower oxygen saturation had fewer living children [4]. The two studies differed in design, age range, and analysis, yet both found that higher oxygen saturation was associated with higher offspring survival. That agreement bolsters the present study's weak support for the hypothesis that natural selection favors higher oxygen saturation.
Indirect and direct factors influenced reproductive success. Importantly, physiological phenotypes did not vary with indirect or direct factors such as relative wealth or marital status. Based on the estimates of effect sizes provided by the regression equations, the largest reproductive disadvantages accrued to women who never married or had a relatively late first birth. Based on  [39,41] and should be a crucial methodological consideration when designing studies of natural selection in humans.
Limitations of this study include relying on retrospective data collected at one interview and a lack of vital records or other documentation of demographic events. Recall of events from decades earlier may have been incomplete. However, women 70 years of age and older reported as many events as the total sample. Those 104 women reported an average of 5.8 pregnancies and 3.7 children surviving to reproductive age; 8% recalled miscarriages, and 13% reported stillbirths. Those values are essentially the same as those in the total sample (Table 1, Supplementary Online Table S5). Any under-reporting appears to be unrelated to age. Another indicator of thorough recall and reporting is rates of stillbirths and miscarriages. The rate of reported stillbirths (27.4/1000 pregnancies) in this study is very similar to the rate of 28.9/1000 in a multidisciplinary, 4-year prospective study in seven low-resource countries [44]. The sex ratio at birth was 1.06 (#males/#females), indicating no female shortage. Thus, we conclude that the women accurately recalled their reproductive histories.
A methodological limitation arises from the technical specifications of the Masimo Pronto-7® used to measure hemoglobin concentration noninvasively. The device measures to within ± 0.99 gm/dl of the true value provided by a reference machine. The range around the true value introduces measurement noise. On average, there is little difference between values obtained using this noninvasive device and a reference laboratory machine [45]. We chose to accept this limitation to remove any barriers to participation possibly posed by blood draws that would have provided samples for more accurate results.
The extent to which physiological phenotypes collected postreproductively reflect those during the reproductive period is unknown. A small subset of 27 women from Baragaon participated in a 1981 study by one of the authors [46,47]. Hemoglobin or pulse measured in 1981 did not correlate with that in 2012, perhaps due to different equipment; maybe these phenotypes are not canalized throughout adulthood, are affected by undetected disease, or perhaps they correlate with another trait under selection. Also, the extent of the phenotypic variation attributable to genetic differences is not known for this sample.
This cross-sectional study cannot directly evaluate whether the association of unelevated hemoglobin concentration with better reproductive success reflects a benefit, as hypothesized, or a cost in the form of maternal depletion owing to multiple pregnancies [48,49]. Two percent of the 49 women with 10 or more pregnancies were anemic as compared with 6% of the 68 women with a single pregnancy [50]. That is the opposite of expectation if maternal depletion accounted for the association detected here.
As for strengths, three authors with extensive experience conducting reproductive history surveys in Tibetan societies designed the survey and its administration to collect the best quality data in these circumstances [32]. Interviewers were fluent in the local dialects of Tibetan, they used the Tibetan calendar system, and they followed up on potential omissions or errors. Many study participants were already familiar and comfortable with members of the interviewing team. Other strengths of the study include the large sample size, the very high ascertainment of eligible women, the wide age range, multiple data collection sites, and the variety of indirect and direct determinants of fertility considered. The sample of 1006 was large enough to detect a range of effect sizes and included a wide scope of realistic life circumstances. The two districts and sub-districts differed in the timing of reproductive history events, income sources, and public health infrastructure. The nearly complete ascertainment avoided sample bias, for example toward more or less fertile women. Overall, the sample selection reduced the chance that a random unknown confounder biased the results. Another strength was quantification of the exposure to the environmental agent of natural selection (high-altitude hypoxia) by including altitude of birth and of current residence as independent variables. Published census data along with a previous analysis of fertility in Gorkha District led to the expectation of very low fertility after the age of 40. Our findings agreed. Maternal age was 40 years or older for just 381 births (7%) of the 5,378 live births reported altogether. Only nine women ranging from 39 to 51 years of age (about 0.9% of the sample) were pregnant at the time of the interview. Thus, the reported reproductive histories measure lifetime reproductive success for nearly all the sample.
These findings identified a public health concern arising from the cultural practice of cross-cousin marriage (7.4% of the women), which lowered the probability that a pregnancy progressed to a successful live birth. The cost was highest among short cross-cousin marriages, intermediate among stable cross-cousin marriages, and did not occur among polyandrous cross-cousin marriages (n = 15, all their pregnancies became live births).
These results suggest that natural selection in Tibetan populations resulted in dampening the sustained elevation of hemoglobin concentration characteristic of the acclimatization response of visitors and of Andean highlanders. The outcome was unelevated or little-elevated hemoglobin levels compared with healthy low-altitude samples [50]. The phenomenon of selection favoring phenotypes similar to non-stressed populations is known as cryptic adaptive evolution [51].
A prenatal cost of elevated hemoglobin concentration has been reported previously at high and low altitudes. Elevated hemoglobin concentration during gestation relates to poorer pregnancy outcomes [52][53][54][55][56][57]. For instance, the absence of hemodilution during pregnancy may double the risk of stillbirths [58].
In summary, this study of ethnically Tibetan women residing at 3000-4100 m in Nepal who have completed reproduction took into account sociocultural and public health influences on fertility to detect physiological associations. The results provide little support for the hypothesis that relatively high oxygen saturation is an adaptive phenotype. The results strongly support the hypothesis that unelevated hemoglobin concentration (within the normal sea level range), or a closely correlated trait, is an adaptive phenotype in the Tibetan high-altitude population.

SIGNIFICANCE
A crucial test of the hypothesis of natural selection involves linking the range of variation of phenotypes to fertility and offspring mortality patterns. To date, linking genotypes, physiological phenotypes, and reproduction is uncommon [59][60][61]. To establish such a link, this study took into account nonheritable social, cultural, and public health covariates whose effects may fluctuate from positive to negative during a woman's lifetime or between generations. The study added to the classic measures of reproductive success, which are counts of reproductive events. Reproductive success outcomes measured as probabilities and rates incorporated both sets of variables for the entire sample. Although women had sociocultural characteristics associated with fewer pregnancies, the physiological phenotypes of pulse, oxygen saturation, and hemoglobin concentration were distributed throughout the sample regardless of the social, cultural, and public health covariates.
An important review on measuring natural selection in contemporary populations noted that, 'The major unresolved issues are how to deal with cultural evolution and gene-culture evolution.' [62 p. 618]. Our study illustrates that one effective way to deal with 'culture' is to measure it systematically. The classic demographic model of indirect and direct determinants of fertility provides a structured approach to such measurements that can be tailored appropriately using ethnographic understanding of the sample. The institution of Tibetan fraternal polyandry is relevant for our sample, while education is not, for instance. Existing studies of natural selection in contemporary populations take into account a few 'confounders' such as education, date and place of birth, or religion [e.g. 60,63-67]. Two considered socioeconomic status [68,69]. Interestingly, status as 'ever married' or 'currently in a heterosexual relationship' was considered in two studies [69,70], while none of these mentioned accounting for marital status.
Study samples will inevitably have variation in indirect and direct determinants of fertility. Measuring and accounting for them can change estimates of the relative fitness of phenotypes. A comparison of women in our sample with average hemoglobin concentration and those with hemoglobin one standard deviation above the mean illustrates this point. Consider two women of average age at first pregnancy, continuously married to one man who was not a cross cousin, and who never used contraception. For the hypothetical woman with average hemoglobin levels, 95.1% of the pregnancies became live births, as compared with 93.2% for the woman with elevated hemoglobin (a difference of 1.9 percentage points). If we allow cousin marriage for two women with the same set of characteristics, then 76.2% of the pregnancies became live births for the woman with average hemoglobin level as compared with 69.3% for the woman with elevated hemoglobin (a difference of 6.9 percentage points). That is, the cost of elevated hemoglobin varied according to that indirect determinant, implying that the strength of natural selection can vary depending on variation in such non-heritable traits. Natural selection 'in the wild' is generally weak, as discovered by a meta-analysis of published studies of non-humans that found that most estimates of directional selection gradients were between −1 and 1 [68][69][70]. Our estimate for hemoglobin concentration was consistent with a value of −0.23 (Table 4). Our study made the assumption that natural selection is acting on hemoglobin concentration. While hemoglobin concentration was chosen based on the scientific understanding described in the introduction, we cannot exclude the possibility that an unmeasured trait correlated with hemoglobin concentration is the target of selection.
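The arithmetic behind the comparison of hypothetical women can be sketched as follows. This is only an illustration, not part of the study's analysis: the four live-birth percentages are the ones quoted in the text, while the function names, the dictionary layout, and the extra ratio of live-birth probabilities are assumptions introduced here for clarity.

```python
# Live-birth percentages quoted in the text for hypothetical women.
# Keys and structure are illustrative assumptions, not the authors' data format.
LIVE_BIRTH_PCT = {
    "no cousin marriage": {"average_hb": 95.1, "elevated_hb": 93.2},
    "cousin marriage": {"average_hb": 76.2, "elevated_hb": 69.3},
}


def hemoglobin_cost(average_hb, elevated_hb):
    """Return (gap in percentage points, elevated/average live-birth ratio)."""
    gap_points = round(average_hb - elevated_hb, 1)
    ratio = round(elevated_hb / average_hb, 3)
    return gap_points, ratio


def summarize(table=LIVE_BIRTH_PCT):
    """Apply hemoglobin_cost to each marriage context in the table."""
    return {context: hemoglobin_cost(**pcts) for context, pcts in table.items()}


if __name__ == "__main__":
    for context, (gap, ratio) in summarize().items():
        print(f"{context}: gap = {gap} points, elevated/average ratio = {ratio}")
```

Under these assumptions, the gap between the two hypothetical women is larger in the cousin-marriage context on both the absolute and the ratio scale, which is the sense in which the cost of elevated hemoglobin depends on the indirect determinant.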
In summary, these results add to the body of evidence that phenotypic selection acting over one or two generations is weak, yet detectable. This study advances our understanding of natural selection among high-altitude Tibetans by linking the heritable phenotype of unelevated hemoglobin concentration, or a correlated trait, to greater reproductive success, consistent with the hypothesis of an adaptation shaped by natural selection.

SUPPLEMENTARY DATA
Supplementary data is available at EMPH online.