Abstract

BACKGROUND

Knowing predictors of pregnancy in IVF is helpful for clinicians to individualize the treatment plans and improve patient counseling and for patients to decide whether to undergo infertility treatment. The aim of the study was to identify independent predictors of the chance of clinical pregnancy after a completed IVF/ICSI cycle (fresh plus cryopreserved embryos transferred from one stimulated cycle) and to compare the predictive value of important predictors identified.

METHODS

This was a single center, retrospective study of 2450 infertile women undergoing their first IVF treatment between 2002 and 2007. A bootstrapping stepwise variable selection algorithm was performed to identify independent predictors of clinical pregnancy chance from a list of 27 candidate variables. Multivariable logistic regression was used for assessing the effects of predictors. Proportion of explained variation analysis and concordance index were adopted to compare the predictive value of factors.

RESULTS

The following nine independent predictors were included in the final multivariable model: total number of good-quality embryos, total number of embryos, age, antral follicle count, fertilization rate, duration of infertility, endometrial thickness, number of 10–14-mm follicles and progesterone level on the day of hCG injection. The model was cross-validated internally in the training data and validated externally in an independent data set, with robust performance in both. The stratified analysis demonstrated that the total number of good-quality embryos was a better predictor of clinical pregnancy chance after a completed IVF/ICSI cycle than age for women <40 years, whereas age was a better predictor for women ≥40 years. The restricted cubic spline analysis revealed that the relationship between the total number of good-quality embryos and the log-odds of achieving a clinical pregnancy was nonlinear.

CONCLUSIONS

The quality and quantity of all embryos are the two most important predictors of the cumulative outcome in IVF/ICSI among the independent predictors identified. The importance of embryo quality on cumulative outcome in IVF/ICSI increases with increasing age.

Introduction

The World Health Organization estimates that infertility and sterility will be the third most serious disease worldwide in this century, after cancer and cardiovascular diseases. In China alone, more than 10 million couples need assisted reproductive technology (ART) and the infertility rate is still ‘on the rise’ (China Daily, 2010). Although in vitro fertilization (IVF) has become a commonly used ART since the first birth after IVF-embryo transfer on 25 July 1978, the IVF process remains difficult for infertile couples, with low success rates on the first attempt. Furthermore, the high twin pregnancy rate arising from IVF has been recognized as a significant public health issue leading, in many countries, to policies encouraging or mandating increased use of single-embryo transfer (SET) (Hunault et al., 2002; Ledger et al., 2006; Ottosen et al., 2007; Roberts et al., 2009). These factors have led to a need to study further predictors and to evaluate the probability of a patient achieving a pregnancy as accurately as possible before and during the course of her IVF treatment. Although many studies have used multivariable models to identify predictors of IVF outcome (Elizur et al., 2005; Rhodes et al., 2005; Lintsen et al., 2007; Roberts et al., 2010), there is still no real consensus.

Most previous studies have analyzed the IVF outcome of either fresh or frozen–thawed cycles. However, what really matters to the infertile couples under IVF treatment is whether there will be a success or not after a completed treatment cycle which includes fresh plus cryopreserved embryos transferred from one stimulated cycle. Few studies (Roberts et al., 2011) have been done to address this question. Therefore, our study concentrated on the chance of achieving a pregnancy after the completed treatment cycle per couple treated and only considered the candidate variables collected before embryo transfer of the fresh cycle.

The objectives of our present study were first to build a parsimonious multivariable model and identify independent predictors of clinical pregnancy chance after a completed IVF/ICSI cycle and then to compare the predictive value of variables retained in the final model statistically.

Materials and Methods

Institutional approval

Institutional review board approval was not required because our ART unit is licensed and regulated by the Ministry of Public Health of P.R. China, and there were no interventions other than those for standard IVF treatment.

Study subjects

Between January 2002 and December 2007, 3033 patients attended the IVF unit at the Women and Children's Hospital of Guangdong Province, P.R. China, and underwent 3400 ovarian stimulation cycles. To collect the cumulative outcome of a completed IVF/ICSI cycle (fresh plus cryopreserved embryos transferred from one stimulated cycle), the patients were followed up until June 2009.

To meet the independence assumption for a logistic regression model, only the first available completed treatment cycle of each patient was analyzed. We also excluded: (i) cycles that involved oocyte or sperm donation or in vitro maturation; (ii) unstimulated cycles; (iii) cycles that result in neither fresh nor frozen–thawed embryo transfer in a completed treatment cycle; (iv) patients who had not become pregnant but still had frozen embryos left; and (v) cycles that involved blastocyst transfer, to keep embryo quality and quantity variables consistent and comparable for all patients. In total, 583 patients with 950 cycles were excluded. Thus, 2450 patients with one completed treatment cycle each were included in this study as the training data.

A total of 256 patients who attended the reproductive medicine center at Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, P.R. China, between January and May 2010 were collected under the same exclusion criteria. These patients were used as the independent validation data to evaluate the predictability of the final predictive model.

In vitro fertilization

Patients were started on one of three standard protocols: (i) long gonadotrophin-releasing hormone (GnRH)-agonist protocol; (ii) short GnRH-agonist protocol; or (iii) GnRH-antagonist protocol. Controlled ovarian hyperstimulation with gonadotrophins [recombinant follicle-stimulating hormone (rFSH) with or without human menopausal gonadotrophin] was then initiated and dictated according to the patients' response to stimulation which was evaluated by estradiol (E2) concentration and serial vaginal ultrasound examinations. When at least three follicles had developed with a diameter of ≥18 mm each, 10 000 units of human chorionic gonadotrophin (hCG) was administered. Oocyte retrieval was then performed 34–36 h after hCG administration, and fertilization was achieved with IVF, ICSI or 50% IVF + 50% ICSI in rare cases. Embryo quality was assessed by considering number of blastomeres, day of transfer and degree of fragmentation. Embryos with a normal cleavage rate (four cells on Day 2 and six to eight cells on Day 3) and <20% fragmentation were defined as good-quality embryos (Strandell et al., 2000; Lee et al., 2006). Embryos were transferred on either Day 2 or 3 after oocyte retrieval.

In this study, the outcome of interest was clinical pregnancy ever in a completed treatment cycle, defined as transvaginal ultrasound observation of intrauterine gestation sac, fetal pole and cardiac activity at 6–7 weeks of gestation. All other cycle outcomes were classified as not pregnant.

Statistical analysis

Based on a literature review and clinical experience, 27 candidate variables were identified. Continuous variables were categorized using generalized additive models (Hastie and Tibshirani, 1987) to enhance their applicability in clinical practice. Multicollinearity among the variables was assessed using correlation coefficients and the variance inflation factor. Among collinear variables, the one with greater clinical importance and larger variance was retained. The resulting clinically relevant variables were entered into a bootstrap variable selection process (Austin and Tu, 2004). First, 5000 bootstrap samples were generated, and stepwise logistic regression was then performed in each bootstrap sample with thresholds of P = 0.1 for both variable entry and variable elimination. Variables present in at least 3250 runs (65%) were chosen to construct the multivariable logistic regression model. This method determines the empirical distribution of a variable's likelihood of being included in the model, thereby quantifying the strength of evidence that a given variable is indeed a true independent predictor of cumulative outcome of IVF/ICSI. Forward and backward selection algorithms were also applied to cross-examine the stability of the final list of retained predictors and to avoid the potential bias from using one specific algorithm (Steyerberg et al., 2001; Beyene et al., 2009).
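The bootstrap selection loop described above can be sketched as follows. This is a minimal illustration, not the authors' SAS or R code: the function `bootstrap_inclusion`, the stand-in selector `toy_select` and the toy data are all hypothetical, and the selector is a simple mean-difference rule used in place of a real stepwise logistic regression.

```python
import random
from collections import Counter


def bootstrap_inclusion(rows, select, n_boot=5000, threshold=0.65, seed=0):
    """Count how often each variable is selected across bootstrap samples.

    rows      : list of observations in whatever form `select` understands
    select    : function(sample) -> iterable of selected variable names;
                a stand-in for stepwise logistic regression (assumption)
    threshold : minimum fraction of runs needed to retain a variable
    """
    rng = random.Random(seed)
    counts = Counter()
    n = len(rows)
    for _ in range(n_boot):
        # Draw n rows with replacement to form one bootstrap sample.
        sample = [rows[rng.randrange(n)] for _ in range(n)]
        counts.update(set(select(sample)))
    freq = {name: c / n_boot for name, c in counts.items()}
    retained = sorted(name for name, f in freq.items() if f >= threshold)
    return freq, retained


def toy_select(sample, cutoff=0.5):
    """Toy selector (NOT a stepwise fit): keep a variable when its mean
    differs between outcome groups by more than `cutoff`."""
    pos = [r for r in sample if r["y"] == 1]
    neg = [r for r in sample if r["y"] == 0]
    if not pos or not neg:
        return []
    chosen = []
    for name in ("strong", "noise"):
        mean_pos = sum(r[name] for r in pos) / len(pos)
        mean_neg = sum(r[name] for r in neg) / len(neg)
        if abs(mean_pos - mean_neg) > cutoff:
            chosen.append(name)
    return chosen


# Made-up data: "strong" tracks the outcome, "noise" does not.
rows = [{"y": i % 2, "strong": 2 * (i % 2) + 0.1 * (i % 5), "noise": 0.1 * (i % 7)}
        for i in range(100)]
freq, retained = bootstrap_inclusion(rows, toy_select, n_boot=200)
```

In this toy run only `strong` clears the 65% threshold. Replacing `toy_select` with a real stepwise procedure would give inclusion frequencies analogous to the percentages reported in the Results.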

A receiver operating characteristic (ROC) curve along with the area under the ROC curve (AUC) or concordance index (c-index) was constructed to evaluate the performance of the multivariable model. To avoid over-fitting, a 10-fold cross-validation was used to cross-examine the model and compute the averaged c-index (Harrell, 2001). The original data were first randomly partitioned into 10 subsamples; 9 subsamples were used as training data and the remaining subsample was used as validation data to compute the ROC curve and c-index. This process was repeated 10 times, with each subsample used as validation data exactly once. The proportion of explained variation (PEV) analysis (Schemper, 1993; Mittlbock and Schemper, 1996) was used to compare the predictive value of the total number of good-quality embryos with other predictors using the bootstrap technique (n = 5000). PEV, also known as R², is a widely used measure that is easy to interpret because it represents the amount of variation of the outcome variable that can be explained by one or more predictors. Marginal PEV measures the contribution of a given factor to the model univariately, and partial PEV measures the decline in PEV after removing a factor from the final model. Furthermore, c-indexes were compared using χ2 tests to specifically compare the predictive capability of total number of good-quality embryos, total number of embryos and age in three age groups (<35, 35–40 and ≥40 years).
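As a small illustration of the two building blocks in this paragraph, the sketch below computes a c-index by pairwise comparison and produces a random 10-fold partition of row indices. The function names and the toy scores are assumptions made for the example; they are not taken from the study, and fitting the model within each fold is left out.

```python
import random


def c_index(scores, outcomes):
    """Concordance index: share of (pregnant, not pregnant) pairs in which
    the pregnant case has the higher predicted score; ties count as 0.5."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case in each outcome group")
    concordant = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                concordant += 1.0
            elif p == q:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


def k_fold_indices(n, k=10, seed=0):
    """Randomly partition indices 0..n-1 into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


# Toy example with made-up predicted probabilities (not study data).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
outcomes = [1, 1, 0, 1, 0, 0]
auc = c_index(scores, outcomes)  # 8 of 9 pairs are concordant

# Each fold serves once as validation data; the other nine form the training data.
folds = k_fold_indices(25, k=10)
```

For a binary outcome this pairwise c-index equals the area under the ROC curve, which is why the text uses the two terms interchangeably.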

A second multivariable logistic regression with five-knot restricted cubic spline, with knot locations at five percentiles (5, 25, 50, 75 and 95%), was run to model the association between the total number of good-quality embryos as continuous variable and log-odds of achieving a clinical pregnancy while controlling eight other predictors. The linearity assumption was examined with the likelihood ratio χ2 test (Harrell, 2001). Afterward, a segmented multivariable logistic regression model adjusting eight other predictors, known as a ‘change point’ model (Ulm, 1991), was used to estimate the break point where the increasing rate in log-odds of achieving a clinical pregnancy per unit increase in the total number of good-quality embryos before this point differed from the rate after this point. The change point model is essentially a piecewise regression model but treats the trend changing points as unknown parameters.
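To make the spline term concrete, the following sketch builds a restricted cubic spline basis in the usual truncated power form. It is an illustration under stated assumptions rather than the authors' implementation: the function name `rcs_basis`, the scaling by the squared knot range and the knot values are all chosen for the example (the study placed knots at the 5th, 25th, 50th, 75th and 95th percentiles of its own data).

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis for a single value x.

    Returns [x, s_1, ..., s_(k-2)] for k knots, so five knots give one
    linear and three nonlinear terms. Each nonlinear term is divided by
    (t_k - t_1)^2, a common normalization and an assumption here.
    """
    t = sorted(knots)
    k = len(t)
    if k < 3:
        raise ValueError("at least three knots are required")

    def cube_plus(u):
        # Truncated cubic: u^3 for positive u, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    last_gap = t[-1] - t[-2]
    scale = (t[-1] - t[0]) ** 2
    terms = [float(x)]
    for j in range(k - 2):
        s = (cube_plus(x - t[j])
             - cube_plus(x - t[-2]) * (t[-1] - t[j]) / last_gap
             + cube_plus(x - t[-1]) * (t[-2] - t[j]) / last_gap)
        terms.append(s / scale)
    return terms


# Hypothetical knot locations for an embryo count (not the study's values).
knots = [0, 1, 3, 6, 12]
basis_at_4 = rcs_basis(4, knots)
```

With five knots the basis has three nonlinear terms, so a likelihood ratio test of linearity compares models that differ by three parameters. The construction also forces the fitted curve to be linear below the first knot and above the last one.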

Five-knot restricted cubic spline analyses were also used to explore the functional form of relationship between outcome, total number of good-quality embryos, number of embryos and age.

Statistical analyses and plotting of graphs were performed using the SAS 9.2 statistical package (SAS, Inc., Cary, NC, USA) and the R statistical package (www.r-project.org).

Results

We evaluated a single completed treatment cycle in IVF/ICSI among 2450 infertile women in our unit within the study period. A total of 1425 clinical pregnancies with 905 (63.5%) from fresh cycles were reported. The cumulative clinical pregnancy rates for IVF and ICSI were 59.8 and 54%, respectively, whereas 50% IVF + 50% ICSI yielded a slightly lower rate (52.3%). The patients' and cycle characteristics by cumulative clinical pregnancy are summarized in Table I. The two groups differed significantly for most of the variables studied. Among 256 patients in validation data, 74% received IVF treatment and 26% received ICSI treatment; the cumulative clinical pregnancy rate was 64%; the mean age was 32.1 ± 4.69 years, the mean number of embryos available for transfer was 7.61 ± 4.97 and the mean number of good-quality embryos was 4.96 ± 3.96.

Table I

Patient and cycle characteristics by cumulative pregnancy outcome

Variables Pregnant (n= 1425) Not pregnant (n= 1025) P-value 
Protocol 
 Long protocol 1009 (70.81%) 643 (62.73%) 0.0001a 
 Short protocol 369 (25.89%) 350 (34.15%) 
 GnRH antagonist protocol 47 (3.3%) 32 (3.12%) 
Insemination method 
 IVF 993 (69.68%) 667 (65.07%) 0.0873a 
 ICSI 223 (15.65%) 190 (18.54%) 
 50% IVF + 50% ICSI 56 (3.93%) 51 (4.98%) 
 NSA-ICSI 153 (10.74%) 117 (11.41%) 
Diagnosis of infertility 
 Tubal factor 809 (56.77%) 536 (52.29%) 0.0336a 
 Male factor 279 (19.58%) 221 (21.56%) 
 Endometriosis 40 (2.81%) 47 (4.59%) 
 Ovarian factor 47 (3.3%) 23 (2.24%) 
 Unexplained 81 (5.68%) 67 (6.54%) 
 Other reasons 169 (11.86%) 131 (12.78%) 
Type of infertility 
 Primary infertility 697 (48.91%) 518 (50.59%) 0.4139a 
 Secondary infertility 728 (51.09%) 506 (49.41%)  
Age (year) 30.55 ± 3.71 31.90 ± 4.30 <0.001b 
Antral follicle count 10 (1–45) 8 (1–40) <0.001c 
BMI (kg/m2) 20.94 ± 2.79 21.04 ± 2.75 0.375b 
Basal serum FSH level (mIU/ml) 6.25 ± 1.93 6.55 ± 2.29 0.001b 
Basal serum E2 level (pg/ml) 36.21 ± 18.64 37.63 ± 17.20 0.055b 
Duration of infertility (year) 4 (0.1–18) 4.6 (0.4–18) <0.001c 
Mean ovarian volume (cm3) 5.75 ± 2.73 5.42 ± 2.89 0.005b 
Total ovarian volume (cm3) 11.49 ± 5.47 10.83 ± 5.78 0.005b 
E2 on hCG injection day (pg/ml) 3610.5 ± 2281.3 2958.3 ± 2051.5 <0.001c 
Progesterone on hCG injection day (ng/ml) 1.22 ± 0.74 1.25 ± 0.77 0.359b 
LH on hCG injection day (mIU/ml) 1 (0.2–19) 1.1 (0.2–23.6) 0.001c 
E2 on the day after hCG injection (pg/ml) 4286.7 ± 2839.5 3548.9 ± 2659.3 <0.001c 
Number of follicles [10, 14 mm) 3 (0–30) 3 (0–34) <0.001c 
Number of follicles [14, 18 mm) 7 (0–28) 6 (0–23) <0.001c 
Number of follicles ≥18 mm 3 (0–11) 2 (0–13) <0.001c 
Number of follicles <10 mm 6 (0–30) 5 (0–20) <0.001c 
Number of follicles ≥14 mm 10 (0–33) 8 (1–28) <0.001c 
Endometrial thickness (mm) 10.88 ± 2.12 10.40 ± 2.22 <0.001b 
Number of oocytes retrieved 17 ± 8.5 12.9 ± 7.2 <0.001c 
Number of oocytes fertilized 13 ± 6.9 8 ± 5.1 <0.001c 
Fertilization rate 0.75 ± 0.19 0.64 ± 0.24 <0.001b 
Total number of embryos 12.3 ± 6.8 7.6 ± 4.9 <0.001c 
Total number of good-quality embryos 6.2 ± 4.3 2.8 ± 2.8 <0.001c 

aPearson χ2.

bTwo-sample t-test.

cTwo-sample Wilcoxon test.

The bootstrap stepwise variable selection algorithm chose the following nine predictors that were present in at least 65% of 5000 runs: (i) total number of good-quality embryos (100%); (ii) total number of embryos (99%); (iii) progesterone level on the day of hCG injection (92%); (iv) endometrial thickness on the day of hCG injection (92%); (v) antral follicle count (AFC) (88%); (vi) duration of infertility (77%); (vii) age (74%); (viii) fertilization rate (69%); and (ix) number of follicles measuring between 10 and 14 mm in diameter on the day of hCG injection (69%). Although the list of predictors selected could differ from run to run, both forward and backward selection algorithms determined the same final list of predictors with 65% of 5000 runs as the cut-off, which confirms the stability of the predictors we identified.

Table II summarizes the following statistics from the multivariable model: P-values, odds ratios (ORs) and corresponding 95% CIs. The modified Hosmer–Lemeshow goodness-of-fit Z-test statistic (Harrell, 2001) was 0.609 (P > 0.05), which suggested that the multivariable model was a good fit. Table III summarizes the predictive power of the nine predictors in terms of c-index (AUC) and marginal and partial PEVs. For the total number of good-quality embryos, the marginal PEV was 19.05% and the partial PEV was 5.16%, both of which were significantly higher than the corresponding PEVs of the eight other predictors (P-values <0.001). The c-indexes of the total number of good-quality embryos and total number of embryos were the highest among all the predictors.

Table II

Multivariable analysis of predictors

Predictors ORs (95% CI) P-value 
Age  0.055a 
 ≥35 0.672 (0.481–0.939) 0.020b 
 [30, 35) 0.939 (0.727–1.213) 0.628b 
 [28, 30) 1.023 (0.763–1.372) 0.880b 
 (0, 28)  
Antral follicle count  0.019a 
 ≥15 1.769 (1.208–2.593) 0.003b 
 [9, 15) 1.594 (1.161–2.190) 0.004b 
 [6, 9) 1.451 (1.056–1.994) 0.022b 
 (0, 6)  
Duration of infertility  0.050a 
 ≥8 0.660 (0.481–0.907) 0.010b 
 [5, 8) 0.876 (0.670–1.146) 0.334b 
 [2.5, 5) 0.965 (0.748–1.244) 0.782b 
 (0, 2.5)  
Progesterone on hCG injection day  0.025a 
 ≥1.2 0.796 (0.651–0.972) 0.025b 
 (0, 1.2)  
Number of follicles [10, 14 mm)  0.084a 
 ≥4 0.841 (0.690–1.024) 0.084b 
 [0, 4)  
Endometrial thickness  0.016a 
 ≥13 1.477 (1.099–1.985) 0.010b 
 [10, 13) 1.246 (1.020–1.521) 0.031b 
 (0, 10)  
Fertilization rate  0.035a 
 ≥65% 1.262 (1.017–1.566) 0.035b 
 <65%  
Total number of embryos  <0.001a 
 ≥15 2.619 (1.696–4.045) <0.001b 
 [10, 15) 2.223 (1.549–3.191) <0.001b 
 [5, 10) 1.590 (1.184–2.137) 0.002b 
 (0, 5)  
Total number of good-quality embryos  <0.001a 
 ≥9 15.501 (9.142–26.284) <0.001b 
 [6, 9) 9.872 (6.312–15.439) <0.001b 
 [4, 6) 5.765 (3.768–8.819) <0.001b 
 [1, 4) 3.499 (2.377–5.149) <0.001b 
 0  
Modified Hosmer–Lemeshow goodness-of-fit test statistic Z = 0.6092, P-value = 0.5424 

aP-value of type-3 χ2 test for each variable's overall effects after adjusting for the other variables.

bP-value of χ2 test between each variable's subgroups and reference group.

Table III

Relative importance of predictors retained in the multivariable model

 PEV marginal (%) PEV partial (%) AUC 
Total number of good-quality embryos 19.05 5.16 0.744 (0.726–0.763) 
Total number of embryos 15.07** 0.79*** 0.715 (0.695–0.734) 
Antral follicle count 6.15*** 0.39*** 0.632 (0.611–0.654) 
Fertilization rate 4.84*** 0.12*** 0.607 (0.588–0.626) 
Age 2.58*** 0.25*** 0.581 (0.559–0.603) 
Duration of infertility 1.06*** 0.30*** 0.554 (0.532–0.576) 
Endometrial thickness 0.81*** 0.31*** 0.548 (0.527–0.569) 
Number of follicles [10, 14 mm) 0.32*** 0.10*** 0.528 (0.508–0.548) 
Progesterone on hCG day 0.15*** 0.17*** 0.520 (0.500–0.540) 

***Significantly different from relative importance of number of good embryos at α = 0.0001 level.

**Significantly different from relative importance of number of good embryos at α = 0.001 level.

ROC analysis of training data, independent validation data and 10-fold cross-validations are presented in Fig. 1. The c-index of multivariable model was 0.778 (0.763–0.801) (Fig. 1A) and the bias-corrected c-index (Harrell, 2001) was 0.77 (bootstrap n= 5000). The averaged c-index estimate from 10-fold cross-validation was 0.77 (Fig. 1B). The closeness of these values supported the robust predictive ability of the multivariable model. At the optimal cut-off point of the probability of achieving a clinical pregnancy 0.537, the sensitivity was 0.763 (0.739–0.783) and specificity was 0.647 (0.617–0.676). The c-index for the independent validation data was 0.775 (0.712–0.838) (Fig. 1A). The sensitivity was 0.707 (0.634–0.772) and the specificity was 0.707 (0.607–0.789) at the same cut-off of 0.537.
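The sensitivity and specificity at a fixed probability cut-off can be computed as in the sketch below. The function name `sens_spec`, the treatment of values exactly at the cut-off as positive and the toy inputs are assumptions for illustration; only the cut-off value 0.537 comes from the text.

```python
def sens_spec(probabilities, outcomes, cutoff=0.537):
    """Sensitivity and specificity when 'pregnant' is predicted for p >= cutoff."""
    tp = fn = tn = fp = 0
    for p, y in zip(probabilities, outcomes):
        # Assumption: a probability exactly at the cut-off counts as positive.
        predicted_pregnant = p >= cutoff
        if y == 1 and predicted_pregnant:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_pregnant:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case in each outcome group")
    sensitivity = tp / (tp + fn)  # pregnant cases correctly predicted
    specificity = tn / (tn + fp)  # non-pregnant cases correctly predicted
    return sensitivity, specificity


# Made-up predicted probabilities and outcomes (not study data).
probabilities = [0.9, 0.6, 0.5, 0.7, 0.3, 0.2]
outcomes = [1, 1, 1, 0, 0, 0]
sensitivity, specificity = sens_spec(probabilities, outcomes)
```

Applying the same cut-off to training and validation data, as the text does, keeps the two sets of sensitivity and specificity estimates directly comparable.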

Figure 1

(A) ROC curves for training data (solid line) and independent validation data (dashed line); (B) ROC curves of 10-fold cross-validation.

Table IV reports the c-indexes of age, total number of embryos and good-quality embryos in different age strata (<35, 35–40 and ≥40 years). Total number of good-quality embryos possessed better predictive value than age in two age groups (<35 and 35–40 years, P< 0.001), whereas age achieved the best predictive value (c-index = 0.73) for patients ≥40 years although there were no statistically significant differences due to small sample size.

Table IV

Comparisons of AUC within age strata

Variable Age < 35 (n= 2057) 35 ≤ Age < 40 (n= 393) Age ≥ 40 (n= 50) 
 c-Index (AUC) P-value c-Index (AUC) P-value c-Index (AUC) P-value 
Total number of good-quality embryos 0.756 (0.735–0.777) <0.0001 0.761 (0.711–0.811) 0.0002 0.654 (0.477–0.831) 0.5426 
Total number of embryos 0.714 (0.692–0.737) <0.0001 0.762 (0.711–0.812) <0.0001 0.610 (0.424–0.796) 0.3123 
Age 0.540 (0.515–0.566)  0.619 (0.560–0.677)  0.734 (0.586–0.883)  

Results of restricted cubic spline analyses are reported in Fig. 2. Figure 2B shows a decreasing trend in the chance of conceiving with increasing age and a sharp decline after 35. Figure 2C and D reveal decreasing trends in both the total number of embryos and the number of good-quality embryos with increasing age, with a sharp decline after 35.

Figure 2

(A) Log odds of clinical pregnancy according to the total number of good-quality embryos as a continuous variable while setting other factors to baseline values. Dashed lines are 95% CIs. The curves were estimated using logistic regression modeling adjusting for eight other predictors. (B) Restricted cubic spline curve and 95% CIs between age and log odds of clinical pregnancy. (C) Restricted cubic spline curve and raw data points between age and total number of embryos. (D) Restricted cubic spline curve and raw data points between age and total number of good-quality embryos.

Figure 2A reveals a nonlinear relationship between the total number of good-quality embryos on a continuous scale and the cumulative outcome. The likelihood ratio χ2 test of the linearity assumption was 28.04 (P < 0.0001, df = 3), which provided strong evidence of a nonlinear relationship between the total number of good-quality embryos and the log-odds of achieving a clinical pregnancy after a completed IVF/ICSI cycle. Figure 2A suggests a change in the increasing trend in log-odds of achieving a clinical pregnancy. The two-segment multivariable logistic regression model estimated the change point as 1.916 (95% CI: 1.558–2.273), with the regression coefficient before the change point (first segment) β1 as 0.750 (95% CI: 0.599–0.902) and after the change point (second segment) β2 as 0.165 (95% CI: 0.148–0.180). Among women with no more than two good-quality embryos, one extra good embryo increased the odds of achieving a clinical pregnancy by 111.2%, about six times the increase (17.4%) among women with more than two good-quality embryos, while holding other factors constant.
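The link between the segment coefficients and the percentage increase in odds is exp(β) − 1. The sketch below applies it to the rounded coefficients quoted above and also writes out a two-segment log-odds contribution. The function names are hypothetical, and the assumption that the two segments join continuously at the change point is ours. Because the published coefficients are rounded, the computed values come out close to, but not exactly at, the reported 111.2% and 17.4%.

```python
import math


def percent_increase_in_odds(beta):
    """Percentage change in odds per one-unit increase, given a log-odds slope."""
    return (math.exp(beta) - 1.0) * 100.0


def log_odds_contribution(n_good, beta1=0.750, beta2=0.165, change_point=1.916):
    """Two-segment (change point) contribution of the number of good-quality
    embryos to the log-odds, relative to zero embryos.

    Assumption: the segments join continuously at the change point.
    """
    if n_good <= change_point:
        return beta1 * n_good
    return beta1 * change_point + beta2 * (n_good - change_point)


# Rounded coefficients from the text; other predictors are held constant.
before = percent_increase_in_odds(0.750)  # slope before the change point
after = percent_increase_in_odds(0.165)   # slope after the change point
ratio = before / after
```

With these inputs `before` is roughly 112% and `after` roughly 18%, a ratio of about six, which matches the comparison drawn in the text.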

Discussion

This large retrospective study is the first designed to examine predictors of the chance of pregnancy for a patient after her first completed IVF/ICSI cycle. To the best of our knowledge, all previous studies were performed for either fresh cycles only or frozen–thawed cycles only. However, as a retrospective study, our analysis depends on data that are recorded, and certain variables cannot be collected. The total number of good-quality embryos was the most important predictor but its predictive value declined among women >40 years, which was most likely due to the inadequacy of the current embryo morphology scoring system to measure embryo quality accurately in older patients.

Although a recent meta-analysis study (Van Loendersloot et al., 2010) still listed basal FSH as an important predictor of pregnancy, it is not surprising that AFC was selected in the model instead of basal FSH in our study. Many studies have revealed that AFC is a better predictor of ovarian response and IVF outcome than basal FSH (Nahum et al., 2001; Klinkert et al., 2005; Maseelall et al., 2009).

Our study is the first to discover that the number of follicles measuring between 10 and 14 mm in diameter on the day of hCG injection is an independent and negative predictor of clinical pregnancy chance. The development of several stimulated follicles is often not perfectly synchronous. In general, 15–20% of oocytes are arrested in the germinal vesicle stage or at metaphase I after hCG injection during controlled ovarian hyperstimulation in IVF, rather than advancing to metaphase II to complete meiotic maturation. Clinically, the diameter of follicles is an important indicator to judge the maturation of oocytes and to decide the timing of hCG injection. Thus, we speculate that the number of 10–14 mm follicles on the day of hCG injection could be a marker for the quality of oocytes to some extent and therefore affects the cumulative outcome of IVF/ICSI.

Two embryonic variables, the total number of good-quality embryos and total number of embryos, ranked as the most important and second most important predictors of clinical pregnancy chance after a completed IVF/ICSI cycle (Table III). Most previous studies focused on the quality and number of transferred embryos because they aimed at the outcome of fresh cycles (Terriou et al., 2001; Hunault et al., 2002; Lee et al., 2006; Ottosen et al., 2007). We used the total number of good-quality embryos and total number of embryos to assess the quality and quantity of all the embryos from one ovarian stimulated cycle in our study. Among 5000 stepwise variable selections, the total number of embryos and total number of good-quality embryos were retained in the same model 99% of times and the partial PEV for the total number of embryos is the second highest among all predictors. These results suggest that the total number of embryos contributes additional information to the predictive model that the total number of good-quality embryos lacks. It may be a surrogate marker for hormone factors and may act through the uterine receptivity (Roberts et al., 2010).

Age is well known as one of the most important factors predicting IVF/ICSI outcome (Roseboom et al., 1995; Templeton et al., 1996; Lintsen et al., 2007; Van Loendersloot et al., 2010). Our study shows that the chance of conceiving after a completed treatment cycle decreases as age increases and there is a sharp decline in women >35 years (Fig. 2B). This sharp decline coincides with the corresponding deeper drop in both the total number of embryos (Fig. 2C) and total number of good-quality embryos (Fig. 2D) after age 35. These findings suggest that the age-related decline on the success in IVF most likely lies in the progressively diminished ovarian reserve, with decreases in both quantity and quality of oocytes (Broekmans et al., 2006, 2007; Van Loendersloot et al., 2010).

Several studies have attempted to compare the predictive value of embryo quality and age in fresh cycles. Terriou et al. (2001) reported that the mean score of transferred embryos was a better predictor of pregnancy than age. Lee et al. (2006) found that embryo quality was more important for younger women (≤35), whereas age was more important for older women (>35). In the present study, the predictive value of the total number of good-quality embryos, measured by marginal and partial PEV, was significantly higher than that of age in the overall population (Table III).

To avoid a potentially biased conclusion due to the higher proportion of younger patients in our data, we stratified the patients into three age groups and compared the predictive value of age, total number of embryos and total number of good-quality embryos within each age stratum (Table IV). The total number of good-quality embryos and the total number of embryos were still more predictive than age for women <40 years. Both the total number of embryos and the total number of good-quality embryos decrease after age 35 (Fig. 2C and D), but this decline is not large enough to affect their predictive power for women aged 35–40 years.

The c-indexes of the age variable in the three age strata were 0.540, 0.619 and 0.734, respectively (Table IV). This increasing trend in the predictive value of age may be attributed to the fact that age is a surrogate variable for the decreasing quantity and quality of embryos. However, the observation that the total number of good-quality embryos is less predictive than age for women ≥40 years should not be interpreted as meaning that embryo quality is less important for older women.

Figure 3A shows a decreasing trend in the cumulative clinical pregnancy rate across age groups within each group defined by the total number of good-quality embryos, with an even steeper decline in women ≥35 years. To confirm this graphical finding, we performed case–control matching by pairing one pregnant woman with one non-pregnant woman having exactly the same total number of good-quality embryos (number of pairs = 740) and ran a conditional logistic regression including age, total number of embryos and endometrial thickness, etc. (data not shown). There was still a decreasing trend with age (β = −0.056 per year, OR per 5 years = 0.756, P < 0.0001) and a significantly lower chance of achieving a clinical pregnancy for women ≥35 years (OR for <28 versus ≥35 years = 1.937, P = 0.0003; OR for [28, 30) versus ≥35 years = 2, P = 0.0002; OR for [30, 35) versus ≥35 years = 1.55, P = 0.007), even after matching on the total number of good-quality embryos. These findings suggest that the decrease in cumulative clinical pregnancy rate for women ≥35 years is not entirely explained by poor embryo morphology scores. However, the high success rates observed in oocyte donation programs demonstrate that the age-related decline in fertility is primarily attributable to a decrease in oocyte quality and quantity rather than to poor endometrial receptivity (Sauer et al., 1990). Taken together, these findings imply that the current morphological scoring system only partially reflects embryo quality.
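The relation between the reported age coefficient and the 5-year odds ratio can be checked with a short calculation. The sketch below assumes that β is expressed per year of age on the log-odds scale, which is consistent with the reported figures; the variable names are illustrative only.

```python
import math

# Reported coefficient for age (log-odds scale); assumed to be per year of age.
beta_per_year = -0.056

# An odds ratio for a k-unit increase is exp(k * beta).
or_per_year = math.exp(beta_per_year)
or_per_5_years = math.exp(5 * beta_per_year)

print(round(or_per_5_years, 3))  # 0.756, matching the reported OR per 5 years
```

In other words, within pairs matched on the total number of good-quality embryos, each additional 5 years of age multiplies the odds of clinical pregnancy by about 0.76.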

Figure 3

(A) The cumulative pregnancy rate for four groups defined by the total number of good-quality embryos, across four age groups. (B) The ratio of the cumulative pregnancy rates between the groups with the highest and lowest numbers of good-quality embryos, across four age groups.

This inadequacy of embryo morphology as a measure of embryo quality, particularly in older women, might explain in part why the total number of good-quality embryos was less predictive for older women (≥40 years) in our study. The predictive ability of a variable can be seen as the degree of ‘concordance’ between the values of this explanatory variable and the outcome. For instance, a patient with more good-quality embryos should be more likely to conceive. However, aneuploidy increases with maternal age and is as high as 17% in patients ≥40 years. Moreover, in older patients, aneuploidy occurs more frequently in embryos with good morphology and development rate than in poorly developing embryos (Marquez et al., 2000). Hence, the ‘concordance’ between the total number of good-quality embryos as measured by morphology and the IVF outcome may be most distorted for women ≥40 years, which makes the total number of good-quality embryos less predictive.
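The notion of concordance described above can be made concrete with a minimal sketch of the concordance index for a single predictor. The six patients below are invented for illustration and are not study data; the function name `c_index` is likewise hypothetical.

```python
def c_index(values, outcomes):
    """Concordance index for one predictor and a binary outcome.

    Among all (pregnant, non-pregnant) pairs, the proportion in which the
    pregnant woman has the higher predictor value; ties count as 0.5.
    """
    pos = [v for v, y in zip(values, outcomes) if y == 1]
    neg = [v for v, y in zip(values, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one patient in each outcome group")
    concordant = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                concordant += 1.0
            elif p == q:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


# Invented toy data: number of good-quality embryos and pregnancy (1 = yes).
good_embryos = [0, 1, 2, 3, 4, 5]
pregnant = [0, 0, 1, 0, 1, 1]

print(c_index(good_embryos, pregnant))  # 8 of 9 pairs are concordant
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination. For a predictor with a negative effect, such as age, the same function can be applied to the negated values.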

Although the cumulative clinical pregnancy rate decreases as age increases (Fig. 3A), embryo quality has more influence on IVF/ICSI outcome with increasing age, particularly for older women (Fig. 3B). The cumulative pregnancy rate was 8-fold higher in older women with the highest number of good-quality embryos than in those with the lowest (75 versus 8.7%). This finding emphasizes the need for careful selection of embryos for all patients, but especially for women ≥35 years. Parameters that define embryo quality more accurately are required, especially for older patients. These could include an improved embryo scoring system, spindle visualization by Polscope imaging (Wang et al., 2001), metabolic characteristics or chromosomal characteristics, aneuploidy in particular (Kuliev, 2002).

Although more sophisticated algorithms such as decision trees (C5) and artificial neural networks could be more accurate in prediction than traditional logistic regression models (Delen et al., 2005), the logistic regression model is simple to use and easy to interpret. The exponentiated regression coefficients are ORs, a well-understood concept among clinical researchers. Moreover, overfitting and poor generalizability are more of a concern for sophisticated learning algorithms. We therefore used logistic regression to construct our prediction model and expect it to be easily validated by other researchers.

The prediction model developed in the present study was cross-validated internally in a large training data set and validated externally using independent data with robust performance, which supports its value as a tool for clinical decision making. The model could be used to decide which patients should get priority for elective SET by deriving a risk score. Patients with high scores could be selected as potential candidates for elective SET. Although the training and validation data in the current study came from two independent IVF centers, the accuracy of the prediction model could still be improved.
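How a score derived from a logistic model might be used to flag candidates for elective SET can be sketched as below. Everything numeric here is hypothetical: the coefficients, the intercept, the two predictors and the 0.5 threshold are invented for illustration and are not the fitted values of the model described in this article.

```python
import math

# HYPOTHETICAL log-odds coefficients, invented for illustration only.
INTERCEPT = 0.5
COEFS = {"n_good_embryos": 0.30, "age_years": -0.05}


def predicted_probability(patient):
    """Predicted chance of clinical pregnancy from a logistic model."""
    linear_predictor = INTERCEPT + sum(COEFS[k] * patient[k] for k in COEFS)
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def eset_candidates(patients, threshold=0.5):
    """Return IDs of patients whose predicted probability meets the threshold."""
    return [pid for pid, p in patients.items()
            if predicted_probability(p) >= threshold]


# Two invented patients.
patients = {
    "A": {"n_good_embryos": 6, "age_years": 28},
    "B": {"n_good_embryos": 1, "age_years": 41},
}

for pid, p in patients.items():
    print(pid, round(predicted_probability(p), 3))
print(eset_candidates(patients))
```

In practice, the threshold would need to be chosen and validated against the trade-off between pregnancy rate and multiple-birth risk, which this sketch does not address.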

Authors' roles

Q.C.: substantial contributions to conception and design, acquisition of data, drafting and revising the article and final approval of the version to be published; F.W.: substantial contributions to analysis and interpretation of data, manuscript drafting and final approval of the version to be published; R.H.: substantial contributions to acquisition of data, revising the article and final approval of the version to be published; H.Z.: substantial contributions to acquisition of independent validation data, revising the article critically for important intellectual content and final approval of the version to be published.

Acknowledgements

We thank Dan Hu and Kai Huang from Tongji Medical College, Huazhong University of Science and Technology, for their assistance in validation data collection.

References

Austin PC, Tu JV. Bootstrap methods for developing predictive models. Am Stat, 2004, vol. 58 (pg. 131-137).

Beyene J, Atenafu EG, Hamid JS, To T, Sung L. Determining relative importance of variables in developing and validating predictive models. BMC Med Res Methodol, 2009, vol. 9 pg. 64.

Broekmans FJ, Kwee J, Hendriks DJ, Mol BW, Lambalk CB. A systematic review of tests predicting ovarian reserve and IVF outcome. Hum Reprod Update, 2006, vol. 12 (pg. 685-718).

Broekmans FJ, Knauff EA, te Velde ER, Macklon NS, Fauser BC. Female reproductive ageing: current knowledge and future trends. Trends Endocrinol Metab, 2007, vol. 18 (pg. 58-65).

China Daily. Country's Infertility Rate ‘in the Rise’. http://www.chinadaily.com.cn/china/2010-02/27/content_9512682.htm. (4 June 2011, date last accessed).

Delen D, Walker G, Kadam A. Predicting breast cancer survivability: a comparison of three data mining methods. Artif Intell Med, 2005, vol. 34 (pg. 113-127).

Elizur SE, Lerner-Geva L, Levron J, Shulman A, Bider D, Dor J. Factors predicting IVF treatment outcome: a multivariate analysis of 5310 cycles. Reprod Biomed Online, 2005, vol. 10 (pg. 645-649).

Harrell FE. Regression Modeling Strategies. 2001. New York: Springer.

Hastie T, Tibshirani R. Generalized additive models: some applications. J Am Stat Assoc, 1987, vol. 82 (pg. 371-386).

Hunault CC, Eijkemans MJ, Pieters MH, te Velde ER, Habbema JD, Fauser BC, Macklon NS. A prediction model for selecting patients undergoing in vitro fertilization for elective single-embryo transfer. Fertil Steril, 2002, vol. 77 (pg. 725-732).

Klinkert ER, Broekmans FJ, Looman CW, Habbema JD, te Velde ER. The antral follicle count is a better marker than basal follicle-stimulating hormone for the selection of older patients with acceptable pregnancy prospects after in vitro fertilization. Fertil Steril, 2005, vol. 83 (pg. 811-814).

Kuliev A. Chromosomal abnormalities in a series of 6733 human oocytes in preimplantation diagnosis for age related aneuploidies. Reprod Biomed Online, 2002, vol. 6 (pg. 54-59).

Ledger WL, Anumba D, Marlow N, Thomas CM, Wilson EC. The costs to the NHS of multiple births after IVF treatment in the UK. BJOG, 2006, vol. 113 (pg. 21-25).

Lee TH, Chen CD, Tsai YY, Chang LJ, Ho HN, Yang YS. Embryo quality is more important for younger women whereas age is more important for older women with regard to in vitro fertilization outcome and multiple pregnancy. Fertil Steril, 2006, vol. 86 (pg. 64-69).

Lintsen AM, Eijkemans MJ, Hunault CC, Bouwmans CA, Hakkaart L, Habbema JD, Braat DD. Predicting ongoing pregnancy chances after IVF and ICSI: a national prospective study. Hum Reprod, 2007, vol. 22 (pg. 2455-2462).

Marquez C, Sandalinas M, Bahce M, Alikani M, Munne S. Chromosome abnormalities in 1255 cleavage-stage human embryos. Reprod Biomed Online, 2000, vol. 1 (pg. 17-26).

Maseelall PB, Hernandez-Rey AE, Oh C, Maagdenberg T, McCulloh DH, McGovern PG. AFC is a significant predictor of livebirth in in vitro fertilization cycles. Fertil Steril, 2009, vol. 91 (pg. 1595-1597).

Mittlbock M, Schemper M. Explained variation for logistic regression. Stat Med, 1996, vol. 15 (pg. 1987-1997).

Nahum R, Shifren JL, Chang Y, Leykin L, Isaacson K, Toth TL. Antral follicle assessment as a tool for predicting outcome in IVF: is it a better predictor than age and FSH? J Assist Reprod Genet, 2001, vol. 18 (pg. 151-155).

Ottosen LD, Kesmodel U, Hindkjaer J, Ingerslev HJ. Pregnancy prediction models and eSET criteria for IVF patients: do we need more information? J Assist Reprod Genet, 2007, vol. 24 (pg. 29-36).

Rhodes TL, McCoy TP, Higdon HL III, Boone WR. Factors affecting assisted reproductive technology (ART) pregnancy rates: a multivariate analysis. J Assist Reprod Genet, 2005, vol. 22 (pg. 335-346).

Roberts SA, Fitzgerald CT, Brison DR. Modelling the impact of single-embryo transfer in a national health service IVF programme. Hum Reprod, 2009, vol. 24 (pg. 122-131).

Roberts SA, Hirst WM, Brison DR, Vail A, towardSET collaboration. Embryo and uterine influences on IVF outcomes: an analysis of a UK multi-centre cohort. Hum Reprod, 2010, vol. 25 (pg. 2792-2802).

Roberts SA, McGowan L, Mark Hirst W, Vail A, Rutherford A, Lieberman BA, Brison DR, towardSET Collaboration. Reducing the incidence of twins from IVF treatments: predictive modelling from a retrospective cohort. Hum Reprod, 2011, vol. 26 (pg. 569-575).

Roseboom TJ, Wermeiden JPW, Schoute E, Lens JW, Schats R. The probability of pregnancy after embryo transfer is affected by the age of the patients, cause of infertility, number of embryos transferred and the average morphology score, as revealed by multiple logistic regression analysis. Hum Reprod, 1995, vol. 10 (pg. 3035-3041).

Sauer MV, Paulson RJ, Lobo RA. A preliminary report on oocyte donation extending reproductive potential to women over 40. N Engl J Med, 1990, vol. 323 (pg. 1157-1160).

Schemper M. The relative importance of predictors in studies of survival. Stat Med, 1993, vol. 12 (pg. 2377-2382).

Steyerberg EW, Eijkemans MJ, Harrell FE Jr, Habbema JD. Prognostic modeling with logistic regression analysis: in search of a sensible strategy in small data sets. Med Decis Making, 2001, vol. 21 (pg. 45-56).

Strandell A, Bergh C, Lundin K. Selection of patients suitable for one-embryo transfer may reduce the rate of multiple births by half without impairment of overall birth rates. Hum Reprod, 2000, vol. 15 (pg. 2520-2525).

Templeton A, Morris JK, Parslow W. Factors that affect outcome of in vitro fertilisation treatment. Lancet, 1996, vol. 348 (pg. 1402-1406).

Terriou P, Sapin C, Giorgetti C, Hans E, Spach JL, Roulier R. Embryo score is a better predictor of pregnancy than the number of transferred embryos or female age. [Comment]. Fertil Steril, 2001, vol. 75 (pg. 525-531).

Ulm K. A statistical method for assessing a threshold in epidemiological studies. Stat Med, 1991, vol. 10 (pg. 341-349).

Van Loendersloot LL, van Wely M, Limpens J, Bossuyt PM, Repping S, van der Veen F. Predictive factors in in vitro fertilization (IVF): a systematic review and meta-analysis. Hum Reprod Update, 2010, vol. 16 (pg. 577-589).

Wang WH, Meng L, Hackett RJ, Keefe DL. Developmental ability of human oocytes with or without birefringent spindles imaged by Polscope before insemination. Hum Reprod, 2001, vol. 16 (pg. 1464-1468).