Hormone References for Ultrasound Breast Staging and Endocrine Profiling to Detect Female Onset of Puberty

Abstract
Context: Application of ultrasound (US) to evaluate attainment and morphology of glandular tissue provides a new rationale for evaluating onset and progression of female puberty, but currently no hormone references complement this method. Furthermore, previous studies have not explored the predictive value of endocrine profiling to determine female puberty onset.
Objective: To integrate US breast staging with hypothalamic-pituitary-gonadal hormone references and test the predictive value of an endocrine profile to determine thelarche.
Design, Setting, and Participants: Cross-sectional sample of 601 healthy Norwegian girls, ages 6 to 16 years.
Main Outcome Measures: Clinical and ultrasound breast evaluations were performed for all included girls. Blood samples were analyzed by immunoassay and ultrasensitive liquid chromatography–tandem mass spectrometry (LC-MS/MS) to quantify estradiol (E2) and estrone (E1) from the subpicomolar range.
Results: References for E2, E1, luteinizing hormone, follicle-stimulating hormone, and sex hormone–binding globulin were constructed in relation to chronological age, Tanner stages, and US breast stages. An endocrine profile index score derived from principal component analysis of these analytes was a better marker of puberty onset than age or any individual hormone, with receiver-operating characteristic area under the curve 0.91 (P < 0.001). Ultrasound detection of nonpalpable glandular tissue in 14 out of 264 (5.3%) girls with clinically prepubertal presentation was associated with significantly higher median serum levels of E2 (12.5 vs 4.9 pmol/L; P < 0.05) and a distinct endocrine profile (arbitrary units; P < 0.001).
Conclusions: We provide the first hormone references for use with US breast staging and demonstrate the application of endocrine profiling to improve detection of female puberty onset.

A comprehensive metastudy recently established a statistically significant and still ongoing trend of earlier breast development (thelarche, ie, puberty onset) in girls, corresponding to almost 3 months per decade since 1977 (1). This finding underlines the need to compile updated references for puberty development milestones and pertinent hormones used in the diagnosis and management of altered puberty onset in the current pediatric population.
Hypothalamic activation initiates female puberty by enhanced release of pituitary gonadotropins, mainly luteinizing hormone (LH) and follicle-stimulating hormone (FSH), to stimulate gonadal maturation (2). Abnormal circulating levels of FSH and LH in conjunction with gonadal hormones can support the diagnosis of pituitary malfunction, hypogonadism, and various disorders of sexual development associated with altered puberty onset (3)(4)(5). The main female sex steroids estrone (E1) and estradiol (E2) are involved in development and maintenance of the female phenotype and gonadal function. In particular, E2 is an integral biomarker for assessing female pubertal timing, menstrual status, and fertility.
Female puberty onset is traditionally defined by breast formation, specifically, attainment of palpable glandular breast tissue and areolar enlargement, corresponding to Tanner stage B2 (6). In contrast, ultrasound (US) leverages short-wavelength echogenicity of different tissues to render visual representations of internal anatomic structures and morphology. Evaluation of breast glandular tissue by US is a new approach for assessing female puberty, but its clinical utility remains unexplored and compatible hormone references have been lacking. Notably, formation of glandular tissue in early thelarche may be detectable by US but not palpable by hand (7). In the pediatric subspecialist setting, US breast staging may improve clinical investigations of altered puberty timing since this method allows for storage and objective retrospective assessments of digital images during longitudinal patient follow-up. Moreover, the procedure is harmless (8,9); is perceived by the patient as less invasive than palpation (our unpublished observations); and provides better definition of breast maturation than Tanner stages (10), while being able to differentiate between glandular and adipose tissue in overweight girls where excess fat accretion may confound traditional breast staging by palpation and visual inspection (11,12). Age references for breast staging by both clinical Tanner and US breast examinations are provided (Fig. 1) in line with our previously published population study (7).
In medicine and biology, principal component analysis (PCA) is a statistical approach to stratify patients or identify phenotype clusters by capturing the variance from several variables, or dimensions (13). Previous applications of PCA in pediatric research include composite risk index scoring for metabolic syndrome (14) and identification of allergy phenotypes (15). Endocrine profiling by PCA was previously demonstrated to identify distinct subtypes of Cushing syndrome (16), improve the predictive value of newborn screening for congenital adrenal hyperplasia (17), and extract endocrine phenotypes with implications for puberty timing in a longitudinal study of female puberty (18). However, the utility of a reference endocrine profile as diagnostic marker of puberty onset remains unexplored.
From the female pediatric population sample in the cross-sectional Bergen Growth Study 2 (BGS2) in Norway, we aimed to establish hormone references in relation to both traditional Tanner and US breast stages. We leveraged a newly established in-house liquid chromatography-tandem mass spectrometry (LC-MS/MS) method to quantify estrogens in the subpicomolar range (19) in order to investigate whether US stratification of clinically prepubertal girls would reveal biochemically distinct phenotypes. Lastly, we explored the predictive value of hormone biomarkers and an endocrine profile to determine female puberty onset.

Cohort description
Clinical examinations and data collection for the BGS2 cross-sectional cohort were conducted in 2016. Children aged 6 to 16 years from 6 schools were voluntarily recruited with parental consent to be examined regarding puberty status. This cohort was described previously (7). Briefly, a total of 703 girls were included in the cohort (participation rate 49.5%), of whom 651 donated blood samples. The following participants were excluded: 27 due to chronic disease, 12 due to use of oral contraceptives, and 11 due to insufficient blood sample volume, leaving 601 girls in the final sample.

Clinical and ultrasound evaluation of puberty
Breast maturation was evaluated both clinically (Tanner B) and with US. US images were analyzed after study completion to prevent observer bias. Clinical evaluation was performed in accordance with the schematics proposed by Marshall and Tanner (6) and included breast palpation. An experienced radiologist established the US staging protocol in line with a previous study (20). The morphological distinctions providing the basis for US breast staging were described previously (21). A trained nurse performed all clinical and US evaluations, and the most mature breast was examined. Briefly, stage US B0 was characterized as a small hypoechoic area in the retro-areolar area. Unique to stage US B1 was the presence of hyperechoic tissue with a triangular shape. In stage US B2, corresponding to clinical thelarche, the internal breast was characterized by a small hypoechoic center (linear, round, or star-shaped) with surrounding glandular hyperechoic tissue. The presence of a hypoechoic center was a prerequisite for stages US B3 and US B4, and while the appearance in US B3 was spider-shaped, the center was defined as increasingly roundish in US B4. The mature stage US B5 presented as a heterogeneous mass devoid of the hypoechoic center. As described previously (21), the intra-observer agreement of US B staging was "very good" (Cohen's kappa 0.84; 95% CI, 0.78-0.91). Presence of pubic hair was defined as a score of 2 or higher according to the Tanner PH scale (22).

Blood sample analyses
Samples were collected from venous blood between 8:20 am and 2:10 pm. The average time of blood draw was 10:57 am, and the cumulative proportion of samples collected by time of day, starting from 8:20 am, was: 9:00 am (9%); 10:00 am (30%); 11:00 am (53%); 12:00 pm (69%); 1:00 pm (88%); 2:00 pm (99%); 2:10 pm (100%). Serum was stored at −80 °C prior to analysis at the Hormone Laboratory, Department of Medical Biochemistry and Pharmacology, Haukeland University Hospital, where personnel were blinded to participant age and pubertal status. The laboratory and its analytical practice are accredited in accordance with NS-EN ISO 15189:2012. The following coefficients of variation (CV) refer to the inter-assay variation. Siemens IMMULITE 2000 XPi was used to analyze basal levels of LH (CV 7% at 10 IU/L), FSH (CV 5% at 17 IU/L), and sex hormone-binding globulin (SHBG; CV 6% at 6.74 µg/mL). E2 and E1 were analyzed by an ultrasensitive LC-MS/MS method as recently described (23). Briefly, samples were subjected to liquid-liquid extraction before analysis and quantification by LC-MS/MS. Lower limit of detection (LOD) was 0.28 pmol/L for E2 and 0.15 pmol/L for E1. Lower limit of quantification (LOQ) was 0.58 pmol/L for E2 (CV 9.1% in the range 1.7-153.3 pmol/L) and 0.25 pmol/L for E1 (CV 7.8%; range 1.7-143.1 pmol/L). To convert estradiol (E2) to pg/mL, divide by 3.671. To convert estrone (E1) to pg/mL, divide by 3.698. To convert SHBG to µg/mL, divide by 8.896.
Figure 1. Age reference curves for female puberty. Probit models for age of occurrence for indicated breast stages determined clinically (Tanner B; dashed lines) and by ultrasound (US B; solid lines). The US B1 stage was defined by prepubertal breast morphology, but it was radiologically distinct and more advanced than the baseline stage US B0.
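The unit conversions quoted above can be written as a small helper. This is only an illustrative sketch: the function names are hypothetical, and the SI unit for SHBG is assumed to be nmol/L because the passage gives the divisor without naming the source unit.

```python
# Divisors quoted in the text (SI unit -> conventional unit).
# Assumption: SHBG is reported in nmol/L; the passage does not state this.
DIVISORS = {
    "E2": 3.671,    # pmol/L -> pg/mL
    "E1": 3.698,    # pmol/L -> pg/mL
    "SHBG": 8.896,  # assumed nmol/L -> µg/mL
}


def to_conventional(analyte: str, si_value: float) -> float:
    """Convert an SI concentration to conventional units (divide by the factor)."""
    return si_value / DIVISORS[analyte]


def to_si(analyte: str, conventional_value: float) -> float:
    """Inverse conversion (multiply by the same factor)."""
    return conventional_value * DIVISORS[analyte]


if __name__ == "__main__":
    # Express the reported LOQ values for E2 and E1 in pg/mL.
    for name, loq_pmol_per_l in (("E2", 0.58), ("E1", 0.25)):
        print(name, round(to_conventional(name, loq_pmol_per_l), 3), "pg/mL")
```

Keeping the factors in one table makes it harder to apply a divisor to the wrong analyte when the reference tables are re-expressed in conventional units.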

Hormone reference intervals
Nonparametric reference intervals were established in accordance with the Clinical & Laboratory Standards Institute (CLSI) EP28-A3c guidelines (24) and the Canadian Laboratory Initiative on Pediatric Reference Intervals (CALIPER) framework (25). Briefly, the central 95% range was defined by the lower 2.5th and upper 97.5th percentile limits. Reference intervals were established from breast stage partitions by resampling to 500 bootstrapped data points using the Analyse-it (Analyse-it, Leeds, UK) extension in Microsoft Excel (Microsoft, Redmond, WA, USA). The Harris-Boyd standard deviate test (26)(27)(28) was applied to log-transformed data with approximate Gaussian distributions to determine statistically significant differences between pairwise, successive breast stage partitions. Partitioning was considered justified if the Harris-Boyd z-score exceeded the critical value z* of the sample power test, as mathematically outlined previously (26)(27)(28)(29). Continuous hormone reference intervals in relation to chronological age were generated using the CLSI-compliant referenceIntervals package in R (R Development Core Team, Vienna, Austria), based on a moving window of 120 observations as described previously (29).
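The two steps described above, a resampled nonparametric 95% interval and a standard deviate test between successive partitions, can be sketched as follows. This is a minimal illustration and not the Analyse-it procedure: the function names are hypothetical, the bootstrap simply resamples with replacement to 500 points, and the test uses the commonly cited Harris-Boyd form, z = |mean1 − mean2| / sqrt(s1²/n1 + s2²/n2) with critical value z* = 3·sqrt((n1 + n2)/240), which is assumed here rather than taken from the text.

```python
import math
import random
import statistics


def percentile(sorted_values, p):
    """Linear-interpolation percentile (p in 0..100) of a pre-sorted list."""
    if not sorted_values:
        raise ValueError("no data")
    k = (len(sorted_values) - 1) * p / 100.0
    lo, hi = math.floor(k), math.ceil(k)
    if lo == hi:
        return sorted_values[lo]
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (k - lo)


def bootstrap_reference_interval(values, n_resample=500, seed=0):
    """Resample with replacement, then return (2.5th, median, 97.5th)."""
    rng = random.Random(seed)
    resampled = sorted(rng.choices(values, k=n_resample))
    return (
        percentile(resampled, 2.5),
        percentile(resampled, 50),
        percentile(resampled, 97.5),
    )


def harris_boyd(group_a, group_b):
    """Return (z, z_critical, partition_justified) on log-transformed data.

    Assumed form of the test; values must be strictly positive.
    """
    log_a = [math.log(v) for v in group_a]
    log_b = [math.log(v) for v in group_b]
    n_a, n_b = len(log_a), len(log_b)
    standard_error = math.sqrt(
        statistics.variance(log_a) / n_a + statistics.variance(log_b) / n_b
    )
    z = abs(statistics.mean(log_a) - statistics.mean(log_b)) / standard_error
    z_critical = 3.0 * math.sqrt((n_a + n_b) / 240.0)
    return z, z_critical, z > z_critical
```

With 120 observations per partition the critical value is 3; larger partitions raise the bar, so that trivially small differences in very large samples do not force separate reference intervals.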

Endocrine profiling
Data dimensionality reduction by PCA was applied to generate a composite endocrine profile score for serum level constellations of the hormones E1, E2, LH, FSH, and SHBG. PCA was applied to hormone data from 403 premenarcheal girls (age interval, 6.2-15.7 years). Participants with missing data for one or more hormones were excluded. Postmenarcheal girls were excluded from PCA in order to omit menstrual cycle hormone fluctuations. The first principal component (PC1) comprised the following loadings: E1 (0.497), E2 (0.494), LH (0.475), FSH (0.469), and SHBG (−0.252). PC1 exhibited an eigenvalue of 3.6 (ie, its standard deviation of 1.9 squared) and accounted for 70.0% of the hormone dataset variance. Secondary PCs returned eigenvalues below 1.0 and were accordingly discarded. Participant PC1 scores were thus used to assign individual endocrine profiles in the context of the total dataset variance. The PCA was computed in R with code operations provided as supplemental data (30).
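As a rough illustration of how a PC1-based index score arises, the sketch below standardizes each hormone, forms the correlation matrix, and extracts the first principal component by power iteration. It is not the authors' supplemental R code: all function names are hypothetical, standardization of the hormones is an assumption (any transformation applied before PCA is not stated in this passage), and power iteration is swapped in for a full eigendecomposition to keep the example dependency-free.

```python
import math
import statistics


def standardize(columns):
    """Center each variable to mean 0 and scale to unit sample SD."""
    out = []
    for col in columns:
        mu = statistics.mean(col)
        sd = statistics.stdev(col)
        out.append([(v - mu) / sd for v in col])
    return out


def correlation_matrix(z_cols):
    """Correlation matrix of already standardized variables."""
    n, p = len(z_cols[0]), len(z_cols)
    return [
        [sum(a * b for a, b in zip(z_cols[i], z_cols[j])) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def first_component(matrix, iterations=1000):
    """Dominant eigenvector and eigenvalue by power iteration.

    Starts from the first matrix column, so the sign is chosen such that the
    first variable loads positively. Degenerate if that variable has zero loading.
    """
    p = len(matrix)
    vec = [row[0] for row in matrix]
    norm = math.sqrt(sum(v * v for v in vec))
    vec = [v / norm for v in vec]
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    eigenvalue = sum(
        vec[i] * sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)
    )
    return vec, eigenvalue


def pc1_profile(columns):
    """Return (loadings, eigenvalue, explained_fraction, per-participant scores)."""
    z_cols = standardize(columns)
    loadings, eigenvalue = first_component(correlation_matrix(z_cols))
    n = len(z_cols[0])
    scores = [
        sum(loadings[k] * z_cols[k][i] for k in range(len(z_cols))) for i in range(n)
    ]
    return loadings, eigenvalue, eigenvalue / len(z_cols), scores
```

Here `columns` would hold one list per hormone (for example E1, E2, LH, FSH, SHBG), each with one value per participant. With standardized inputs the eigenvalues sum to the number of variables, so the explained fraction is the eigenvalue divided by that count.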

Ethical considerations
The BGS2 was approved by the Norwegian Regional Ethics Committee, case references 2015/128/REK and 2015/235/REK. The study design and conduct conformed to good clinical practice and the ethical decrees of the Helsinki Declaration. Children younger than 16 years were only examined with written and informed parental consent and child assent, both of which were instantly revocable. Participants were rewarded with a cinema voucher. Breast palpation or US performed in the study did not detect any pathology.

Statistical analyses
Transition to pubertal breast stage was estimated from age using probit regression for a generalized linear model in R, as described previously (7). Receiver-operating characteristic (ROC) curves were generated using R, and the optimal ROC cutoff value (Youden index) was computed using the pROC package, corresponding to the point of the ROC curve where the sum of sensitivity and specificity for distinguishing 2 groups is highest (31). Statistical significance was defined as: * P < 0.05, ** P < 0.01, *** P < 0.001 (2-tailed Mann-Whitney U test) and z > z* (Harris-Boyd test).
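The ROC quantities named above can be illustrated with a short, dependency-free sketch. It does not reproduce pROC; the function names are hypothetical, a case is called positive when its marker value is at or above the threshold, and the area under the curve is computed through its equivalence with the Mann-Whitney statistic (ties counted as one half).

```python
def roc_points(scores, labels):
    """Return (threshold, sensitivity, specificity) for every observed score.

    labels: 1 for pubertal, 0 for prepubertal; both classes must be present.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for thr in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= thr)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < thr)
        points.append((thr, tp / n_pos, tn / n_neg))
    return points


def youden_cutoff(scores, labels):
    """Threshold maximizing sensitivity + specificity (lowest one on ties)."""
    return max(roc_points(scores, labels), key=lambda pt: pt[1] + pt[2])


def auc(scores, labels):
    """Probability that a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Maximizing sensitivity + specificity is the same as maximizing the Youden index J = sensitivity + specificity − 1, since the two differ only by a constant.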

Results
Hormone references in relation to chronological age
Serum levels of E2, E1, LH, FSH, SHBG, and LH/FSH ratio were plotted against age, and reference centiles defining the 95% normal range and median were computed from a moving window of 120 observations (Fig. 2).
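A moving-window estimate of age-related centiles can be sketched as follows. This is an assumption-laden illustration rather than the referenceIntervals implementation: the function names are hypothetical, the window advances one observation at a time, and each window is summarized at its median age, none of which is specified in the text.

```python
import math


def percentile(sorted_values, p):
    """Linear-interpolation percentile (p in 0..100) of a pre-sorted list."""
    k = (len(sorted_values) - 1) * p / 100.0
    lo, hi = math.floor(k), math.ceil(k)
    if lo == hi:
        return sorted_values[lo]
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (k - lo)


def moving_window_centiles(ages, values, window=120, step=1):
    """Slide a fixed-size window over age-sorted data.

    Returns one (median_age, p2.5, p50, p97.5) tuple per window position.
    """
    pairs = sorted(zip(ages, values))
    if len(pairs) < window:
        raise ValueError("fewer observations than the window size")
    rows = []
    for start in range(0, len(pairs) - window + 1, step):
        chunk = pairs[start:start + window]
        chunk_ages = [age for age, _ in chunk]  # already sorted by age
        chunk_values = sorted(value for _, value in chunk)
        rows.append((
            percentile(chunk_ages, 50),
            percentile(chunk_values, 2.5),
            percentile(chunk_values, 50),
            percentile(chunk_values, 97.5),
        ))
    return rows
```

A fixed number of observations per window keeps the precision of each centile estimate roughly constant across the age range, at the cost of windows that span wider age intervals where data are sparse.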

Hormone reference intervals by puberty stages
Reference intervals according to breast evaluations by clinical examination (Tanner B stages) and US (US B stages) were established by bootstrapping breast stage partitions to 500 observations (Table 1). The equivalent table with analytes annotated in conventional units is provided as Supplemental Table 3 (34). Justified partitioning of incremental breast stage reference intervals was verified using the pairwise Harris-Boyd standard deviate test (Z). Where indicated not significant (n.s.), the reference interval overlap of the current and the previous partition was too extensive to warrant partitioning despite the dichotomized segregation by breast stages. By the same logic, standard deviate testing of corresponding Tanner B and US B stage reference intervals (ie, Tanner B2 vs US B2, with matching increments) showed agreement between the 2 methods with no statistical evidence to support endocrinological distinctions for any of the 5 hormones. Median ages (95% CI) for the Tanner B stages were: B1, 8.7 (6.6-11.6); B2, 10.6 (8.

Endocrine profile detects thelarche
Next, we evaluated the predictive value of age, endocrine profile, and individual analytes, respectively, in determining the transition to puberty onset defined by attainment of breast stage Tanner B2+ or US B2+ (Table 2). For these analyses, we selected the age interval 8.0 to 12.0 years that defined both the earliest and latest occurrence of Tanner B2 in the dataset. Statistically significant differences were observed for all variables between prepubertal and pubertal groups (P < 0.001 for all, Mann-Whitney U). Notably, the endocrine profile index (ie, participant PC1 scores) returned the highest area under the ROC curve and best negative predictive value (NPV) to distinguish prepubertal and pubertal girls.
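For readers less familiar with the predictive values reported here, the sketch below shows how sensitivity, specificity, positive predictive value, and negative predictive value follow from a single cutoff on a marker. The function name and the example cutoff are hypothetical and are not taken from Table 2.

```python
def diagnostic_summary(scores, is_pubertal, cutoff):
    """Classify score >= cutoff as pubertal and summarize performance."""
    tp = fp = tn = fn = 0
    for score, truth in zip(scores, is_pubertal):
        predicted = score >= cutoff
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif not predicted and truth:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),  # share of negative calls that are truly prepubertal
    }
```

Unlike sensitivity and specificity, the predictive values depend on the proportion of pubertal girls in the sample, so they apply to populations with a similar age range and prevalence.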

Ultrasound evaluation allows for refined characterization of early thelarche
Minor disagreements between Tanner and US breast staging were encountered on the intra-individual level, and we next decided to investigate the possible endocrine implications of these discrepancies. Stratifying each of the … Comparing the US B1 and US B2 strata by Mann-Whitney U tests, we observed statistically significant differences in serum levels of E2 (P < 0.05), FSH (P < 0.01), and SHBG (P < 0.05), corresponding to a highly distinct endocrine phenotype indicated by the endocrine profile (P < 0.001). Importantly, these endocrine differences were observed despite no significant age difference between the US B1 and US B2 subgroups. In this comparison, the endocrine profile emerged as the most statistically significant variable.
Figure 2. … All healthy participants were included, and the variable number of observations for the individual hormones was due to insufficient serum volume to determine the respective analytes. Filled dots represent prepubertal girls (Tanner B1) and open dots represent pubertal girls (Tanner B2+). Continuous median, lower limit (2.5th percentile), and upper limit (97.5th percentile) centiles with 90% CIs were estimated by a nonparametric method from a moving window of 120 observations.
Table 1. Sample size (n) and analyte levels corresponding to the median and resampled 95% reference intervals are presented for indicated puberty breast stage partitions. Column "Z" summarizes Harris-Boyd standard deviate tests to determine justified partitioning of the current and the previous partition (a, yes) or not (n.s., not significant). Abbreviations: B, Tanner breast stage; E1, estrone; E2, estradiol; FSH, follicle-stimulating hormone; LH, luteinizing hormone; SHBG, sex hormone-binding globulin; US B, ultrasound breast stage.
… (7), which is comparable to other Western countries (35)(36)(37)(38). The cohort was representative of the general Norwegian demography, having a ~90% majority Caucasian population.
Agreement between the 2 breast staging methods was satisfactory, as previously described (7,21). Interestingly, our participants perceived US as a less invasive procedure compared to palpation, and the majority (67.3%) of the girls favored the US evaluation (our unpublished observations). Although reference intervals for US and Tanner breast stages were biochemically comparable overall, US enabled detection of glandular tissue in a subset of clinically prepubertal girls with distinct biochemical baseline characteristics. From the stratification of clinically prepubertal (Tanner B1) girls by US B stages in Table 3, we observed significant endocrine profile differences between the substrata of prepubertal (US B1) and pubertal (US B2) girls. Although the subgroup of girls with detectable glandular tissue by US (US B2) was small, this finding implies that endocrine profiling by PCA may provide more sensitivity than any single hormone in the context of detecting thelarche. Further studies are warranted to explore the use-case for tailored endocrine profiles as predictive or diagnostic markers of endocrinopathies, including hypogonadism and disorders of sexual development.
The application of endocrine profiling in the current study has limitations and warrants discussion. PCA was applied to capture the total variance of 5 hormone concentrations from 403 premenarcheal girls. The first principal component, PC1, explained 70% of the total dataset variance, and thus the PC1 scores associated with each study participant were leveraged as a composite endocrine profile index. The rationale for this endocrine profile was that its constituent hormones are integral components of the pubertal hypothalamic-pituitary-gonadal signaling axis and subject to reciprocal regulation. Loading other pertinent hormone dimensions into the endocrine profile would arguably generate additional complexity and depth. Endocrine profiling by PCA cluster analysis was recently applied to extract distinct endocrine phenotype clusters in a longitudinal study of female puberty (18). In contrast, our approach was to generate a reference endocrine profile for normal puberty development. As shown in Table 2, the composite endocrine profile index was an excellent marker of thelarche, but in terms of ROC diagnostic performance it was only marginally better than, or practically equivalent to, E2 levels alone. However, the occurrence of US thelarche morphology in clinically prepubertal girls in Table 3 was marked with a higher degree of statistical confidence by the endocrine profile than by E2 alone. These findings imply that endocrinological investigations of altered puberty timing may in some cases benefit from PCA profiling. Automated computation for endocrine profiling may provide added value to existing laboratory systems or clinical decision trees. In a previous study, Fugl et al observed that baseline level of LH was the better analyte variable to determine thelarche (39). Notably, that study included a sample size of only 43 girls, with a late minimum inclusion age of 9.8 years.
A central undertaking in the current work was the construction of reference intervals compatible with previously defined US breast stages (7, 20, 21). Table 1 was configured to … Our reference intervals should be interpreted with some caution. The current references represent average hormone levels throughout the morning and early afternoon, and do not account for intra-day variation in gonadotropin and estrogen levels. In this regard, the diurnal rhythm of estrogen, LH, and FSH secretion has been extensively studied by others (40)(41)(42). Due to logistical constraints, we were unable to obtain fasting morning samples and, in line with similar studies (25)(43)(44)(45)(46), our references rely on sample power to average out diurnal secretion patterns and may thus be representative for outpatient clinics. Further, we did not account for latent hormone fluctuations associated with monthly cyclicity in premenarcheal girls, and postmenarche data were not stratified by menstrual cycle phase at the time of blood draw.
In conclusion, we have provided the first set of statistically robust hormone references for US breast staging, in comparison with traditional Tanner B stages for female puberty. Our results demonstrate a high degree of agreement between the 2 methods of puberty staging, both in terms of age at stage occurrences and endocrine parameters. However, US enabled detection of nonpalpable glandular tissue in a subset of clinically prepubertal girls, and this phenotype was corroborated by a pubertal endocrine profile. Furthermore, we have demonstrated that index scores from endocrine profiling by PCA represent a useful predictive marker of puberty onset with a possible use-case in detecting pediatric endocrinopathies associated with altered puberty onset.