Unique Metabolic Profiles Associate with Gestational Diabetes and Ethnicity in Low- and High-Risk Women Living in the UK

ABSTRACT Background: Gestational diabetes mellitus (GDM) is the most common global pregnancy complication; however, prevalence varies substantially between ethnicities, with South Asians (SAs) experiencing up to 3 times the risk of the disease compared with white Europeans (WEs). Factors driving this discrepancy are unclear, although the metabolome is of great interest as GDM is known to be characterized by metabolic dysregulation. Objectives: The primary aim was to characterize and compare the metabolic profiles of GDM in SA and WE women (at <28 wk of gestation) from the Born in Bradford (BiB) prospective birth cohort in the United Kingdom. Methods: In total, 146 fasting serum metabolites from 2,668 pregnant WE and 2,671 pregnant SA women (average BMI 26.2 kg/m2, average age 27.3 y) were analyzed using partial least squares discriminatory analyses to characterize GDM status. Linear associations between metabolite values and post–oral glucose tolerance test measures of dysglycemia (fasting glucose and 2 h postglucose) were also examined. Results: Seven metabolites associated with GDM status in both ethnicities (variable importance in projection ≥1), whereas 6 additional metabolites associated with GDM only in WE women. Unique metabolic profiles were observed in healthy-weight women who later developed GDM, with distinct metabolite patterns identified by ethnicity and BMI status. Of the metabolite values analyzed in relation to dysglycemia, lactate, histidine, apolipoprotein A1, HDL cholesterol, and HDL2 cholesterol associated with decreased glucose concentration, whereas DHA and the diameter of very low-density lipoprotein particles (nm) associated with increased glucose concentration in WE women, and in SAs, albumin alone associated with decreased glucose concentration. Conclusions: This study shows that the metabolic risk profile for GDM differs between WE and SA women enrolled in BiB in the United Kingdom.
This suggests that etiology of the disease differs between ethnic groups and that ethnic-appropriate prevention strategies may be beneficial.


Introduction
During pregnancy, there is a natural increase in catabolism to ensure sufficient energy for the fetus (1,2). This increase is governed by maternal hormones, beginning as a mild change in insulin sensitivity and progressing through hyperinsulinemia to controlled insulin resistance by the third trimester (2–5). For most pregnancies, these changes are safe and controlled, with insulin sensitivity returning to a healthy state following pregnancy. However, for approximately 1 in 7 pregnancies, insulin resistance exceeds normal "healthy" levels and enters a diabetic state, putting the mother and her growing offspring in danger of short- and long-term health risks (6,7). This pregnancy-induced state of diabetes, gestational diabetes mellitus (GDM), is a major global health concern with varying prevalence between populations.
In Middle Eastern, North African, and South Asian countries, GDM prevalence can exceed 20% of pregnancies, whereas in European countries, prevalence of GDM is commonly ∼5% (5). Numerous lifestyle, biological, and genetic factors are thought to contribute to this disparity of risk (5,8). Despite the numerous factors, diet is the mainstay of most prevention and treatment strategies because of its demonstrated efficacy for managing glucose concentrations (9–11). Nonetheless, we and others have demonstrated that the effects of dietary prevention strategies on maternal and offspring health are not generalizable across populations or ethnic groups, with dietary patterns demonstrating varied effects between ethnic groups in relation to both GDM prevention and birth weight (12–15). These data suggest that the metabolism and pathology of GDM may differ across populations, where some ethnic groups have unique metabolic profiles that make them more susceptible to GDM (4,5,16–18). Specifically, elevated concentrations of alanine and numerous fatty acids (e.g., myristic acid, palmitic acid, palmitoleic acid), and lower amounts of glutamate, proline, and phospholipids in blood have been identified as predictors of GDM risk in early pregnancy (i.e., before 16 wk) (4), with recent evidence demonstrating significant differences in the abundance of these metabolites between ethnic groups (19). Notably, evidence from Born in Bradford (BiB), a prospective multiethnic pregnancy and birth cohort, has demonstrated the need for potentially modified GDM assessment criteria for South Asian (SA) women because of increased risks of delivery complication and newborn macrosomia at significantly lower glucose thresholds compared with white European (WE) women (20). Indeed, currently, the United Kingdom's National Health Service (NHS) routinely screens all women of SA ancestry for GDM, whereas only high-risk WE women are screened (21).
As a consequence of this, the Diabetic Pregnancy Study Group called for increased research into the role of the metabolome on GDM in 2018 (22). To date, however, the metabolic drivers of GDM remain unclear with numerous discrepancies within the field, likely due to small, heterogeneous cohorts of varying populations, cultures, and ancestral groups (23). Indeed, only 1 study has conducted an analysis of individual metabolites and GDM in an ethnic-specific fashion (1). This work investigated univariate associations between numerous metabolites in WE (n = 4,072) and SA (n = 4,702) women and demonstrated that concentrations of lipoproteins and cholesterols are typically higher in WE women and are stronger predictors of GDM [i.e., have a higher variable importance in projection (VIP) score] compared with SA women. However, metabolite profiles are heterogeneous mixtures of metabolites, many of which are strongly correlated and may depend on other metabolites to exhibit an effect. In light of this, multivariate approaches that assess all variables simultaneously along with their intervariable correlations (24) can be used to identify 1) patterns of uncorrelated metabolites that associate with GDM risk, and 2) cardinal metabolites that independently associate with GDM risk. Therefore, this study aims to build upon existing work by applying multivariate statistical techniques within an ethnically diverse population to 1) determine underlying metabolite patterns that correlate with GDM and 2) identify ethnic-specific metabolic drivers of GDM risk.

Population characteristics
The BiB cohort was established to examine determinants of health from pregnancy and childhood into adulthood in an ethnically diverse region in the north of England (25). Between 2007 and 2010, BiB recruited 12,453 women (26-28 wk of gestation, mean maternal age 27.8 y), collecting baseline data on 13,776 pregnancies and 13,858 births, with 45% of the cohort of SA origin (25,26). BiB aimed to recruit all mothers giving birth at the Bradford Royal Infirmary, the largest hospital within Bradford. Bradford is a northern English city with high levels of deprivation and a large SA population, the majority of whom have Pakistani ancestry. All women were invited to partake in an oral glucose tolerance test (OGTT) for GDM diagnosis at approximately 26-28 wk of gestation during their standard antenatal care. Almost all UK citizens use the NHS for antenatal care.
Of these, 11,480 women provided blood samples for metabolite analyses during the same visit as their OGTT. Written consent was gained from all participants and ethical approval was granted by the Bradford Research Ethics Committee (ref 07/H1302/112) (25).

Blood metabolite analysis
Full details of venous blood sample collection, preparation, metabolite quantification, and validation have been described previously (1). In brief, fasted blood samples were taken at the Bradford Royal Infirmary by trained phlebotomists, processed within 2.5 h, and stored at −80 °C in the absence of freeze-thaw cycles (27). Samples were processed using a previously validated, high-throughput automated NMR platform (Nightingale Health). Metabolite values expressed as a percentage or ratio were excluded to minimize redundancy, resulting in a panel of 146 metabolite values expressed in absolute quantitative measures. This panel comprised measures of 97 lipoproteins, 9 amino acids, 2 apolipoproteins, 9 cholesterols, 8 fatty acids, 8 glycerides and phospholipids, 4 glycolysis-related metabolites, 2 ketone bodies, 3 measures of fluid balance and inflammation, and 3 measures of the mean lipoprotein particle diameter (Supplemental Table 1).

Participant selection
Of the 11,480 blood samples analyzed for metabolites, 54 samples were excluded because they failed 1 of 5 Nightingale quality control measures (low glucose, high lactate, high pyruvate, low protein concentration, and plasma samples). Of the 11,426 available samples, ∼3% of mothers were missing ≥1 metabolite value. The structure of missing metabolite data was assessed via the visualization and imputation of missing values package within R (28) and multiple correspondence analysis. There was no evidence that the missing metabolite data occurred in a nonrandom pattern. It was therefore deemed appropriate to impute missing values. Optimized multiple imputation with iterative principal component analysis (100 simulations, K-fold cross-validation) based on the minimization of mean square error of prediction was performed using the missMDA package (29). A sensitivity analysis was performed to test the effect of mothers with higher rates of missingness (≥3% missing metabolite values) on imputation. No detectable difference in imputation quality was observed. As such, the metabolite data of all available 11,426 maternal samples were included for imputation.
Imputed metabolite data were combined with descriptive BiB reported characteristics, including participant's ethnicity, age moved to the United Kingdom (if born abroad), GDM status, gestational age at sample collection (obtained from obstetric records), history of diabetes, age, BMI (in kg/m2), smoking status, parity, and whether they were carrying a singleton/multiple pregnancy. Length of residence was calculated by subtracting the age the mother moved to the United Kingdom from maternal age. When an individual was born within the United Kingdom, length of residency was taken to be the mother's age.
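The length-of-residence rule described above can be written as a short function. This is only an illustrative sketch: the function name and the use of `None` to mark UK-born mothers are assumptions made for the example, not details of the BiB data set.

```python
from typing import Optional


def length_of_residence(maternal_age: float,
                        age_moved_to_uk: Optional[float]) -> float:
    """Years lived in the United Kingdom.

    age_moved_to_uk is None for mothers born in the United Kingdom
    (a hypothetical encoding chosen for this sketch).
    """
    if age_moved_to_uk is None:
        # Born in the United Kingdom: residence equals maternal age.
        return maternal_age
    # Born abroad: subtract the age at which the mother moved.
    return maternal_age - age_moved_to_uk
```

For example, a 30-y-old mother who moved at 19 y would be assigned 11 y of residence, whereas a UK-born 30-y-old would be assigned 30 y.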
All women were recruited prior to their scheduled GDM assessment (mean gestational age 26.1 wk) and prior to the 28th week of pregnancy. GDM was diagnosed using a modified version of the World Health Organization criteria (1,25).
Using these criteria, a woman was diagnosed with GDM if either her fasting glucose concentration was ≥6.1 mmol/L or her 2-h postload glucose concentration was ≥7.8 mmol/L following a 75-g OGTT. The OGTT was completed in the morning following an overnight fast and involved the consumption of a standard solution over a 5-min period containing the equivalent of 75 g of anhydrous glucose (30). Following a GDM diagnosis, all SA and WE mothers receive the same standardized care. Initially, GDM management involves referral to a dietitian and the management of glucose concentration through diet and increased exercise. If this is unsuccessful, metformin or insulin injections are prescribed. Women with GDM are also offered additional antenatal appointments to monitor the health of both mother and baby throughout the pregnancy. Irrespective of GDM status, basic nutritional counseling is offered to all mothers as part of standard antenatal classes offered throughout pregnancy by the UK NHS (31).
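As a minimal sketch, the diagnostic rule stated above reduces to an either/or check on the 2 OGTT measures. The thresholds are taken from the text; the function and constant names are hypothetical.

```python
# Diagnostic thresholds quoted in the text (mmol/L).
FASTING_THRESHOLD = 6.1
TWO_HOUR_THRESHOLD = 7.8


def has_gdm(fasting_glucose: float, two_hour_glucose: float) -> bool:
    """Return True if either OGTT measure meets its threshold."""
    return (fasting_glucose >= FASTING_THRESHOLD
            or two_hour_glucose >= TWO_HOUR_THRESHOLD)
```

Under this rule a woman with a normal fasting value can still be classified as a case on the basis of her 2-h value alone, and vice versa.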
Ethnicity was self-reported. If ethnicity was not collected, details were obtained from primary care records along with information on parity and the number of registered births. Maternal age was recorded at pregnancy booking (i.e., the first routine antenatal visit) and BMI was calculated using height measured at recruitment and maternal weight recorded at the first antenatal visit. When BMI was examined as a categorical variable, ethnic-specific cutoffs were used to classify mothers into BMI (kg/m2) groups (underweight: ≤18.5 in WEs and SAs; normal/healthy weight: 18.6-24.9 in WEs or 18.6-22.9 in SAs; overweight: 25-29.9 for WEs or 23-27.4 for SAs; obese: ≥30 for WEs or ≥27.5 for SAs) (32). When BMI was analyzed as a binary variable, women were grouped as having a "healthy" or "high" BMI if they were below or above, respectively, the BMI cutoff for overweight status using these ethnic-specific cutoffs.
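The ethnic-specific grouping can be illustrated with a small lookup. This sketch uses the cutoffs quoted above; how values falling exactly on a boundary are handled, and all names in the code, are assumptions made for the example.

```python
# Lower bounds (kg/m2) at which the overweight and obese groups start.
# Boundary handling is an assumption of this sketch.
CUTOFFS = {
    "WE": {"overweight": 25.0, "obese": 30.0},
    "SA": {"overweight": 23.0, "obese": 27.5},
}
UNDERWEIGHT_MAX = 18.5  # same in both groups


def bmi_category(bmi: float, ethnicity: str) -> str:
    """Four-level BMI group using ethnic-specific cutoffs."""
    cut = CUTOFFS[ethnicity]
    if bmi <= UNDERWEIGHT_MAX:
        return "underweight"
    if bmi < cut["overweight"]:
        return "healthy"
    if bmi < cut["obese"]:
        return "overweight"
    return "obese"


def bmi_binary(bmi: float, ethnicity: str) -> str:
    """Binary grouping relative to the overweight cutoff."""
    return "high" if bmi >= CUTOFFS[ethnicity]["overweight"] else "healthy"
```

The same BMI can therefore map to different groups: a BMI of 24 is "healthy" for a WE woman but "overweight" for an SA woman.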
Smoking status was self-reported at baseline and during pregnancy. Recruitment and the baseline assessment of covariates were the same in both ethnic groups. Summary statistics for each variable were presented as a mean and SE. Differences in baseline characteristics were calculated between women with and without GDM for continuous variables via a Mann-Whitney (MW) test, whereas differences for categorical variables were tested using the Pearson χ2 test.
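For readers unfamiliar with the MW test, the statistic it is built on can be computed by direct pair counting. The sketch below shows only the U statistic, not the P value, and is not the implementation used in the study; the function name is hypothetical.

```python
from typing import Sequence


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> float:
    """U statistic for sample x.

    Counts the (x_i, y_j) pairs in which x_i exceeds y_j,
    with tied pairs contributing 0.5.
    """
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u
```

The statistics for the 2 samples always sum to the number of pairs, len(x) × len(y), so a value near 0 or near that product indicates strong separation between groups.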
Participants whose samples were collected after GDM diagnosis (28th week or later) were excluded from the analysis, as were mothers with a history of diabetes. Individuals who reported being of a South Asian origin other than Pakistani (SA) were also excluded due to the small sample size (and therefore limited power) of other South Asian ancestry groups. In total, 5,339 participants, 2,671 SAs (all of Pakistani descent) and 2,668 WEs, were retained for analysis (Figure 1).
Ethnicity was self-reported and the homogeneity of the WE group has been confirmed in previous genetic analyses within BiB (33). In total, 93.2% of the included WEs were born in the British Isles (i.e., the United Kingdom, Republic of Ireland, Channel Islands, or Isle of Man), with the majority in England (91.4%). Of these women, 95.5% reported that both of their parents were also born in the British Isles. Within the group of WE women not born in the British Isles, 3.7% were born in Eastern Europe (Czech Republic, Poland, Slovakia), with the remaining proportion reporting "other" or "unknown." Within the SA population, 43.7% were born within the United Kingdom. Of the SA women born in the United Kingdom, 93% reported that their mother was born in Pakistan (87.4%) or India (5.6%), and 95% reported that their father was born in Pakistan (88.6%) or India (6.7%). A small proportion did not know their mother's (1.4%) or father's (1.3%) place of birth. Of the women born outside of the United Kingdom, the average age of immigration to the United Kingdom was 18.8 y (IQR: 18-23 y).

Metabolite discriminatory analysis
Partial least squares discriminatory analysis (PLSDA) is a supervised dimensionality reduction technique that uses all included variables to discriminate group data based on predefined outcome groups. Included variables are then ranked by the degree to which they explain the variance between groups (i.e., GDM compared with non-GDM). These are known as VIPs, where VIPs ≥1 denote a variable with good discriminatory quality and predictive ability (34,35).
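To make the VIP ≥1 threshold concrete, the sketch below computes VIP scores using the commonly used definition, in which each variable's squared, normalized weight on a component is weighted by the outcome variance that component explains. This is an assumption about the general form of the calculation; the ropls package used in this study may differ in detail, and the input names are hypothetical.

```python
import math
from typing import List, Sequence


def vip_scores(weights: Sequence[Sequence[float]],
               ssy: Sequence[float]) -> List[float]:
    """VIP score per variable.

    weights[a][j]: weight of variable j on component a.
    ssy[a]: outcome sum of squares explained by component a.
    """
    n_vars = len(weights[0])
    total_ssy = sum(ssy)
    scores = []
    for j in range(n_vars):
        acc = 0.0
        for w_a, ssy_a in zip(weights, ssy):
            norm_sq = sum(w * w for w in w_a)  # squared length of w_a
            acc += ssy_a * (w_a[j] ** 2) / norm_sq
        scores.append(math.sqrt(n_vars * acc / total_ssy))
    return scores
```

Under this definition the squared VIP scores average to exactly 1 across all variables, which is why a score of 1 is a natural cutoff: variables above it contribute more than the average variable to the discrimination.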
PLSDA allowed an overall assessment of the predictive capacity of metabolites for GDM, in models with and without known GDM risk factors (i.e., BMI, maternal age, parity, multiple pregnancy, and smoking status), with ethnicity added to visually assess its effect on the model. Following this, both sets of PLSDA models were performed within each ethnic group. To assess bidirectionality, models predicting ethnicity were also executed within the overall population and GDM cases/noncases separately using the same criteria as above.
The optimum number of components to include within the model was selected based on the component's ability to significantly predict group membership within the training (pR2Y ≤ 0.05) and validation (pQ2Y ≤ 0.05) data sets (7-fold cross-validation, "nipals" algorithm). When multiple components were significantly predictive, the predictive component that best discriminated between groups (i.e., maximization of outcome variance explained, R2Y) with the minimal error [root mean squared error of estimation (RMSEE)] was selected. Data were pareto scaled and mean centered prior to analysis. External validity was assessed via 7-fold cross-validation. PLSDA models were performed via the "ropls" package within R (36). When the size of the outcome groups differed by ≥1%, the larger group was randomly subsampled (20 iterations) to minimize error. VIPs were mean averaged and SEs calculated across all significant iterations (pR2Y ≤ 0.05, pQ2Y ≤ 0.05) for each metabolite following the removal of outlier VIPs, defined using 1.5 × IQR of the VIP values. Differences in the distribution of VIP values between both ethnicities and case status were assessed for significant iterations via a MW test; this was possible because all comparisons were tested against the same panel of metabolite measures. To assess the impact of smoking on PLSDA results, PLSDA models predicting smoking in the overall study population were also performed.
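Pareto scaling, mentioned above, divides each mean-centered value by the square root of the variable's SD rather than by the SD itself, so that high-variance metabolites are damped without being forced to unit variance. A minimal sketch for a single metabolite follows; the function name and the choice of the sample SD are assumptions.

```python
import math
import statistics
from typing import List, Sequence


def pareto_scale(values: Sequence[float]) -> List[float]:
    """Mean center, then divide by the square root of the SD.

    Uses the sample SD; a constant variable (SD of 0) is not
    handled and would need to be dropped beforehand.
    """
    mean = statistics.fmean(values)
    scale = math.sqrt(statistics.stdev(values))
    return [(v - mean) / scale for v in values]
```

After scaling, the variable has mean 0 and an SD equal to the square root of its original SD, so relative differences in spread between metabolites are reduced but not removed.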

Post hoc multivariate analyses
BMI, a suspected mediator along the causal pathway that links metabolism and GDM, was a significant driver of GDM within SA women and WE women. To explore this, the ethnic-specific impact of BMI on the metabolome and subsequent GDM diagnoses was investigated using sparse PLSDA (sPLSDA). sPLSDA is a supervised multivariate technique with the ability to predict group membership in multiclass problems (i.e., stratification by ethnicity, body weight, and GDM status) by simultaneously performing and balancing variable selection with group classification (37). Women were classified as "healthy" or "overweight" based on ethnic-specific cutoffs (BMI ≥25 for WE women and BMI ≥23 for SA women), which is the same approach used by the NHS (38). The analyses focused on low-risk WE (n = 872) and low-risk SA women (n = 864); that is, only women who 1) were in their first pregnancy, 2) did not smoke during pregnancy, and 3) were <35 y of age were included. This was done to prevent these covariates from overpowering the models and to allow the contributing role of BMI in GDM to be more clearly appreciated within and between each ethnic group.
Metabolites selected by sPLSDA in each comparison were fed into PLSDA models (20 iterations) alongside highly correlated metabolites (Pearson correlation coefficient ≥0.9) in order to determine metabolite values contributing to the separation of the outcome groups while balancing dimensionality reduction and group discrimination. PLSDA models were adjusted for maternal age (continuous), BMI (continuous), smoking status, parity, and multiple pregnancies, as before. Differences in the distributions of metabolites within each group were also compared by a MW test.
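The step of adding highly correlated metabolites back alongside the sPLSDA-selected ones can be sketched as follows. The data layout (a dictionary of metabolite name to values), the function names, and the use of the signed rather than absolute coefficient are assumptions made for illustration only.

```python
import math
from typing import Dict, Sequence, Set


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of 2 equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlated_partners(selected: Set[str],
                        data: Dict[str, Sequence[float]],
                        threshold: float = 0.9) -> Set[str]:
    """Unselected metabolites with r >= threshold to any selected one."""
    partners = set()
    for name, values in data.items():
        if name in selected:
            continue
        if any(pearson(values, data[s]) >= threshold for s in selected):
            partners.add(name)
    return partners
```

The union of `selected` and the returned set would then form the input panel for the follow-up PLSDA models.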

Linear regression analyses for identified metabolite associations
Linear regression models investigating the relation between post-OGTT measures of dysglycemia (fasting glucose and 2-h post-OGTT glucose) and metabolite values were performed for all metabolite values identified as important (VIP ≥1) in characterizing GDM status. Normality of glucose measures and metabolite values was assessed using histograms and Q-Q plots. Most metabolites (136/146) required normalization. Normality was most often achieved by log transformation (59 metabolite values); however, in some cases, square root and normal score transformation were implemented via the "rcompanion" package (39). All glucose measures were log normalized. Known GDM risk factors of maternal age (years), gestational age (days), parity, and smoking status during pregnancy (yes/no) were initially included in the models. When significant associations were observed between metabolite values and glucose in this exploratory analysis (P < 0.05), BMI was added to the models (initially as a continuous and then as a binary variable using ethnic-specific BMI cutoffs for overweight status) to assess the role of early pregnancy BMI as a mediator of metabolite-dysglycemia associations. Within SAs, a final additional adjustment for length of residency within the United Kingdom was made to account for any effects of acculturation.

Population characteristics
Participants had a mean age of 26.7 y and a mean BMI of 26 kg/m2. WE women were significantly older and had higher BMIs compared with SA women (Table 1). Parity was significantly higher in SA women compared with WE women (P < 0.001), and parity was significantly higher in GDM cases than in noncases only within SA women (Table 1).
Smoking during pregnancy was significantly more common in WE women compared with SA women (25% compared with 3%; P < 0.001). No difference in proportion of singleton pregnancies (>97%) was observed between WE women and SA women. Alcohol intake was not assessed because it was reported by only 1% of SA women. The mean time of sample collection was 187 gestational days.

Primary analysis
Metabolite characterization of GDM.
In the first model, an overall analysis of the full cohort (i.e., both ethnic groups), PLSDA explained 21.7% of the variation between the GDM and non-GDM groups and confirmed maternal age and BMI as the primary risk factors for GDM, followed by parity, smoking status, and having a nonsingleton pregnancy (Table 2). In the full model, 7 metabolite values reported VIPs ≥1, including 4 fatty acid metabolite measures (total fatty acids, 18:2 linoleic acid, total MUFA, and total SFA) and 1 glycolysis-related metabolite (lactate) (Figure 2). Modeled independently, the PLSDA with only covariates explained 12.4% of the variation in GDM status and significantly predicted GDM status, whereas the model with only metabolites explained 13.5% of the variance in GDM but was nonsignificant. The second model, which included ethnicity as a covariate, accounted for 26.6% of the variation between the GDM and non-GDM groups. The same 6 metabolites were reported as predictors of GDM, with an additional cholesterol metabolite measure (total esterified cholesterol). Notably, model 2 confirmed ethnicity (SA compared with WE) as a major risk factor for GDM, after age and BMI. Modeled independently, "ethnicity" and other covariates explained 15.2% of the variance in GDM status; therefore, the addition of metabolites into the model increased the amount of variance explained by over 11%.

Ethnically stratified analysis of metabolites characterizing GDM.
In an ethnically stratified analysis (20 iterations), models only including metabolites accounted for a median of 6.5% of the variation in GDM status in SA women and 5.8% of the variation in WE women in optimized models (i.e., minimization of RMSEE and maximization of R2Y), although no model comprising metabolites alone was significant. Conversely, models only including established clinical risk factors (age, BMI, parity, smoking status, and multiple pregnancy) were significantly predictive (pR2Y < 0.05, pQ2Y < 0.05) of GDM status and explained 13.3% of the variation in SAs and 12.8% of the variation in WEs. The addition of metabolites to these covariate models also resulted in the significant prediction of GDM. These models accounted for 26% of the variance in GDM status in WE women and 20% of the variance in SA women, an increase of 13.6% and 6.8% when compared with covariate models in WEs and SAs, respectively. Following adjustments for maternal age, parity, BMI, and smoking status, GDM could be predicted within both ethnicities. Maternal age, parity, and BMI were predictors of GDM in both ethnicities (VIP ≥1), with BMI the most important predictor of GDM in SA women, whereas in WE women, maternal age was the most important predictor (Supplemental Table 2). Smoking was a predictor of GDM only in WE women. Seven metabolite measures characterized GDM status (VIP ≥1) in both ethnic groups (Table 3). Of these metabolites, the VIPs of 3 (lactate, glycoprotein acetyls, and linoleic acid) characterized GDM status comparably well between ethnicities (VIP ≥1; MW P > 0.05), whereas 4 metabolite measures (total fatty acids, total MUFA, total SFA, and VLDL_D) characterized GDM in both ethnic groups but were significantly stronger markers of GDM in WE women (VIP ≥1; MW P < 0.05 between ethnicities). Additionally, alanine, glutamine, total cholesterol, total n-6 PUFA, total PUFA, and citrate were markers (VIP ≥1) of GDM status in WE women only. No markers of GDM were specific to SA women.
On average, the optimized models explained 26% of the variance of GDM in WE women and 20% of the variance in SA women (Supplemental Table 4).

Metabolites characterized by ethnicity.
To explore underlying metabolic profiles within each ethnic group, we identified metabolites that most strongly distinguished WE women and SA women. In a PLSDA including known GDM risk factors as covariates (maternal age, smoking status, parity, BMI, and GDM status), 12 metabolic measures had a VIP ≥1 in statistically significant models (pR2Y < 0.05 and pQ2Y < 0.05) and therefore were believed to have characterized ethnicity in GDM and non-GDM women: total fatty acids, serum cholesterol, SFA, MUFA, total n-6, esterified cholesterol, linoleic acid (LA), LDL cholesterol, remnant cholesterol, phosphatidylcholine, and total cholesterol (Supplemental Table 5).

Post hoc analyses
Characterization of GDM in low-risk women.
BMI was classified as an important variable (VIP ≥1) in the overall analysis and in both ethnic subgroup analyses. However, a greater mean VIP (± SE) was observed in SA women compared with WE women (VIP SA = 7.06 ± 0.22 compared with VIP WE = 4.33 ± 0.22; P < 0.001) (Supplemental Table 2), indicating that BMI may be a more important predictor of GDM status within SAs. Indeed, healthy-weight SA women who developed GDM (SA Healthy-GDM) presented the most distinct metabolic profile [receiver operating characteristic curve (ROC) = 0.783] but were most similar to healthy-weight WE women who developed GDM (WE Healthy-GDM; ROC = 0.691) (Supplemental Figure 2). The reason for this shared and distinct pattern of metabolites in "healthy"-weight women who developed GDM is unclear, and many hypotheses are possible. One hypothesis is that the pattern is an artifact of their fetal programming. Adult offspring from GDM pregnancies are at increased risk of dysglycemia, diabetes, and GDM, which has been attributed to metabolic dysregulation and early dysglycemia that progresses in later life (40–43).
Future work in established cohorts that investigate transgenerational pregnancy risks (such as Born in Bradford, Generation R, and Nutrigen) is integral to unraveling the source of this unique metabolic profile that distinguishes healthy-weight GDM cases of SA ancestry from noncases, overweight SA cases, and WE cases (25,44,45). Due to the higher proportion of underweight mothers of SA ancestry, a sensitivity analysis was performed in which underweight mothers were removed (n removed = 93, BMI ≤18.5) to determine if their profiles were unique. No difference in the outcome was observed following the removal of these individuals.
Metabolites selected by sPLSDA in each comparison were fed into PLSDA models (20 iterations) alongside highly correlated metabolites (Pearson correlation coefficient ≥0.9) to identify key metabolic drivers of this separation (Supplemental Figure 3). Alanine, glutamine, and glycerol were important to distinguish healthy-weight SA women who developed GDM (SAC-N) from all others, whereas fatty acids were important to distinguish SAC-N from other GDM cases. Interestingly, in healthy women, aromatic and branched-chain amino acids distinguished GDM and non-GDM women between (but not

Characterization of GDM in low-risk women by BMI and ethnicity.
Orthogonal partial least squares discriminant analysis (oPLSDA) is a supervised multivariate technique that separates variation within each predictor variable based upon its linear (correlated) and orthogonal (uncorrelated) association with the outcome variable (46,47). This can provide better separation along fewer components when a large proportion of variance within the data set does not directly correlate with the outcome variable. Furthermore, through the creation of shared and unique structure (SUS) plots, it is possible to determine shared and unique factors separating the main group of interest (SAC-N) from the 2 most relevant biological comparisons [healthy-weight SA noncases (SANC-N) and healthy-weight WE cases (WEC-N)].
No significant separation of the SAC-N compared with SANC-N, SAC-N compared with high-weight South Asian cases, and SAC-N compared with WEC-N groups was identified via SUS plots with oPLSDA. Following the inclusion of BMI and age within the models, the SAC-N group was found to separate from all other groups (Supplemental Figure 4). BMI was the only variable found to be responsible for this separation with a high magnitude and reliability. Pyruvate, large high-density lipoproteins (L-HDL), and extra-large HDLs (XL-HDL) had a small impact on the separation of the SAC-N group but with a low reliability, as shown within SUS plots.
FIGURE 3 Circular bar plot of ethnically stratified analyses identifying key metabolites [variable importance in projection (VIP) ≥1] that distinguished South Asian women with gestational diabetes mellitus (GDM) from South Asian women without GDM (n cases = 286) and white Europeans (n cases = 128). Mean average VIP scores across 20 partial least squares discriminatory analysis (PLSDA) model iterations (n casesSA = 286, n casesWE = 128). Bars represent SEs. The PLSDA was run separately for SA (blue) and WE (red) women and included maternal age (years), BMI (continuous), smoking status, parity, and multiple pregnancy status. Red circular line denotes VIP cutoff of 1. No lipoproteins demonstrated a VIP >1 and were not included in the figure to preserve space. Units mmol/L unless stated. 18:2 LA, 18:2 linoleic acid; GRM, glycolysis-related metabolite; LPS, lipoprotein particle size; Tot FA, total fatty acids; VLDL_D, mean diameter of VLDL.

Association between important metabolites and gestational dysglycemia.
Overall, 8 of 146 metabolite measures were associated with fasting glucose or 2-h postglucose (Table 3), all of which were identified as GDM predictors via PLSDA or sPLSDA. The analysis in WE women demonstrated the greatest number of associations between metabolite and glucose measures. Six metabolites negatively associated with fasting glucose concentration (albumin, lactate, histidine, apolipoprotein A1, HDL cholesterol, and HDL2 cholesterol), whereas 1 positively associated with fasting glucose (mean density of LDL) (Supplemental Tables 6 and 7). Only DHA associated with 2-h post-OGTT glucose in WEs, where a 1-mmol/L increase in DHA associated with a 0.20-mmol/L increase in 2-h postglucose. In the analysis of SA women, only albumin was associated with dysglycemia, where higher albumin associated with lower concentrations of fasting glucose and 2-h post-OGTT glucose. In an additional analysis, length of residency within the United Kingdom was added to the fully adjusted model to evaluate the role of UK acculturation as a modifier of the association between albumin and glucose measures. In both models, significant associations were identified with albumin (P fasting = 0.031, β = -0.79, SE = 0.37; P 2-h = 0.028, β = -1.75, SE = 0.80). Length of residency was not found to be a significant variable in the model, but the magnitude of associations decreased slightly following its inclusion (Supplemental Tables 6 and 7). In the ethnic-combined analysis, associations between albumin, lactate, and mean diameter of LDL with fasting glucose retained significance. Adjusting for BMI as a continuous or binary variable had no impact on the associations.

Discussion
Using a prospective birth cohort with an equal proportion of WE and SA women, we identified 7 metabolite measures that characterized GDM in both WE and SA women, 4 of which were more predictive in WE women. These results agree with the Omega cohort (78.5% non-Hispanic white; nested case-control; 46 cases, 47 controls) that highlighted a distinct metabolic profile at 16 wk of gestation (comprising fatty acids, sugars, alcohols, amino acids, and organic acids) associated with future GDM diagnosis (48). Although the metabolite patterns identified by the Omega study were not predictive, our predictive multivariate analysis (and a previous univariate analysis) (1) found similar associations between GDM and many of these metabolites (i.e., amino acids, glycolysis-related metabolites, and fatty acids) and offers further evidence of ethnic-specific associations. Given the overall elevated risk of GDM observed in SA women compared with WE women, even at a healthy BMI (i.e., OR ≈3) (49), and the role of ethnicity in predicting GDM, in the present study, we sought to characterize distinct metabolic profiles of SA and WE women. Of the 146 metabolite values tested, 7 were important for stratifying GDM and non-GDM women in the overall population (lactate, mean density of VLDL particles, total fatty acids, total MUFAs, 18:2 linoleic acid, total SFA, and esterified cholesterol). Following stratification by ethnicity, alanine, glutamine, total serum cholesterol, n-6 fatty acids, PUFAs, and citrate distinguished GDM and non-GDM in WE women, whereas no metabolite values were predictive solely within SA women.
Although no metabolite value identified solely within WE women was associated with post-OGTT measures of glucose in post hoc analyses, our evidence agrees with previous work from 1) a small case-control study (26 type 2 diabetics compared with 7 controls) that reported that alanine, glutamine, and citrate distinguished diabetics from controls, with citrate being a key marker of diabetics with underlying complications (e.g., cardiovascular disease) (50), and 2) a cohort study of 431 pregnant Chinese women (12-16 wk of gestation), where alanine and glutamine were associated with GDM (51). Biologically, alanine, glutamine, and citrate are connected and could moderate dysglycemia through their interaction with the tricarboxylic acid (TCA) cycle to promote the formation of TCA intermediates, support fatty acid synthesis, and modulate glucagon and insulin secretion (52,53). Taken together, it may be that alanine and glutamine are more robust markers of dysglycemia, whereas citrate is a marker of metabolic or physiologic stress, such as that imposed by pregnancy, in diabetic individuals. The role of total cholesterol is uncertain, as it is not convincingly associated with dysglycemia (a meta-analysis of 73 observational studies found no association) (54), suggesting that associations between total cholesterol and GDM are complex and/or subject to confounding.
In the ethnic subgroup analyses, fatty acids were identified as the most important metabolite family (i.e., VIP ≥1) to characterize GDM status. In WE women and SA women, respectively, 75% and 50% of the fatty acids included within the metabolite panel were considered "important" to characterize GDM. Furthermore, in SA women, fatty acids constituted more than half of all metabolites with a VIP ≥1.
This reflects earlier work by Taylor et al. (1), which identified some evidence of ethnic-specific associations between fatty acids and GDM and agrees with molecular analyses that demonstrate that fatty acids alter insulin resistance and insulin secretion during pregnancy (55,56).
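For readers unfamiliar with the VIP ≥1 threshold used above, the following is a minimal sketch of how variable importance in projection is commonly computed from a fitted PLS model. It assumes the widely used definition in which each variable's squared, normalized weight on a component is scaled by the outcome variance that component explains; the source does not state the exact formula or software used. The weights, explained sums of squares, and metabolite names below are made-up illustrative values, not study data.

```python
import math


def vip_scores(weights, ssy):
    """Variable importance in projection (VIP) under the standard definition.

    weights[a][j]: PLS weight of variable j on component a.
    ssy[a]: sum of squares of the outcome explained by component a.
    """
    n_vars = len(weights[0])
    total_ssy = sum(ssy)
    scores = []
    for j in range(n_vars):
        acc = 0.0
        for w_a, ss_a in zip(weights, ssy):
            norm_sq = sum(w * w for w in w_a)  # squared length of the weight vector
            acc += ss_a * (w_a[j] ** 2) / norm_sq
        scores.append(math.sqrt(n_vars * acc / total_ssy))
    return scores


# Hypothetical two-component model with four metabolites (illustrative values only).
names = ["metabolite_1", "metabolite_2", "metabolite_3", "metabolite_4"]
weights = [
    [0.8, 0.5, 0.3, 0.1],  # component 1
    [0.1, 0.2, 0.6, 0.7],  # component 2
]
ssy = [6.0, 2.0]

vip = vip_scores(weights, ssy)
important = [name for name, v in zip(names, vip) if v >= 1.0]
print(dict(zip(names, (round(v, 3) for v in vip))))
print("VIP >= 1:", important)
```

Under this definition the squared VIP scores average to 1 across variables, which is why VIP ≥1 is a conventional cutoff for "more important than average".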
Furthermore, fatty acids (total MUFAs, total n-3 PUFAs, total n-6 PUFAs, total PUFAs, and DHA) were identified as key metabolic factors to distinguish healthy-weight SA and WE women who developed GDM. Interestingly, we highlighted associations of n-6 PUFAs and total PUFAs with GDM that were specific to WE women. Given the equal sample sizes between groups and that fatty acids were important to characterize ethnicity, this suggests ethnic differences in PUFA metabolism (57-59) and a role in ethnic-associated GDM risk (57,60,61). Indeed, n-6 PUFA-derived eicosanoids discriminate between type 2 diabetics and controls with good accuracy (R²X = 0.824, R²Y = 0.995, Q² = 0.779) and were proposed as mediators of dysglycemia within a Chinese population (62). Longitudinal analyses evaluating the association between changes in PUFA and eicosanoid concentrations and dysglycemia during pregnancy are required to better understand this association.
The association between VLDL_D and dysglycemia is supported by a recent hypothesis linking insulin resistance, triglyceride synthesis, and increased VLDL_D (63,64). Although we cannot disregard that VLDLs are sensitive to fasting status (65) (our participants fasted before blood collection), evidence also suggests that ethnic-specific genetic variants are associated with ethnic-specific differences in VLDL_D (66). There has been less work on the possible association between glycoprotein acetyls (a marker of systemic inflammation) and GDM, and future work is required in this area.
Lactate was one of the strongest predictors of GDM within both groups, in agreement with evidence from a case-control study in China (n = 12 GDM; n = 10 controls) (67) and pathway analyses that propose lactate as a regulator of insulin resistance and a marker of metabolic syndrome severity (68,69). Post hoc analysis demonstrated no association between glycoprotein acetyls and glucose concentrations, whereas lactate and mean diameter of VLDL were associated with fasting glucose in WE women but not SA women. The multiethnic Hyperglycemia and Adverse Pregnancy Outcome (HAPO) cohort demonstrated a similar ethnic-specific association between lactate and fasting glucose within individuals of northern European ancestry but not minority ethnic groups (48,70,71).
Of the numerous fatty acid measures that were associated with GDM, only DHA was associated with a post-OGTT measure of glucose and only in WE women. Overall, DHA is considered a protective metabolite against insulin resistance (e.g., as measured by HOMA-IR); however, recent evidence suggests high heterogeneity (56,72,73). Consistent with our findings, researchers investigating the Camden pregnancy cohort (n = 1368) reported a significant positive linear association between DHA and HOMA-IR (0.303 ± 0.152 per unit DHA %; P < 0.05) (56), whereas the DHA to Optimize Mother Infant Outcome (DOMINO) trial (n = 1990 pregnant women) reported no difference in 1-h post-OGTT glucose concentrations between DHA-supplemented mothers and controls (74). The reason for such discrepancies is unclear, but it may be that n-3 PUFAs (such as DHA) require interactions with other metabolites (e.g., vitamin D) (75) to impart an effect, and concentrations of these metabolites vary considerably between populations, seasons, and geographic regions (76-78).
The study aimed to increase and test the generalizability of results within a diverse population; however, our results may not be generalizable to other ethnic groups or geographic regions. This study also has several other limitations. First, samples were taken at a single time point before 28 wk of gestation; therefore, 1) we were unable to account for differences in fasting duration and diurnal variation, and 2) our results are not generalizable across the full term of pregnancy. Second, as with all observational studies, the effect of confounding cannot be disregarded and causality cannot be inferred. Despite this, to our knowledge, this is the first study to use a panel of multivariate statistical techniques to characterize GDM within a large prospective cohort with an equal representation of WE women and women from a non-WE population, meaning that statistical power to measure the same effect size is comparable between groups. In addition, the biological validity of the identified metabolites was tested, and many correlated with postprandial glucose measures. Although confounding cannot be eliminated, all models included known GDM confounders, and modeling of the overall metabolic differences between ethnicities was also performed to test whether metabolite profiles differed between ethnicities in relation to GDM status. Finally, diet is a contributor to metabolite concentrations, but comprehensive dietary data were not available for our analysis. Future work with comprehensive dietary records is needed to evaluate the presence of a moderating effect of diet on metabolism and GDM risk.
In conclusion, this study has identified unique and shared metabolic profiles that characterize GDM in WE and SA women. Future work exploring the moderating role of lifestyle on the metabolome and the underlying biological mechanisms driving the identified associations will provide a better understanding of the etiologic factors responsible for the heightened level of GDM risk experienced by SA women and shed light on improved prevention strategies.

Acknowledgments
BiB is only possible because of the enthusiasm and commitment of the children and parents in BiB. We are grateful to all the participants, health professionals, schools, and researchers who have made BiB happen. The authors' responsibilities were as follows: MI and MAZ designed research and provided essential materials; HF and MAZ conducted research and have primary responsibility for the final content; HF performed statistical analyses; HF, MI, JBM, and MAZ wrote the paper; and all authors read and reviewed the final manuscript.

Data Availability
Data described in the manuscript will not be made available but can be requested through Born in Bradford (https://borninbradford.nhs.uk). Analytic code will be made available upon request pending approval by the research team.