Nongenetic Determinants of Risk for Early-Onset Colorectal Cancer

Abstract
Background: Incidence of early-onset (younger than 50 years of age) colorectal cancer (CRC) is increasing in many countries. Thus, elucidating the role of traditional CRC risk factors in early-onset CRC is a high priority. We sought to determine whether risk factors associated with late-onset CRC were also linked to early-onset CRC and whether association patterns differed by anatomic subsite. Methods: Using data pooled from 13 population-based studies, we studied 3767 CRC cases and 4049 controls aged younger than 50 years and 23 437 CRC cases and 35 311 controls aged 50 years and older. Using multivariable and multinomial logistic regression, we estimated odds ratios (ORs) and 95% confidence intervals (CIs) to assess the association between risk factors and early-onset CRC and by anatomic subsite. Results: Early-onset CRC was associated with not regularly using nonsteroidal anti-inflammatory drugs (OR = 1.43, 95% CI = 1.21 to 1.68), greater red meat intake (OR = 1.10, 95% CI = 1.04 to 1.16), lower educational attainment (OR = 1.10, 95% CI = 1.04 to 1.16), alcohol abstinence (OR = 1.23, 95% CI = 1.08 to 1.39), and heavier alcohol use (OR = 1.25, 95% CI = 1.04 to 1.50). No factors exhibited a greater excess in early-onset compared with late-onset CRC. Evaluating risks by anatomic subsite, we found that lower total fiber intake was linked more strongly to rectal (OR = 1.30, 95% CI = 1.14 to 1.48) than colon cancer (OR = 1.14, 95% CI = 1.02 to 1.27; P = .04). Conclusion: In this large study, we identified several nongenetic risk factors associated with early-onset CRC, providing a basis for targeted identification of those most at risk, which is imperative in mitigating the rising burden of this disease.


For the past several decades, early-onset colorectal cancer (CRC; in persons younger than 50 years of age) has been increasing in incidence in many countries (1-10). In the United States, incidence rates of early-onset CRC differ by geographic location and have nearly doubled between 1992 and 2013 (from 8.6 to 13.1 per 100 000 persons) (5), with a preponderance of this increase due to early-onset cancers of the rectum (5,11). The recent rise in early-onset CRC has been observed particularly among individuals born during and after the 1960s in studies from the United States (5,12,13), Canada (3), Australia (1), and Japan (14), suggesting that the differential rates over time are largely attributable to changes in risk factor patterns throughout successive generations.
There is a great need to understand the factors driving the increased incidence of early-onset CRC, because approximately 1 in 10 diagnoses of CRC in the United States occurs in this age group, and these early-onset cancers tend to present with higher pathologic grade and a greater risk of recurrence and metastatic disease (7). Although genetic syndromes (15) and common genetic variants (16) are important in early-onset CRC, the prevalence in young adults of anthropometric, dietary, lifestyle, and pharmacological risk factors for CRC may contribute greatly to the secular trends in early-onset CRC, overall (1,3,5,13) and by anatomic subsite (5,11,13,17-19). Research in electronic health record databases and small-scale interview-based epidemiologic studies has pointed to potential risk factors for early-onset CRC, including greater consumption of processed meat (20), reduced consumption of vegetables and citrus fruit (20), greater body mass index (BMI) (21-24), sedentary lifestyle (25), greater alcohol use (20,21,24), smoking (21,22,24), reduced aspirin use (26), and diabetes mellitus (21). However, a comprehensive, large-scale evaluation that compares the magnitude of these risks with those for late-onset CRC (50 years of age and older) and assesses whether the risks for early-onset CRC correlate with specific CRC anatomic subsites has yet to be conducted.
By pooling data from 3 large CRC consortia, we studied whether established anthropometric, dietary, lifestyle, and pharmacological risk factors for late-onset CRC were also linked to early-onset CRC and whether these risks differed from risks for late-onset CRC. Furthermore, we explored whether these risk factors may explain the rising incidence of early-onset CRC by site-specific patterns.

Study Participants
From 3 large consortia (the Colon Cancer Family Registry, the Colorectal Transdisciplinary study, and the Genetics and Epidemiology of Colorectal Cancer Consortium), including 67 168 CRC cases and 710 377 controls, we identified epidemiologic studies that surveyed for detailed CRC risk factors and included a minimum of 20 early-onset CRC cases (younger than 50 years of age at diagnosis). The 13 studies included 3767 CRC cases and 4049 participant controls aged younger than 50 years at diagnosis of the first primary CRC for cases and age at selection for controls (Supplementary Table 1, available online) [for additional study information, see earlier publications (27-36)]. These same studies also included 23 437 CRC cases and 35 311 controls with a diagnostic or control selection age of 50 years and older (Supplementary Table 2, available online). Cases were confirmed by medical record, pathology report, or death certificate. Controls were identified based on study-specific eligibility and matching criteria, if applicable, which consisted predominantly of age and sex. Participant recruitment across all studies occurred between the 1990s and the early 2010s. Analyses were restricted to participants of genetically defined European descent. All study participants provided written informed consent, and the research was approved by their respective institutional review boards.

Statistical Analysis
Risk Factors and Overall Early-Onset Disease. Risks for colorectal cancer were assessed for 16 self-reported anthropometric, dietary, lifestyle, and pharmacological risk factors. All self-reported variables were ascertained at the reference time for each study, defined as patient selection or blood collection for cohort studies and 1-2 years prior to selection for case-control studies, to ensure exposures were assessed before cancer diagnoses. For studies that assessed height and BMI via direct measurement, variables were captured at the reference time of each respective study. To ensure comparability of variables across studies, all data underwent a multiphase, iterative harmonization process (see the Supplementary Methods, available online) (27,37). Briefly, variables were grouped into a single dataset with universal definitions, standardized coding, and acceptable values. Quality-control checks were implemented, and any values deemed outliers were truncated to a designated range for each respective variable. To address missing data for the examined risk factors, we performed sex- and study-specific mean imputation across the complete consortia dataset (Supplementary Table 3, available online).
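As a minimal illustration of the truncation and sex- and study-specific mean imputation steps described above, the following Python sketch clamps outlying values to a designated range and fills missing values with the mean observed among participants of the same study and sex. The field names, toy records, and function names are hypothetical, and the analysis itself was performed in R; this is not the consortium's harmonization code.

```python
from collections import defaultdict
from statistics import mean


def truncate(value, low, high):
    """Clamp an outlying value to the designated [low, high] range."""
    return max(low, min(high, value))


def impute_group_means(records, variable, group_keys=("study", "sex")):
    """Replace missing (None) values of `variable` with the mean of the
    observed values among records sharing the same study and sex."""
    observed = defaultdict(list)
    for rec in records:
        if rec[variable] is not None:
            observed[tuple(rec[k] for k in group_keys)].append(rec[variable])
    group_mean = {group: mean(vals) for group, vals in observed.items()}

    imputed = []
    for rec in records:
        new = dict(rec)  # leave the input records unmodified
        if new[variable] is None:
            # A stratum with no observed values stays missing (None).
            new[variable] = group_mean.get(tuple(rec[k] for k in group_keys))
        imputed.append(new)
    return imputed


# Hypothetical toy records (not study data).
records = [
    {"study": "A", "sex": "F", "fiber_g_day": 20.0},
    {"study": "A", "sex": "F", "fiber_g_day": 30.0},
    {"study": "A", "sex": "F", "fiber_g_day": None},
    {"study": "A", "sex": "M", "fiber_g_day": 18.0},
    {"study": "A", "sex": "M", "fiber_g_day": None},
]
filled = impute_group_means(records, "fiber_g_day")
```

Because every imputed participant receives the same stratum mean, this approach shrinks the variance of the imputed variable, which is why complete-case and multiple-imputation analyses are useful as checks.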
Educational attainment was defined as the highest level completed and categorized as the following: less than high school graduate, high school graduate or completed general education development, some college or technical school, and college graduate and higher. Height was represented in increments of 10 cm and captured through either self-report or direct measurement at baseline. BMI, per 5 kg/m², was estimated based on body weight (kg) and height (m²) via either self-report or direct measurement at baseline. History of diabetes was characterized as diagnosis of type 2 diabetes at baseline.
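The scaling of the anthropometric variables can be sketched as follows: BMI is weight in kilograms divided by the square of height in meters, and expressing BMI per 5 kg/m² (or height per 10 cm) means that an odds ratio refers to a 5-unit (or 10-cm) difference. The function names and input values below are illustrative assumptions, not taken from the study.

```python
def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def per_increment(value, increment):
    """Rescale a covariate so that one model unit equals `increment`."""
    return value / increment


# Hypothetical participant: 81 kg, 1.80 m, 175 cm for the height example.
bmi_value = bmi(81.0, 1.80)                # roughly 25 kg/m^2
bmi_per_5 = per_increment(bmi_value, 5.0)  # covariate in units of 5 kg/m^2
height_per_10 = per_increment(175.0, 10.0)  # covariate in units of 10 cm
```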
Smoking was defined using pack-years of smoking among current and former smokers and modeled as study- and sex-specific quartiles. Presence of a sedentary lifestyle was defined as yes (binary) if moderate and/or vigorous physical activity, leisure time, and undifferentiated activities took place less than 1 hour per week. Alcohol intake was categorized according to the grams of alcohol intake per day (14 grams is equivalent to 1 drink): less than 1 g/day (ie, nondrinker), 1-28 g/day, and more than 28 g/day. Aspirin and nonaspirin nonsteroidal anti-inflammatory drug (NSAID) use was defined as yes (binary) if regular use was reported. Dietary factors were captured using food frequency questionnaires or diet histories and included fruit intake (servings/day), vegetable intake (servings/day), red meat intake (servings/day), processed meat intake (servings/day), total calcium intake (mg/day), total folate intake (mcg/day), and total dietary fiber intake (g/day). All dietary variables were modeled as sex- and study-specific quartiles. For all variables, the referent level was the category linked to the lowest risk for CRC based on previously published studies such that the effect estimates for each factor would represent an increase in CRC risk (27,37). Family history of CRC was defined as having 1 or more first-degree relatives with CRC.
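The exposure coding described above can be sketched in Python as follows. The alcohol cut points are those stated in the text; the quartile scoring (0 to 3, ordered so that higher scores correspond to higher expected risk) follows the description of the dietary variables. How quartile cut points were actually derived (for example, from controls only or from all participants) is not stated here, so the use of `statistics.quantiles` on all values within one sex- and study-specific stratum is an assumption, and the function names are hypothetical.

```python
from bisect import bisect_left
from statistics import quantiles


def alcohol_category(grams_per_day):
    """Three-level alcohol category using the cut points in the text."""
    if grams_per_day < 1:
        return "nondrinker"
    if grams_per_day <= 28:
        return "1-28 g/day"
    return ">28 g/day"


def quartile_scores(values, higher_is_riskier=True):
    """Score each value 0-3 by quartile within one sex- and study-specific
    stratum. Scores are reversed for factors where lower intake is the
    higher-risk direction (for example, fiber)."""
    cuts = quantiles(values, n=4)  # three cut points
    scores = [bisect_left(cuts, v) for v in values]
    if not higher_is_riskier:
        scores = [3 - s for s in scores]
    return scores


# Hypothetical intakes for one stratum (not study data).
red_meat = [1, 2, 3, 4, 5, 6, 7, 8]
red_meat_scores = quartile_scores(red_meat)
fiber_scores = quartile_scores(red_meat, higher_is_riskier=False)
```

Reversing the score for protective factors keeps the referent at the lowest-risk category, so each estimated odds ratio is expressed as an increase in risk.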
We used logistic regression to assess the association between each risk factor and early-onset CRC, adjusting for age, sex, study, family history, and total energy consumption (for dietary factors) (ie, minimally adjusted models). To evaluate the independent effect of these factors on early-onset CRC risk, we used logistic regression incorporating all 16 risk factors, adjusting for age, sex, study, family history, and total energy consumption (ie, multivariable model). We also assessed these relationships for late-onset CRC following the same procedures as for early-onset CRC but additionally accounting for history of screening in the models. Notably, screening for individuals younger than 50 years was not standard practice in these regions during the period in which these patients were ascertained, except possibly in high-risk families; thus, screening history was not accounted for in early-onset models.
Potential heterogeneity across studies was accounted for using random-effects logistic regression; however, results were nearly identical to those from traditional logistic regression models, so the simpler models are presented here. Statistical assumptions and outliers were evaluated for all models and addressed when necessary. Analyses were completed using the R statistical software program version 3.5.1. All tests were 2-sided, and a P value of less than .05 was considered statistically significant.
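Odds ratios and confidence intervals from logistic regression relate to the fitted coefficient β and its standard error through OR = exp(β) and 95% CI = exp(β ± 1.96 × SE). The Python sketch below (the analysis itself used R) shows this conversion and, as an illustration only, back-calculates an approximate β and SE from the rounded NSAID estimate reported in the abstract. The function names are hypothetical, and the back-calculated values are approximations from rounded published figures, not outputs of the study's models.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a 2-sided 95% interval


def odds_ratio_ci(beta, se, z=Z_95):
    """Convert a log-odds coefficient and its SE to (OR, lower, upper)."""
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))


def beta_se_from_ci(odds_ratio, lower, upper, z=Z_95):
    """Approximate the coefficient and SE implied by a reported OR and CI."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


def wald_p_value(beta, se):
    """Two-sided P value for the Wald test of beta = 0."""
    return math.erfc(abs(beta / se) / math.sqrt(2))


# Rounded estimate from the abstract: OR = 1.43, 95% CI = 1.21 to 1.68.
beta, se = beta_se_from_ci(1.43, 1.21, 1.68)
or_est, ci_low, ci_high = odds_ratio_ci(beta, se)
p_value = wald_p_value(beta, se)
```

Because the interval is symmetric on the log scale, the round trip recovers the published bounds only up to rounding error.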

Risk Factors and Disease Site
Because time trend analyses for early-onset CRC suggest that increases in rectal cancer tend to predominate (5,11,13,17), we used multinomial logistic regression to assess the association of risk factors with early-onset rectal cancer and early-onset colon cancer. To test for differences in associations between disease subsites, we applied χ² tests to assess for contrasts in coefficients. Models were adjusted for age, sex, study, family history, and total energy consumption (for dietary factors). Further stratification by anatomic subsite, namely distal colon, proximal colon, and rectum, was also explored for associations with risk factors using a similar approach as described above.
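One common way to test a contrast between two subsite coefficients from a multinomial model is a 1-degree-of-freedom Wald χ² test on their difference. The text does not specify the exact form of the test used, so the Python sketch below is an assumption about the general technique, not a reproduction of the study's procedure; the function name and numeric inputs are hypothetical. The covariance term has to come from the fitted model, and it is typically nonzero when both subsites are compared against the same control group.

```python
import math


def wald_contrast(beta_a, beta_b, var_a, var_b, cov_ab=0.0):
    """Wald chi-square test (1 df) of H0: beta_a == beta_b.

    var_a, var_b: variances of the two coefficient estimates.
    cov_ab: their covariance, taken from the model's covariance matrix.
    Returns (chi-square statistic, P value).
    """
    diff = beta_a - beta_b
    var_diff = var_a + var_b - 2.0 * cov_ab
    chi2 = diff * diff / var_diff
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical log-odds coefficients and variances (not study estimates).
chi2, p_value = wald_contrast(beta_a=0.5, beta_b=0.1,
                              var_a=0.02, var_b=0.02, cov_ab=0.0)
```

Ignoring a positive covariance inflates the variance of the difference and makes the test conservative, so the full covariance matrix from the multinomial fit should be used where available.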
Sensitivity Analyses. Sensitivity analyses were performed to evaluate whether results obtained with the mean imputation approach were robust to the handling of missing data. We ran minimally adjusted logistic models for each individual risk factor without imputation (limited to study participants with complete data for that factor); we also ran similar multinomial logistic models to assess these risks by anatomic subsite. In addition, we applied multiple imputation with chained equations (38) to the entire early-onset study group as a second sensitivity analysis.

Risk Factors and Overall Early-Onset Disease
Early-onset CRC cases and controls were similar in reference age (45.0 years and 44.7 years, respectively), and men and women were approximately equally distributed across the 2 groups, as expected because of matching on these variables for many of the included studies (Table 1). Cases aged younger than 50 years were predominantly located in the rectum (39.8%), followed by the distal colon (32.3%) and the proximal colon (27.9%).
We found that early-onset CRC was associated with several factors previously linked to CRC overall, in minimally adjusted (Table 2) and multivariable models (Table 2 and Figure 1). In multivariable models, early-onset CRC was associated with not regularly using NSAIDs (OR = 1.43, 95% CI = 1.21 to 1.68), greater red meat intake (OR = 1.10, 95% CI = 1.04 to 1.16), lower educational attainment (OR = 1.10, 95% CI = 1.04 to 1.16), alcohol abstinence (OR = 1.23, 95% CI = 1.08 to 1.39), and heavier alcohol use (>28 g/day of alcohol; OR = 1.25, 95% CI = 1.04 to 1.50). Several other CRC risk factors trended toward an association with early-onset CRC in multivariable models, including history of diabetes and lower folate, dietary fiber, and calcium intake. Comparing risk factors between early- and late-onset CRC in multivariable models, we found that no factors appreciably exhibited a greater excess in effect size for early-onset compared with late-onset cancer (Supplementary Table 4, available online; Figure 1). However, several risk factors were suggestive of carrying greater risk for late-onset compared with early-onset CRC, including BMI, smoking, and no use of aspirin. To account for possible confounding by indication due to inflammatory bowel disease in the relationship between NSAID use and risk for early-onset CRC, a sensitivity analysis restricted to individuals without a confirmed inflammatory bowel disease diagnosis (n = 4220) was carried out, and results remained unchanged (Supplementary Table 5, available online).

Risk Factors and Disease Site
Evaluating risks for early-onset CRC by cancer subsite (Table 3), we found that not regularly using NSAIDs, greater red meat intake, lower dietary fiber intake, lower folate intake, lower calcium intake, alcohol abstinence and heavier alcohol use (>28 g/day of alcohol), and lower educational attainment were all linked to greater risk for both rectal and colon early-onset disease. Further contrasting these associations between subsites, lower total dietary fiber intake was associated more strongly with rectal (OR = 1.30, 95% CI = 1.14 to 1.48) than colon cancer (OR = 1.14, 95% CI = 1.02 to 1.27; P = .04). Several other risk factors tended toward a greater risk for rectal cancer, including no regular use of NSAIDs and lower folate intake. After further stratification across anatomic subsites (Supplementary Table 6, available online), lower total fiber intake was more closely associated with cancers of the proximal colon (OR = 1.24, 95% CI = 1.08 to 1.43) compared with those of the distal region (OR = 1.06, 95% CI = 0.94 to 1.21; P = .05).

Sensitivity Analyses
By comparing risk estimates from minimally adjusted logistic models produced using data with mean imputation (Table 2) with those generated using multiple imputation or the reduced complete case data (Supplementary Table 7, available online), we found the effect sizes were almost identical in magnitude. Similarly, effect estimates from minimally adjusted multinomial logistic models produced using data with mean imputation (Table 3) and those generated using complete case data (Supplementary Table 8, available online) were almost identical in magnitude.

Discussion
Our study, including 3767 early-onset CRC cases and 4049 controls, demonstrated that several nongenetic factors known to be involved in late-onset CRC (27,37) are also relevant for early-onset disease. In particular, not regularly using NSAIDs, greater red meat intake, alcohol abstinence and heavier alcohol use, and lower educational attainment were statistically significantly associated with early-onset CRC. Notably, this study is novel in that it statistically examined how associations between risk factors and early-onset CRC differ by subsite. In doing so, we provide the first evidence that no use of NSAIDs, lower intake of dietary fiber, and lower intake of folate may be more strongly associated with early-onset cancers of the rectum, compared with those of the colon.
Pharmacological, dietary, lifestyle, and anthropometric-related risk factors for CRC have been clearly established for late-onset disease (27,37); however, research on these factors in early-onset CRC is less developed, relying often on smaller studies and examination of a limited number of risk factors. Evidence on pharmacological factors and early-onset CRC is limited, although lower aspirin use was related to greater risk of CRC in 1 study (26). As diets have shifted considerably over the past several decades, several researchers hypothesize that dietary factors are largely driving the higher rates of CRC in younger individuals. Reduced intake of folate (20), calcium (20), and citrus fruits (20) and greater intake of processed meat (20) have been associated in some studies with greater risk of early-onset CRC. Certain lifestyle factors have also been suggested to increase one's risk for early-onset CRC, including smoking (21,22,24,39,40), a sedentary lifestyle (25), alcohol abstinence or heavy alcohol use (20,24,39), and a history of diabetes (22,40). Lastly, associations between greater BMI and risk of early-onset CRC have been inconsistently shown (22-24,26,39,40).
Our larger, comprehensive study generally tended to replicate previous reports, although some differences were noteworthy. In particular, neither BMI nor smoking was a risk factor in our early-onset series, in contrast to the late-onset group.
The recent rise internationally in early-onset CRC incidence is related, to a substantial degree, to increases in rectal cancer (5,11,13,17). Although prior work has shown that select dietary factors, including calcium and fiber intake (41), and aspirin (18,41) tend to exert greater risk over all ages combined for rectal cancer compared with colon cancer (18,19,42), studies have yet to reveal such differences for early-onset disease. However, previous studies were small or included a broader definition of early-onset CRC up to 60 years of age (18,41). Thus, our study is the first to identify statistically significant differences in early-onset CRC by disease subsite, particularly for dietary fiber and possibly for no use of NSAIDs and lower intake of folate.
Whereas early-onset CRC has been characterized by a greater preponderance of rectal cancer, temporal increases associated with birth cohort effects have also been noted (1,3,5,13), thus suggesting that risk factors strongly linked with rectal cancer and increasing in prevalence may explain the increasing rates of early-onset disease. Major shifts in dietary consumption in the past decades among younger generations are well established for the United States (43) and internationally (44), characterized typically by decreases in consumption of fruits, non-potato vegetables, and calcium-rich dairy sources, coupled with an increase in processed foods (eg, meats, pizza, macaroni and cheese) and soft beverages. Concurrent with changes in foods consumed, nutrient intakes of fiber, folate, and calcium are lower than dietary recommendations among US adolescents (43), although folate intake has likely increased because of folic acid fortification of all enriched cereal-grain products by the Food and Drug Administration beginning in 1998 (45). Furthermore, adolescent use of NSAIDs has decreased over recent generations (46). Consistent with these trends, we identified several factors, including no use of NSAIDs and lower intake of several dietary factors, that tended toward greater association with rectal compared with colon cancer. These findings may provide the first clues that generational changes in risk-related exposures may contribute to the increases observed internationally in early-onset CRC.
Figure 1. ... presented from multivariable models, which were adjusted for age, sex, study, family history, and total energy consumption; the late-onset model was additionally adjusted for history of screening. Dietary variables were harmonized across studies by sex- and study-specific quartiles, and assigned values 0, 1, 2, and 3 in the order of increasing risk. These variables were treated as continuous variables in the analysis. Error bars indicate 95% confidence intervals. BMI = body mass index; CI = confidence interval; NSAID = nonsteroidal anti-inflammatory drug; OR = odds ratio.
Our study is among the first to comprehensively assess the relationship of well-established CRC risk factors in the development of early-onset CRC. We leveraged multiple studies from heterogeneous populations, and we included rigorous harmonization across these studies of risk factors and disease phenotypes (27,37). Despite these strengths, this research also has limitations. Anthropometric, dietary, lifestyle, and pharmacological risk factors were self-reported, which may result in misclassification, although prior work has shown that self-reported lifestyle and diet are relatively accurate (47,48). Second, sex- and study-specific mean imputation for addressing missing data reduced the variance of distributions, potentially resulting in biased estimates; however, sensitivity analyses using complete case data or multiple imputation did not produce substantial differences. As with all studies using pooled data, heterogeneity stemming from study design is a potential concern; this points to the need for additional large cohort studies to assess these relationships. For case-control studies, risk factors were assessed after cancer diagnosis, which therefore makes their data susceptible to recall bias. Nevertheless, relative risks for each known risk factor (Table 2) were relatively comparable to those previously reported throughout the literature.
Further, measurement error in the dietary assessment of energy may have contributed to residual confounding for dietary factors. Prior weight loss due to CRC manifestation may have biased BMI ascertainment and may explain our null findings for BMI risk; additional analyses using prospective cohorts or Mendelian randomization methods are warranted to elucidate this association. Additionally, we note that the observed differentials in risk by disease subsite may be influenced by multiple testing and require further independent validation. Lastly, only individuals of European ancestry were included, thus limiting the generalizability of the findings. Associations may differ across racial and ethnic populations, emphasizing the need for racially and ethnically diverse cohorts, particularly as early-onset CRC occurs more commonly among Black, Asian, Pacific Islander, and Hispanic communities (49-51).
In summary, we found that a subset of established nongenetic risk factors for late-onset CRC were additionally related to early-onset CRC. Our research also provided the first evidence linking CRC risk factors to early-onset anatomic subsite patterns, specifically for lower intake of dietary fiber. These results present key insights concerning risk factors that contribute to CRC manifestation in younger individuals, providing a basis for identification of those most at risk, which is imperative in mitigating the rising burden of this disease.

Funding
This work was funded by the National Cancer Institute under R03-CA215775-02, awarded to Dr Richard Hayes, and through the Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO) funded by the National Cancer Institute, National Institutes of Health, US Department of Health and Human Services (U01 CA164930, R01 CA201407), awarded to Dr Ulrike Peters. This research was funded in part through the NIH/NCI Cancer Center Support Grant P30 CA015704 and training grant T32HS026120, from the Agency for Healthcare Research and Quality. The Colon Cancer Family Registry (CCFR, www.coloncfr.org) is supported in part by funding from the National Cancer Institute (NCI), National Institutes of Health (NIH) (award U01 CA167551). The CCFR Set-1 (Illumina 1 M/1M-Duo) and Set-2 (Illumina Omni1-Quad) scans were supported by NIH awards U01 CA122839 and R01 CA143247 (to GC). The CCFR Set-3 (Affymetrix Axiom CORECT Set array) was supported by NIH award U19 CA148107 and R01 CA81488 (to SBG). The CCFR Set-4 (Illumina OncoArray 600 K SNP array) was supported by NIH award U19 CA148107 (to SBG) and by the Center for

Notes
The role of the funders: The funders had no role in the design of the study, the writing of the manuscript, the decision to submit the manuscript for publication, and the collection, analysis, and interpretation of the data.
Disclosures: The authors have no conflicts of interest to report and assume full responsibility for all aspects of this study.