Prospective and longitudinal evaluations of telomere length of circulating DNA as a risk predictor of hepatocellular carcinoma in HBV patients

Prospective and longitudinal epidemiological evidence is needed to assess the association between telomere length and risk of hepatocellular carcinoma (HCC). In 323 cancer-free Korean-American HBV patients with a 1-year exclusion window (followed for >1 year and did not develop HCC within 1 year), we measured the relative telomere length (RTL) in baseline serum DNAs and conducted extensive prospective and longitudinal analyses to assess the RTL–HCC relationship. We found that long baseline RTL conferred an increased HCC risk compared to short RTL [hazard ratio (HR) = 4.93, P = 0.0005]. The association remained prominent when the analysis was restricted to patients with a more stringent 5-year exclusion window (HR = 7.51, P = 0.012), indicating that the association was unlikely to be due to the inclusion of undetected HCC patients in the cohort, thus minimizing the reverse-causation limitation of most retrospective studies. Adding baseline RTL to demographic variables increased the discrimination accuracy of the time-dependent receiver operating characteristic analysis from 0.769 to 0.868 (P = 1.0 × 10⁻⁵). In a nested longitudinal subcohort of 16 matched case–control pairs, using a mixed effects model, we observed a trend of increased RTL in cases and decreased RTL in controls along 5 years of follow-up, with a significant interaction of case/control status with time (P for interaction = 0.002), and confirmed the association between long RTL and HCC risk [odds ratio (OR) = 3.63, P = 0.016]. In summary, serum DNA RTL may be a novel non-invasive prospective marker of HBV-related HCC. Independent studies are necessary to validate and generalize this finding in diverse populations and assess the clinical applicability of RTL in HCC prediction.


Introduction
Telomeres consist of small tandem nucleotide repeats (TTAGGG in humans) that form the physical ends of eukaryotic chromosomes (1). The main functions of telomeres are to stabilize and protect the linear chromosome ends (1). In normal somatic cells, telomeres shorten by  base pairs within each cell division due to end replication inefficiency of DNA polymerase (2). The homeostasis of telomeres is regulated by telomerase. The relationship between telomerase activity, telomere length maintenance and tumorigenesis is complex and remains a hot research topic (3)(4)(5).
Retrospective case-control studies have shaped the perception that short relative telomere length (RTL) confers an increased cancer risk; however, this perception has been increasingly challenged in recent studies that used prospective approaches (6)(7)(8)(9)(10)(11)(12)(13)(14)(15). Although telomere dysfunction has been significantly implicated in hepatocarcinogenesis in various basic studies (16)(17)(18)(19), to date, there has not been a report prospectively and longitudinally evaluating RTL and liver cancer risk in hepatitis B virus (HBV) patients. A recent epidemiological study reported an association of long telomeres with increased risk of hepatocellular carcinoma (HCC) (20). Consistently, we found that longer telomere length in circulating cell-free serum DNA was significantly associated with an increased risk of both cirrhosis and HCC (21,22). Nonetheless, these studies were all based on a retrospective case-control design and thus were limited by the reverse-causation issue inherent in case-control studies. Interestingly, two recent meta-analyses indicated that significant associations between telomere length and cancer risk were observed mostly in retrospective studies but not in the few prospective studies (23,24). Taken together, these contradictory findings highlight the importance of prospective evaluations of the role of telomere length in cancer risk assessment.
Another challenge associated with using telomere length as a surrogate cancer biomarker is assessing the dynamic changes of telomere length over time in blood cells (25)(26)(27). Several recent longitudinal studies suggested that cross-sectional measurements limit the analysis of blood cell telomere attrition or change rate, which are affected by a variety of factors such as the age of individuals and environmental stimuli (25)(26)(27)(28)(29). As yet, no longitudinal study has been reported that assesses the effect of time-dependent telomere length change on cancer development. Using a retrospective approach, we recently reported the associations between circulating RTL and the risk of cirrhosis and HCC (21,22). In the present study, we conducted a prospective cohort analysis and a pilot longitudinal case-control analysis to further elucidate the role of telomere length in predicting HCC risk in patients with chronic HBV infection.

Study population
Subjects in this study were identified from an existing clinic-based cohort. Patient enrollment in this cohort started in 1988 and is still ongoing. Patients were those who visited the Liver Disease Prevention Center at Thomas Jefferson University Hospital for treatment of various liver diseases, such as chronic HBV or HCV infection, fibrosis, cirrhosis or HCC. Demographic and clinical data were obtained for each patient through medical chart review and/or consultation with the treating physicians. The details on data collection and patient diagnosis were described previously (21,22). Liver cirrhosis was diagnosed mainly through imaging studies, which showed medial segment atrophy of the left lobe, caudate lobe hypertrophy, or liver nodularity for early stage disease, and signs of portal hypertension such as varices, splenomegaly, patent paraumbilical vein or ascites in advanced stage disease. Other criteria were used to complement imaging studies, including liver biopsy, laboratory findings such as thrombocytopenia, serum albumin and prolonged prothrombin time, and clinical presentations such as ascites, encephalopathy and gastrointestinal bleeding. The majority of patients have available serum samples collected at initial study entry as well as at subsequent follow-up visits. Among more than 2600 patients in this cohort, >90% were of Korean ancestry and had chronic HBV infection. To minimize the confounding of patient ethnicity and disease etiology, we restricted this study to Korean HBV patients. The patients included in the primary prospective cohort also met the following two additional criteria: (i) had a 1-year exclusion window (followed for >1 year and did not develop HCC within 1 year); and (ii) had available major demographic data including age, gender, smoking status, drinking status, family history of cancer and cirrhosis status.
For the pilot longitudinal case-control analysis, from the 37 patients who were cancer-free at the time of initial sample collection but subsequently developed HCC, we selected 16 who had at least two serum samples available within 5 years of follow-up before HCC diagnosis and whose first sample was collected >2 years before HCC diagnosis. Sixteen controls were selected from 286 HBV patients who remained cancer-free at their last follow-up and met the same criteria as the cases. Controls were frequency-matched to cases on age at first sample collection, age at last sample collection, gender, smoking status, drinking status, family history of cancer, cirrhosis status at first sample collection and follow-up time. This study was approved by the Institutional Review Board of Thomas Jefferson University. Written informed consent was obtained from each patient.

DNA isolation and RTL measurement
Circulating cell-free serum DNA was isolated from a 200 µl serum sample using the QIAamp DNA Blood Mini kit (Qiagen, Hilden, Germany) according to the manufacturer's protocol. The RTL of each DNA sample was determined by quantitative real-time polymerase chain reaction (qRT-PCR), which measured the ratio of the copy number of telomere repeats to the copy number of a human single copy gene (36B4). The detailed procedure of RTL measurement was described previously (21,22), with the following minor modifications in the present study: if the quantity of single RTL or single copy gene 36B4 was out of the acceptable range of the standard curve, or the cycle threshold (Ct) was >35, the sample was repeated. If the Ct value of RTL was >35 but the Ct value of single copy gene 36B4 was ≤35 and within the acceptable range of the standard curve, the sample was assigned to the category of short RTL when analyzed as a categorical variable, or a small value of 0.0001 when analyzed as a continuous variable. A dot blot assay was used to assess the specificity of the telomere DNA measured by qRT-PCR and was conducted as previously described (30). We quantified the telomere length of one sample in hexaplicate in a single run, and determined that the intra-assay coefficient of variation (CV) for our RTL analysis was 2.58%. We also repeated the experiments in duplicate samples in four independent runs and determined that the inter-assay CV was 4.67%. These data were comparable to the reports of others using a similar method (31).
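The ratio and the Ct rule described above can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names and argument layout are hypothetical, the standard-curve range check is omitted, and only the 35-cycle cut-off and the 0.0001 placeholder value come from the text.

```python
from statistics import mean, stdev

CT_CUTOFF = 35.0          # cycle-threshold limit stated in the text
SHORT_RTL_VALUE = 0.0001  # continuous-scale placeholder stated in the text


def relative_telomere_length(telomere_qty, telomere_ct, scg_qty, scg_ct):
    """Return RTL as telomere quantity / single copy gene (36B4) quantity.

    Hypothetical helper: returns None when the 36B4 Ct exceeds the cut-off,
    meaning the sample would have to be repeated.
    """
    if scg_ct > CT_CUTOFF:
        return None  # reference gene unreliable: repeat the sample
    if telomere_ct > CT_CUTOFF:
        return SHORT_RTL_VALUE  # telomere signal too weak: treated as short RTL
    return telomere_qty / scg_qty


def coefficient_of_variation(values):
    """Percent CV = sample standard deviation / mean * 100."""
    return stdev(values) / mean(values) * 100.0
```

Applying `coefficient_of_variation` to replicate RTL values from one run would give an intra-assay CV, and applying it across independent runs would give an inter-assay CV, under the assumption that the CV was computed from the sample standard deviation.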

Statistical analysis
Statistical analyses were performed using the SAS software version 9.3 (SAS Institute, Cary, NC) and Stata 12.0 (StataCorp, College Station, TX). To include the largest possible sample size, we performed the primary prospective analysis with a 1-year exclusion window. To minimize the confounding effects of those patients who actually had HCC, but were not diagnosed at initial sample collection, we further restricted the analyses to subcohorts of patients with a 2-year or 5-year exclusion window. The distributions of host variables and HCC risk were analyzed by chi-square test. In prospective analyses, the RTL-HCC association was estimated as hazard ratio (HR) and 95% confidence interval (95% CI) by a Cox proportional hazards regression model, using a univariate analysis and a multivariate analysis adjusting for age, gender, smoking status, drinking status, family history of cancer and cirrhosis status, where appropriate. Dose-dependent effect was analyzed using a fractional polynomial regression model adjusting for all the host variables (32). The cumulative incidence of HCC by follow-up years was derived using the Nelson-Aalen method (33). Discrimination accuracy for predicting HCC risk within 10 years after initial sample collection using RTL and/or major host characteristics was assessed by the area under the curve (AUC) of time-dependent receiver operating characteristic (ROC) curves for censored survival data (34). The differences in discrimination accuracy between different ROC models were assessed by an internal validation using 10 000 bootstrap resamples. In the pilot longitudinal analysis, to analyze the relationship between the dynamic RTL change and HCC risk, RTL values measured within 5 years after initial sample collection for 16 matched case-control pairs were plotted against follow-up time. A mixed effects model was used to analyze RTL data through an interaction of the case/control status with time, and to test the hypothesis that HCC risk is a function of longitudinal RTL change by estimating the odds ratio (OR) and 95% CI. All statistical tests in this study were two-sided, and a P value of <0.05 was considered statistically significant.
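For readers unfamiliar with the Nelson-Aalen method, the following is a minimal Python sketch of the estimator of the cumulative hazard, H(t), computed as the running sum of (events at t) / (patients still at risk at t). The function name and input layout are hypothetical, and the analyses in this study were run in SAS and Stata, not with this code.

```python
def nelson_aalen(durations, events):
    """Return [(time, cumulative_hazard)] at each distinct event time.

    durations: follow-up time per patient (e.g. years)
    events:    1 if HCC was diagnosed at that time, 0 if censored
    """
    data = sorted(zip(durations, events))
    n_at_risk = len(data)
    cumulative = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d = 0  # number of events at time t
        m = 0  # number of patients leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            d += data[i][1]
            m += 1
            i += 1
        if d > 0:
            cumulative += d / n_at_risk
            curve.append((t, cumulative))
        n_at_risk -= m
    return curve
```

One common way to express the result as a cumulative incidence is 1 − exp(−H(t)); whether that conversion was applied here is not stated in the text.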

Characteristics of the study population
There were 344 cancer-free HBV patients who met the criteria of 1-year exclusion window as described earlier. After excluding 21 patients that failed RTL measurement, 323 patients with a median age of 43.9 (interquartile 25-75% range, 37.6-51.7) were included in the primary prospective cohort. During a median follow-up of 4.5 (interquartile 25-75% range, 2.4-8.0) years, 37 of the 323 HBV patients developed HCC (incidence rate, 11.5%). Detailed distributions of host characteristics of these patients are summarized in Table 1.

The prospective association of RTL with HCC risk in HBV patients
We analyzed the association between baseline RTL and HCC risk using univariate and multivariate Cox models by categorizing RTL into two levels (dichotomization analysis using a cut-off of the median RTL value in all subjects) and three levels (tertile analysis using cut-offs of tertile RTL values in all subjects) (Table 2). In the primary prospective cohort of patients with a 1-year exclusion window, dichotomization analyses showed a significant association between longer RTL and HCC risk in both univariate analysis (HR = 6.43, 95% CI 2.68-15.43, P = 3.1 × 10⁻⁵) and multivariate analysis (HR = 4.93, 95% CI 2.00-12.13, P = 0.0005). In the univariate tertile analysis, we observed a significant dose-dependent increase in HCC risk with increasing baseline RTL. Using RTL values in the first tertile as reference, RTL in the second and third tertiles was associated with an HR for HCC of 3.31 (95% CI 0.92-11.91, P = 0.067) and 10.09 (95% CI 3.03-33.69, P = 0.0002), respectively (P for trend = 8.3 × 10⁻⁶) (Table 2). Multivariate analysis yielded very similar results (Table 2). A similar result was also noted when RTL was analyzed as a continuous variable using a fractional polynomial regression model (P = 4.5 × 10⁻¹⁴) (Figure 1). However, the trend appeared to be unstable, as reflected by a wide confidence interval, in the few patients with the highest RTLs, likely due to the small number of patients in this RTL range (Figure 1). We further restricted the prospective analyses to patients with a 2-year (258 patients) or 5-year (153 patients) exclusion window. The longer exclusion window helped minimize the confounding effect of undiagnosed HCC patients at the time of baseline sample collection. The results of analyses with longer exclusion windows were highly consistent with those of the analysis with a 1-year exclusion window (Figure 2).
Taken together, these lines of evidence suggested that RTL of baseline serum DNA might be an independent prospective HCC predictor in HBV patients. To assess the specificity of the telomeric DNA measurement in our serum samples, we conducted dot blot analysis to quantify the telomere content in all 323 samples tested in this study according to a published method (30), and analyzed the correlation between the results of the dot blot and qRT-PCR methods. We found that the dot blot assay had lower sensitivity than qRT-PCR. Twelve samples were labeled by ImageQuant as undetectable (no signal) and 151 as having high background noise. Among samples with a detected dot blot signal, we observed a high correlation between the two methods (r = −0.658, P = 6.4 × 10⁻⁴⁰) (Supplementary Figure 1, available at Carcinogenesis Online).
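The dichotomization and tertile groupings used in this section can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, the handling of values that fall exactly on a cut-off is an assumption, and the tertile cut-offs rely on the default interpolation of `statistics.quantiles`, which may differ from the rule used in SAS or Stata.

```python
from statistics import median, quantiles


def dichotomize(rtl_values):
    """Label each RTL 'long' (> median of all subjects) or 'short' (otherwise)."""
    cut = median(rtl_values)
    return ["long" if v > cut else "short" for v in rtl_values]


def tertile_labels(rtl_values):
    """Assign tertile 1, 2 or 3 using cut-offs computed from all subjects."""
    low_cut, high_cut = quantiles(rtl_values, n=3)
    return [1 if v <= low_cut else 2 if v <= high_cut else 3 for v in rtl_values]
```

The resulting group labels would then enter the Cox model as a categorical covariate, with the short (or first-tertile) group as the reference.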

HCC risk prediction models incorporating baseline RTL
We constructed 10-year time-dependent ROC curves and calculated the AUC in different models to evaluate their discrimination accuracy (Figure 3). In the primary prospective cohort with a 1-year exclusion window, the AUC was 0.678, 0.802 and 0.837 for models with RTL only, demographic variables (age, gender, smoking status, drinking status, family history of cancer and cirrhosis) only (Demo model) and RTL plus demographic variables (Full model), respectively. The increase of AUC from the Demo to the Full model was statistically significant (P = 0.0126) by bootstrap analysis (Figure 3A). Very similar findings were observed when the analyses were done in subcohorts with a 2-year (Figure 3B

Longitudinal analysis
Several recent longitudinal studies reported dynamic change in the telomere length of peripheral blood leukocytes (25)(26)(27)(28)(29). The concentration and components of circulating cell-free DNA vary over a wide range and are affected by the physiological and pathological status of individuals (37). Therefore, a single baseline RTL measurement may not be reliable enough to yield robust findings. To help address this concern, we conducted a pilot longitudinal case-control analysis by measuring all the available serum samples collected during the 5 years after baseline RTL measurement and before HCC diagnosis, from 16 matched case-control pairs. The cases and controls were adequately matched on age at first sample collection (P = 0.66), age at last sample collection (P = 0.80), follow-up time (P = 0.10), gender (P = 0.37), smoking status (P = 0.72), drinking status (P = 0.26), family history of cancer (P = 1.00) and number of RTL measurements (P = 0.12). All cases and controls had cirrhosis at sample collection (P = 1.00) (Figure 4A). Using a mixed effects model, we observed a trend of increased RTL in cases and decreased RTL in controls along 5 years of follow-up, with a significant interaction of case/control status with time (P for interaction = 0.002) (Figure 4B). We also found that, compared to patients with a longitudinal trend of decreased RTL, those with a longitudinal trend of increased RTL had a significantly higher HCC risk in both univariate (OR = 3.88, 95% CI 1.32-11.39, P = 0.014) and multivariate (OR = 3.63, 95% CI 1.27-10.35, P = 0.016) analyses (Figure 4C).
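The idea of comparing patients by the direction of their RTL trend can be illustrated with a simplified two-step sketch in Python. This is not the mixed effects model used in this study: it fits an ordinary least-squares slope per patient and then computes a crude, unadjusted odds ratio with a Woolf (log-based) confidence interval from a 2x2 table. All function names are hypothetical, and any counts passed in are for illustration rather than study data.

```python
import math


def ols_slope(times, values):
    """Least-squares slope of RTL values against follow-up time for one patient."""
    n = len(times)
    t_bar = sum(times) / n
    v_bar = sum(values) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    return sxy / sxx


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and Woolf 95% CI for a 2x2 table (all cells must be > 0).

    a: cases with increasing RTL     b: cases with decreasing RTL
    c: controls with increasing RTL  d: controls with decreasing RTL
    """
    estimate = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper
```

A patient with a positive `ols_slope` would be counted in the "increasing RTL" column. Unlike a mixed effects model, this stand-in ignores within-patient correlation, unequal numbers of measurements and covariate adjustment.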

Discussion
In this study, we used prospective and longitudinal approaches to evaluate the predictive role of serum DNA RTL in HCC risk in a clinic-based HBV patient cohort. Our results indicated that long baseline RTL conferred an elevated HCC risk in a dose-dependent manner. The RTL-HCC association remained highly significant when the analysis was conducted with a strict 5-year exclusion window, which largely eliminated the confounding effects of undiagnosed early-stage HCC patients. We also found that adding baseline RTL to common demographic and clinical variables increased the power of predicting HCC risk. Collectively, our data suggest that serum DNA RTL may be a potential novel noninvasive HCC risk predictor in HBV patients.
There is a large body of evidence suggesting that shortened telomeres may play a causal role in cancer development by inducing chromosomal instability and promoting neoplastic transformation (38). However, an increasing number of recent studies reported that abnormal telomere length, either shortened or lengthened, was linked to various cancers (23,39). Nonetheless, most of these studies adopted a retrospective case-control design using samples collected after cancer diagnosis. Thus, their conclusions were constrained by the reverse-causation limitation (23). Indeed, several more recent prospective studies failed to replicate the RTL-cancer associations reported in previous case-control studies in the same populations (40)(41)(42). These conflicting findings highlight the need for prospective and longitudinal evaluations of the causal relationship between RTL and cancer risk. We conducted an extensive literature search and identified 25 studies that prospectively evaluated the associations between telomere length and cancer risk (Supplementary Table 1, available at Carcinogenesis Online). Among these, 12 studies reported an association of increased cancer risk with long telomeres and 4 studies reported the opposite. Two studies reported a U-shaped association indicating that both very short and very long telomeres were associated with an increased cancer risk. The remaining studies did not observe a significant association. Of these studies, an elegant one (with the largest population size and longest follow-up) was reported recently by Weischer et al. (9), who prospectively followed 47 102 Danish general population participants for up to 20 years and evaluated the associations between telomere length and 24 major cancer types. They found that the risks of breast cancer and sarcoma increased with long telomere length but that there was no significant association for the remaining 22 cancers.
Taking these lines of evidence together, prospective studies seem to suggest that RTL-cancer risk associations are cancer-type specific and that each cancer should be analyzed separately. However, it is worth noting that the Danish study did not observe a significant association between RTL and liver cancer risk either, which is inconsistent with our study and might be due to the different population characteristics. For instance, the Danish study was based on a population-based cohort of generally healthy white participants with low HBV infection and low liver cancer incidence (0.15%), whereas our study was based on a clinic-based cohort of Asian American HBV patients with high HCC incidence (11.5%). The liver cancer etiologies are likely to be different, because unlike other common causes of HCC such as hepatitis C viral (HCV) infection and chronic alcoholic liver diseases that are primarily mediated by progression through liver cirrhosis, a major mechanism by which HCC may arise from chronic HBV infection is the integration of the HBV genome into the host genome, resulting in genomic aberrations that lead to oncogene activation, tumor-suppressor gene inactivation, or other predispositions to chromosomal instability (43). In two recent case-control studies, we reported that long telomeres in circulating DNAs were significantly associated with increased risks of cirrhosis and HCC in HBV patients, which was consistent with the findings of another case-control study that assessed RTL in DNA of peripheral blood leukocytes (20)(21)(22). Our finding in the present study goes a step further to confirm a prospective predictive role of circulating DNA telomeres in HCC risk. However, the mechanism underlying the paradoxical observations between blood and tissue telomere length in the development of HCC in HBV patients, as well as whether our conclusion can be generalized to other ethnic groups (non-Asian ancestry) or HCC etiologies (HCV-HCC, alcoholic HCC, etc.), remains to be further evaluated.
In the pilot longitudinal analysis, we observed a decreasing trend of RTL with age in HBV patients who did not develop HCC, which was not surprising because a negative correlation between telomere length and age has been well documented. In comparison, a significant trend of increasing RTL was noticed in patients who developed HCC after 5 years of follow-up (Figure 4). These data echo the findings of the prospective analysis, further establishing a temporal link between RTL and HCC risk. However, the result of this analysis needs to be interpreted with caution due to its unplanned nature and small size.
This study has several strengths. The unique and homogeneous Korean American HBV patient cohort enrolled in a single institute is a major strength of this study, as it minimized the confounding effects of patient ethnicity and disease etiology. Another strength is the restriction of the prospective analyses to sub-cohorts with an exclusion window of as long as 5 years, which largely minimized the confounding resulting from the possible inclusion of undetected HCC cases. Moreover, the non-invasive nature of measuring circulating cell-free DNA derived from routine blood tests makes repeated monitoring of the dynamic change of telomere length possible during patient follow-up and treatment.
Our study also has apparent limitations. First, although our study benefited from a unique prospective and longitudinal design, our number of HCC patients is modest and the findings still warrant independent validation. Second, there is a large body of basic studies linking shortened telomere length and tumorigenesis. However, there is very scarce experimental evidence supporting a link between elongated telomere length and cancer development. Although this may be partially accounted for by different characteristics and functions between tissue and blood cells (19,44-46), mechanistic studies are needed to elucidate the molecular mechanisms underlying the associations between long RTL and elevated cancer risk observed in various epidemiological studies. Third, recent studies reported that demographic and clinical variables such as obesity, glucose intolerance, des-gamma carboxyprothrombin, transaminases and AFP were associated with HCC risk (35,47-49). Since the data of our study were obtained through medical chart review, these variables were not available because they were either not completely recorded or not measured at the time of blood collection. It is also worth noting that, compared to the study of Wen et al. (35), which reported an AUC of >0.9 in a model that included age, gender, ALT and AST, our current study found much lower AUCs for models based on ALT, AST or AFP individually (Supplementary Figure 2, available at Carcinogenesis Online). The differences in predictive power were likely due to the differences between the two studies in population characteristics, sample sizes, variables that were included in the models, and cutoffs used for the variables. Therefore, it remains to be further evaluated whether RTL adds predictive value to a model that includes a more comprehensive panel of variables.
Fourth, circulating cell-free DNA is a heterogeneous mixture including DNA molecules from a wide spectrum of cell types with broad variations in content and concentrations (50). Whether the telomere DNA measured in this study is from a combination of all or some of these cell types is an important question worthy of further investigations. In our current study, because the majority of the patients are either cancer-free HBV patients or HCC patients with early-stage tumors, it is reasonable to conjecture that the circulating DNAs of these patients should mainly be derived from normal blood DNAs, thus reflecting the genetic background of the subjects. However, normal or tumor hepatocytes may also significantly contribute to circulating DNAs because in HBV patients, chronic inflammation tends to lead to repeated cycles of destruction and regeneration of liver tissues, a process that releases a large amount of apoptotic or necrotic hepatocytes into circulation. Thus, to more accurately determine the cell type origin of circulating DNA RTLs, it is important to compare RTLs between different cell types such as serum, leukocytes, tumor tissue and normal tissue that are obtained at the same time from the same patients. Another possible solution is to use high-depth next-generation sequencing technology to sequence the telomeric and subtelomeric regions using in vivo models and/or human samples (51). These directions are important to future in-depth mechanistic investigations of telomere length alteration in HCC development.
In summary, our study suggests that telomere length in circulating cell-free serum DNA may potentially be used as a prospective non-invasive marker of HCC risk in HBV patients. Independent studies are necessary to validate and generalize this finding in diverse populations and assess the clinical applicability of RTL in HCC prediction.