Liquid Biopsy Hotspot Variant Assays: Analytical Validation for Application in Residual Disease Detection and Treatment Monitoring.

BACKGROUND
Analysis of circulating tumor DNA (ctDNA) in plasma is a powerful approach to guide decisions in personalized cancer treatment. Given the low concentration of ctDNA in plasma, highly sensitive methods are required to reliably identify clinically relevant variants.


METHODS
We evaluated the suitability of 5 droplet digital PCR (ddPCR) assays targeting KRAS, BRAF, and EGFR variants for ctDNA analysis in clinical use.


RESULTS
We investigated assay performance characteristics for very low amounts of variants, showing that the assays had very low limits of blank (0% to 0.11% variant allele frequency, VAF) and limits of quantification (0.41% to 0.7% VAF). Nevertheless, striking differences in detection and quantification of low mutant VAFs between the 5 tested assays were observed, highlighting the need for assay-specific analytical validation. Besides in-depth evaluation, a guide for clinical interpretation of obtained VAFs in plasma was developed, depending on the limits of blank and limits of quantification values.


CONCLUSION
It is possible to provide comprehensive clinical reports on actionable variants, allowing minimal residual disease detection and treatment monitoring in liquid biopsy.


Introduction
In the era of precision medicine, liquid biopsy is a promising tool for identification of genetic tumor status and real-time monitoring of evolutionary tumor dynamics in plasma (1). Detection and quantification of pathogenic variants in key oncogenes including KRAS proto-oncogene (KRAS), B-Raf proto-oncogene (BRAF), and epidermal growth factor receptor (EGFR) may be used to detect therapy resistance, residual disease, and recurrence before clinical evidence in different types of solid cancer (2-5). In liquid biopsy, appropriate methods such as droplet digital PCR (ddPCR) are required to detect the lowest amounts of variants in plasma and to distinguish their signals from inherent background noise, since these are relevant for individual treatment decisions (6,7). For instance, it has been shown that the presence of tumor variants, even at very low variant allele frequencies (VAFs) of <1%, serves as an indicator of residual disease (8) and response to EGFR-directed therapy (9,10). When applying these methods for clinical interpretation, it is critical to understand and assess their performance characteristics at the lower end of the measurement scale (2,11,12). Two important performance characteristics are the limit of blank (LOB) and the limit of quantification (LOQ), each of which has a distinct definition according to guidelines (13). The LOB serves as the cutoff for a sample to be defined as positive or negative for a variant. The LOQ is the lowest VAF that can reliably be detected and accurately quantified. In addition to accurate assessment of very low VAFs, it is important to acknowledge assay variability, which is expressed by measurement uncertainty or total measurement error.
Here, we provide a large study assessing the performance characteristics of various ddPCR assays, which cover clinically relevant variants in BRAF, KRAS, and EGFR. We determined for each assay whether a result was truly negative, positive, or quantifiable, and evaluated further parameters including total measurement error, trueness, precision, and linearity. We further developed a guide for clinical interpretation of obtained VAFs, based on the LOB and LOQ values determined for each assay in the course of this work, to enable clinical interpretation of pathogenic variants in plasma for liquid biopsy-based residual disease and treatment monitoring.

SAMPLES AND DDPCR ANALYSES
Information on reference sample generation and VAF determination, DNA extraction, and ddPCR protocols is provided in the Methods in the online Data Supplement.

RELEVANT LIQUID BIOPSY VARIANTS
Liquid biopsy samples with very low measured VAFs above the LOB were detected with ≥95% specificity and are reported as mutant positive in compliance with CLSI guidelines on quantitative clinical tests (Supplemental Fig. 1, A and B) (13). Detection of the lowest variant amounts is crucial when assessing ctDNA status after surgery for prognostic purposes. Detection of ctDNA in plasma is a strong indicator of residual disease (Supplemental Fig. 1, C) (8). Liquid biopsy samples with measured VAFs ≥ limit of detection (LOD) were detected with ≥95% sensitivity, while measured VAFs ≥LOQ were detected with ≥95% sensitivity and acceptable measurement uncertainty (i.e., total error ≤50%) (Supplemental Fig. 1, A and B) (13,14). Because VAF measurements are precise at the LOQ, quantification at the LOQ and higher frequencies is possible. This enables identification of minimal but true changes in VAF abundances ≥LOQ, thereby providing information on tumor progression and response to treatment, respectively (2,3,8) (Supplemental Fig. 1, C).
LOB. The LOB was determined by measuring the fractional abundance, which represents the positive relative droplet counts for the tumor variant in all droplets of all detected alleles, in at least 60 negative control (NC) measurements (Supplemental Table 1) (13). The value of the rank of the 95th percentile of sorted results from the NC samples was determined as the LOB (Supplemental Methods, Supplemental Fig. 1, B).
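The nonparametric LOB determination can be illustrated with a minimal Python sketch. It is not the procedure from the Supplemental Methods: the function name `limit_of_blank`, the rank convention (rank = 0.5 + 0.95 × n, with linear interpolation between neighboring sorted values), and the NC values are assumptions chosen for illustration.

```python
def limit_of_blank(nc_values, percentile=0.95):
    """Nonparametric LOB: value at the percentile rank of the sorted NC results.

    Assumption: rank = 0.5 + percentile * n (1-based), with linear
    interpolation between the two neighboring sorted values.
    """
    n = len(nc_values)
    if n < 60:
        raise ValueError("at least 60 negative control measurements are expected")
    ordered = sorted(nc_values)
    rank = 0.5 + percentile * n
    lower = min(max(int(rank), 1), n)   # 1-based rank at or below the target
    upper = min(lower + 1, n)           # next rank, capped at the largest value
    frac = rank - int(rank)
    return ordered[lower - 1] + frac * (ordered[upper - 1] - ordered[lower - 1])


# Hypothetical fractional abundances (% VAF) from 60 NC measurements.
nc_results = [0.0] * 56 + [0.02, 0.04, 0.06, 0.08]
print(f"LOB = {limit_of_blank(nc_results):.3f}% VAF")
```

With 60 measurements the rank is 57.5, so the sketch returns the midpoint between the 57th and 58th sorted values. An assay whose NC measurements are all zero yields an LOB of zero.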
LOD. Determination of the LOD is critical for identifying the clinically relevant parameter, the LOQ (see below). Measured VAFs at the LOD were detected with ≥95% sensitivity. To obtain the LOD, at least 60 measurements of low mutant VAF positive control (PC) samples were performed (13) (Supplemental Table 2). The distribution of fractional abundance was assessed using probability quantile-quantile plots. Bartlett's test was used to test the consistency of the standard deviation of the fractional abundances across samples with different mutant VAFs (i.e., whether samples had equal variances) (Supplemental Table 3) (15). Depending on the result distribution and the pooled standard deviations (SD_S) of each independent test assay, a parametric or nonparametric approach was selected to determine the LOD (Supplemental Methods).
LOQ. In addition to the criteria at the LOB (specificity ≥95%) and the LOD (sensitivity ≥95%), at the LOQ the total error of the measurement results was required to be ≤50% with a 95% confidence interval to allow accurate VAF quantification during treatment monitoring (13,14). The LOQ was determined by measuring the mutant VAFs in >40 replicate measurements according to guidelines (13).

LINEARITY
Confirmation of assay linearity was critical to ensure precise VAF quantification across the entire measurement scale (i.e., from LOQ to 100%). This way, the entire VAF range in plasma of cancer patients was covered (16) and under- or overestimation of VAFs was avoided. The linear range was established by VAF measurements of 7 PC samples in 2 to 3 replicates with evenly distributed mutant VAFs (from the lowest VAF determined by the LOQ up to 100% VAF) for each of the 5 assays (17) (Supplemental Table 5). Linearity was assessed using polynomial regression analysis for first-, second-, and third-order polynomials. To assess whether the fractional abundance equaled the mutant VAF, the difference between the best-fitting polynomial and the ideal linearity (y = x) was required to be ≤10%. Further, to assess whether replicate measurements used for the determination of linearity were representative, repeatability of these measurements was assessed by calculating the pooled SD (SD_r) (Supplemental Methods) (17).
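The linearity check can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the analysis used in this study: instead of selecting the best-fitting order by a statistical test, the sketch conservatively requires the first-, second-, and third-order fits all to stay within 10% of y = x; the deviation is taken relative to the expected VAF; VAFs are expressed as fractions (0 to 1) for numerical stability; and all function names and data are hypothetical.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit; returns [c0, c1, ...] for c0 + c1*x + ..."""
    m = degree + 1
    # Normal equations (X^T X) c = X^T y for the Vandermonde design matrix X.
    a = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, m):
            factor = a[row][col] / a[col][col]
            for k in range(col, m):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def max_deviation_from_identity(expected, measured, degree):
    """Largest relative gap between the fitted polynomial and y = x."""
    coeffs = polyfit(expected, measured, degree)
    worst = 0.0
    for x in set(expected):
        fitted = sum(c * x ** k for k, c in enumerate(coeffs))
        worst = max(worst, abs(fitted - x) / x)
    return worst


def is_linear(expected, measured, tolerance=0.10):
    """True if the fits of order 1 to 3 all stay within the tolerance of y = x."""
    return all(
        max_deviation_from_identity(expected, measured, d) <= tolerance
        for d in (1, 2, 3)
    )


# Hypothetical data: 7 VAF levels (as fractions), 2 replicates each.
levels = [0.005, 0.01, 0.05, 0.10, 0.25, 0.50, 1.00]
expected = levels + levels
measured = [0.97 * x for x in levels] + [1.03 * x for x in levels]
print("linear within 10%:", is_linear(expected, measured))
```

Fitting on fractions rather than percentages keeps the normal equations well conditioned for the third-order polynomial. A production analysis would more likely rely on an established numerical library.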

TRUENESS, PRECISION AND TOTAL ERROR
Trueness, representing the closeness of measurement results to the true value, was indirectly determined by calculating the bias for replicate measurements of PC samples with mutant VAFs at the LOQ of each independent assay (Supplemental Methods). To provide reliable results, the bias should not exceed 10%, leading to trueness ≥90% (14).
The precision of each ddPCR assay, representing the closeness of repeated measurement results, was determined in terms of repeatability and intermediate precision, considering intra- and interapproach precision. Both parameters were calculated for replicate measurements of PC samples with mutant VAFs at and above the LOQ of the intended assay (Supplemental Table 6).
Repeatability, intermediate precision, and total error were calculated for 3 different VAF intervals: at the LOQ, from the LOQ to 10% mutant VAF, and from 10% to 100% mutant VAF. Repeatability was indirectly determined based on the SD_r (Eq. 6, Supplemental Methods) of all replicate measurement results obtained within one experiment. Intermediate precision was indirectly determined based on the intermediate SD (SD_IP, Eq. 10, Supplemental Methods). To provide reliable results, both SD_r and SD_IP should be less than 20% for mutant VAFs at the LOQ and less than 15% for mutant VAFs above the LOQ, leading to both repeatability and intermediate precision ≥80% for mutant VAFs at the LOQ and ≥85% for mutant VAFs above the LOQ (18).
The total error was calculated for a 95% confidence interval by combining bias with 2 times the pooled SD (SD_S) (13) (Supplemental Methods). The goal for the total error was set according to the acceptance criteria for trueness and precision and should therefore not exceed 50% for replicate measurements of PC samples at the LOQ and 40% for low, medium, and high mutant VAFs (14,18).
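As a worked illustration of the total error calculation, the following Python sketch combines the absolute bias with 2 times the pooled SD and expresses the result as a percentage of the expected VAF. The exact equations are given in the Supplemental Methods and are not reproduced here; the additive form |bias| + 2 × SD, the pooling by degrees of freedom, the function names, and the replicate values are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def pooled_sd(runs):
    """Pooled SD across runs, weighting each run's variance by its degrees of freedom."""
    dof = sum(len(run) - 1 for run in runs)
    return sqrt(sum((len(run) - 1) * variance(run) for run in runs) / dof)


def total_error_percent(expected_vaf, runs):
    """Total error (|bias| + 2 * pooled SD) as a percentage of the expected VAF."""
    all_values = [value for run in runs for value in run]
    bias = mean(all_values) - expected_vaf
    return 100.0 * (abs(bias) + 2.0 * pooled_sd(runs)) / expected_vaf


# Hypothetical replicate measurements (% VAF) of a PC sample with 0.50% expected VAF.
runs = [[0.45, 0.55], [0.50, 0.60]]
te = total_error_percent(0.50, runs)
print(f"total error = {te:.1f}%; meets the 50% criterion at the LOQ: {te <= 50.0}")
```

In this example the bias is 0.025% VAF and the pooled SD is about 0.071% VAF, giving a total error of about 33% of the expected VAF, which would satisfy the 50% criterion stated for the LOQ.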

Results
In this study we focused on 5 Bio-Rad ddPCR™ Mutation Detection Assays for analysis of BRAF p.V600E, EGFR p.T790M, EGFR p.L858R, KRAS p.G12[A/C/D/R/S/V], and KRAS p.G13D variants, and the 15 most frequent EGFR exon 19 small deletions, which are clinically relevant for patients with colorectal cancer and non-small cell lung cancer (4,5). Determination of their key performance characteristics (LOB, LOD, LOQ, and linearity) was performed in accordance with the current guidelines for analytical validation of quantitative tests (13,17,19,20) (Fig. 1, online Data Supplement).

LOB OF ASSAYS
To determine the LOB, a nonparametric approach was used, since fractional abundance measured with ddPCR did not show a normal distribution. Accepting 95% specificity, the LOB was obtained by calculating the 95th percentile of obtained values. For the EGFR p.L858R and EGFR exon 19 deletions assays, the LOB was confirmed at approximately 0.00%, for BRAF p.V600E at 0.02%, for EGFR p.T790M at 0.08%, and for KRAS p.G12/p.G13 at 0.11% mutant VAF (Fig. 2, Table 1, Supplemental Fig. 2). Taken together, these results showed that the background noise of the assays would be expected to be 0 or close to 0 for the EGFR p.L858R, EGFR exon 19 deletions, and BRAF p.V600E assays, low for the EGFR p.T790M assay, and modest (but still acceptable) for the KRAS p.G12/p.G13 assay. As mentioned above, measured VAFs > LOB are interpreted as "detected" in plasma and indicate residual disease (8).

LOD OF ASSAYS
The LOD is the VAF at which 95% sensitivity was achieved (Supplemental Fig. 1, B). From a clinical point of view, the LOD is not a cutoff for interpretation, in contrast to the LOB and LOQ (13) (Supplemental Fig. 1, A and C). Rather, the LOD was a first step in identifying the clinically relevant LOQ. The EGFR p.L858R assay had the lowest LOD of all 5 assays (0.08%), followed by the EGFR exon 19 deletions assay (0.1%), the EGFR p.T790M assay (0.18%), and the KRAS p.G12/p.G13 assay (0.26%); the LOD was highest for the BRAF p.V600E assay (0.4%) (Fig. 2), showing an up to 5-fold difference between the LODs across assays.

LINEARITY OF ASSAYS
Linearity describes the ability of a method to provide results that are directly proportional to the VAF of a mutant variant (17). Briefly, the coefficients of variation for these replicate fractional abundance measurements were 6.30% in case of the EGFR p.  Fig. 4).

TRUENESS, PRECISION, AND TOTAL ERROR OF ASSAYS
We further determined trueness at the LOQ, as well as precision and total error within the linear range from LOQ to 100% VAF for each ddPCR assay by using As expected, repeatability and intermediate precision were higher for higher mutant VAFs than for lower mutant VAFs, which is accordingly reflected in the total error. (Fig. 3). The analytical validation performed within this study enabled precise differentiation between negative and positive results (LOB) and quantification of the VAF (above the LOQ). Using liquid biopsy, we observed tumor progression (Fig. 3, A) and response to treatment (Fig. 3, B). Figure 4 provides an overview of the time points during treatment at which measurement of mutant variants by liquid biopsy is of clinical relevance. In clinical practice, determination of residual disease after surgery with liquid biopsy assays improves classification of patients into high- and low-risk groups, especially for borderline patients.
Here, the detection of VAFs above or below the LOB is an indicator of the presence or absence of residual disease, respectively (Supplemental Fig. 1, A) (21). Similarly, during follow-up, liquid biopsy enables a more accurate determination of prognosis. Thus, follow-up intervals or new therapies can be planned more precisely. During chemotherapy, liquid biopsy can provide an indication of treatment response or tumor progression by assessing tumor burden through monitoring of VAFs ≥LOQ (Supplemental Fig. 1, A).
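The interpretation rule described above can be summarized in a short Python sketch. The function name `interpret_vaf` and the result labels are hypothetical. The thresholds follow the text: a measured VAF above the LOB counts as detected, and a measured VAF at or above the LOQ counts as quantifiable. The example uses the LOB and LOQ reported for the KRAS p.G12/p.G13 assay.

```python
def interpret_vaf(measured_vaf, lob, loq):
    """Classify a measured VAF (%) against assay-specific LOB and LOQ values (%)."""
    if measured_vaf <= lob:
        return "not detected"
    if measured_vaf < loq:
        # Variant present, but the VAF cannot be reported as a number.
        return "detected (pos.), not quantifiable"
    return "detected and quantifiable"


# LOB and LOQ (% VAF) reported for the KRAS p.G12/p.G13 assay.
KRAS_LOB, KRAS_LOQ = 0.11, 0.41
for vaf in (0.05, 0.20, 0.41, 3.5):
    print(f"{vaf:.2f}% VAF: {interpret_vaf(vaf, KRAS_LOB, KRAS_LOQ)}")
```

Because the limits differ between assays, the LOB and LOQ are passed as arguments rather than fixed in the function.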

Discussion
In this study, we evaluated the ability of 5 ddPCR-based assays to reliably detect and quantify clinically relevant BRAF, EGFR, and KRAS tumor variants in ctDNA. We performed validation of the BRAF p.V600E, EGFR exon 19 deletions, EGFR p.T790M, EGFR p.L858R, and KRAS p.G12/p.G13 assays, and confirmed the reliability and accuracy of variant detection and quantification for clinical use. In doing so, we followed international guidelines for quantitative tests (13,14,17,19,20). We further provide a guide for how to apply the determined detection and quantification limits in liquid biopsy reports on residual disease and monitoring approaches, and highlight the clinical meaning of measured VAFs (Fig. 4).
Although other studies have applied ctDNA-based ddPCR for the detection of tumor variants occurring at low VAFs in plasma, they did not address all the parameters required for VAF quantification. The most detailed validation protocol for ddPCR-based liquid biopsy assays was provided by Milosevic et al. (22), but there were some limitations. First, they analyzed mutant copies/mL plasma for determining the performance parameters, instead of the fractional abundance of mutant alleles in a background of wild type alleles, which is the crucial clinical parameter for the interpretation of findings. Second, all performance parameters were obtained based on the assumption that data were normally distributed, which is not true for the LOB since no values less than zero can be obtained. Third, linearity determination was performed based on DNA concentration, but not on the mutant VAF, which should be reported in disease monitoring (23).
Fig. 3. Disease monitoring in 2 patients with colorectal cancer. Several samples were analyzed throughout the course of disease of 2 patients with colorectal cancer. Day 0 indicates the start of therapy. Measured mutant variant allele frequencies (VAFs) greater than the limit of quantification are labeled with the respective VAF and 95% confidence interval. Measured mutant VAFs between the limit of blank and the limit of quantification are labeled with "pos." to emphasize that the mutant variant was detected but the VAF could not be quantified. (A), the VAF of KRAS proto-oncogene (KRAS) single nucleotide variants (SNVs) was analyzed in 9 circulating-free DNA (cfDNA) samples from a patient with the KRAS p.G13D mutant variant. Plasma samples from this patient were collected over a period of 1.5 years. (B), the VAF of B-Raf proto-oncogene (BRAF) p.V600E was analyzed in 11 cfDNA samples from a patient with the respective variant present in tumor tissue. Plasma samples from this patient were collected over a period of 2.5 years. LOB, limit of blank; LOQ, limit of quantification.
Our validation involved analysis of reference samples with known mutant VAFs from 0% to 100% (Supplemental Methods) to stratify detection and quantification. We observed substantial differences between the 5 ddPCR assays in the assessment of the LOB. While the EGFR exon 19 deletions, the BRAF p.V600E, and the EGFR p.L858R assays produced no or almost no false positive signals, the EGFR p.T790M and the KRAS p.G12/p.G13 assays generated numerous false positive signals (i.e., LOB of 0.08% and 0.11%, respectively). Differences in LOD and LOQ were also observed between the 5 ddPCR assays. The EGFR p.L858R and EGFR exon 19 deletions assays revealed the lowest LOD (0.08% and 0.1%, respectively), which combined with an LOB of zero indicates the capacity of both assays to detect true positive signals at very low VAFs. In contrast, the BRAF p.V600E assay showed the highest LOD (0.4%) despite an LOB of 0.02%. We found that the fractional abundance measured by the BRAF p.V600E assay was lower than the mutant VAF of the PC sample used, which could explain the higher LOD of this assay. Finally, the KRAS p.G12/p.G13 and the EGFR p.T790M assays showed intermediate LOD thresholds (0.26% and 0.18%, respectively), which combined with the high LOB (0.11% and 0.08%, respectively) indicate that it is possible to reliably detect relatively low VAFs, although both assays will accumulate false positive signals. Consequently, we conclude that having a high sensitivity is not directly associated with the ability to detect variants at low VAFs.
The lowest LOQ was determined at a mutant VAF of 0.41% for the KRAS p.G12/p.G13 assay, while the highest LOQ was determined at a mutant VAF of 0.7% for the EGFR exon 19 deletions assay. These results again emphasize that the sensitivity of an assay (expressed by the LOD) is not an indicator of the assay's suitability for the quantification of variants at low VAFs. To the best of our knowledge, this has not been emphasized in other publications using ddPCR liquid biopsy assays. Thus, the differences in LOB, LOD, and LOQ across assays highlight the need for determining these performance characteristics for each single mutant variant assay individually to enable precise clinical interpretation for each variant. Notably, many available assays covering multiple mutant variants provide only mean values across all variants, which does not account for differences in performance across mutant variants (24-26). For instance, Woodhouse et al. determined an LOB of a next-generation sequencing panel at 0.013% VAF, although the exact LOB was either higher or lower for most variants (26). This leads to false positive or false negative results on mutant variant detection and hence to incorrect interpretation of residual disease status.
Since mutant VAFs will vary during disease progression and/or treatment (27), we addressed whether precise measurement results could be repeatedly obtained for the 5 assays (expressed by trueness, precision, and linearity). Trueness and precision were determined as the variability in the total error of replicate measurements of reference samples with known mutant VAFs from the LOQ to 100% VAF (Supplemental Methods). This is crucial since wrong error estimation would lead to misinterpretation of the VAF during disease monitoring. The highest precision was detected within the interval from 10% to 100% mutant VAF (>98% for all 5 assays) and the lowest was detected at the LOQ values (>70.5% for all 5 assays). Overall, quantification of the mutant VAF within the acceptable confidence interval is recommended by the European Medicines Agency guidelines for bioanalytical method validation (18), and was possible in all 5 assays.
Finally, all 5 assays showed linearity of the measurements. Measured fractional abundances were directly proportional to the actual mutant VAFs, meaning the mutant VAF of the samples can be reliably determined. These important findings verify that direct comparison of measurements performed at different time points during treatment of the patient is feasible. Yet, besides the analytical variability, the biological variability might also influence the results obtained. Therefore, knowledge of potential confounding factors, such as exact sampling time, is critical, since for example an increase in cfDNA concentrations has been shown during and after surgery (28). To minimize the effect of biological variability, analysis of follow-up samples from a patient should ensure correct interpretation of results.
In the future, it will be important to prove the clinical validity of each applied liquid biopsy assay. Therefore, longitudinal patient samples should be analyzed using the assays validated within this study and results should be compared to clinical evidence of recurrence and status of disease progression. In this context, numerous studies suggest that predictions about residual disease and tumor progression based on liquid biopsy assays generally reflect the actual disease status (Fig. 4) (2,3,6,21,29).
Taken together, the hotspot variant-specific ddPCR assays are well-designed genetic tests to reliably detect and quantify EGFR exon 19 deletions, EGFR p.L858R, EGFR p.T790M, BRAF p.V600E, and KRAS p.G12/p.G13 variants at very low VAFs. ddPCR is a fast and cost-effective method for the analysis of tumor variants, which can be easily implemented into clinical practice. We demonstrate that analytical validity is essential and needs to be independently established for all clinical grade assays to identify assay-specific performance thresholds. Overall, with our findings we are able to provide a comprehensive actionable report for patients and enable disease monitoring.

Supplemental Material
Supplemental material is available at Clinical Chemistry online.
Author Contributions: All authors confirmed they have contributed to the intellectual content of this paper and have met the following 4 requirements: (a) significant contributions to the conception and design, acquisition of data, or analysis and interpretation of data; (b) drafting or revising the article for intellectual content; (c) final approval of the published article; and (d) agreement to be accountable for all aspects of the article thus ensuring that questions related to the accuracy or integrity of any part of the article are appropriately investigated and resolved.