Importance of genotype for risk stratification in arrhythmogenic right ventricular cardiomyopathy using the 2019 ARVC risk calculator

Abstract
Aims: To study the impact of genotype on the performance of the 2019 risk model for arrhythmogenic right ventricular cardiomyopathy (ARVC).
Methods and results: The study cohort comprised 554 patients with a definite diagnosis of ARVC and no history of sustained ventricular arrhythmia (VA). During a median follow-up of 6.0 (3.1, 12.5) years, 100 patients (18%) experienced the primary VA outcome (sustained ventricular tachycardia, appropriate implantable cardioverter defibrillator intervention, aborted sudden cardiac arrest, or sudden cardiac death) corresponding to an annual event rate of 2.6% [95% confidence interval (CI) 1.9–3.3]. Risk estimates for VA using the 2019 ARVC risk model showed reasonable discriminative ability but with overestimation of risk. The ARVC risk model was compared in four gene groups: PKP2 (n = 118, 21%); desmoplakin (DSP) (n = 79, 14%); other desmosomal (n = 59, 11%); and gene elusive (n = 160, 29%). Discrimination and calibration were highest for PKP2 and lowest for the gene-elusive group. Univariable analyses revealed the variable performance of individual clinical risk markers in the different gene groups, e.g. right ventricular dimensions and systolic function are significant risk markers in PKP2 but not in DSP patients and the opposite is true for left ventricular systolic function.
Conclusion: The 2019 ARVC risk model performs reasonably well in gene-positive ARVC (particularly for PKP2) but is more limited in gene-elusive patients. Genotype should be included in future risk models for ARVC.


Introduction
Arrhythmogenic right ventricular cardiomyopathy (ARVC) is a heritable heart muscle disorder characterized by ventricular arrhythmia (VA) that can lead to sudden cardiac death (SCD). 1 The clinical diagnosis of ARVC is based on consensus criteria that have been developed to capture the right ventricular abnormalities typically manifested by patients. 2 The genetic architecture of ARVC is diverse, with pathogenic variants first described in genes coding for desmosomal proteins and then, more recently, in genes encoding a number of non-desmosomal proteins. 3 Although limited, current data suggest that clinical outcomes may differ between genotypes. [4][5][6] Implantable cardioverter defibrillator (ICD) therapy in patients with ARVC and documented sustained VA is associated with improved survival 7,8 and is considered a Class 1 indication in current practice guidelines. However, the indications for primary prevention ICDs in individuals with ARVC that have no history of sustained VA are less certain and until recently were based on a subjective evaluation of clinical risk markers supported by evidence of variable quality. 9 In a recent landmark study, a risk tool designed to provide individualized risk estimates was proposed 10 but the model was agnostic to the underlying genetic cause of the disease.
In this study, we hypothesized that disease aetiology influences the risk of VA in ARVC and sought to determine the performance of the 2019 ARVC risk model in a large multicentre cohort of patients stratified by genotype. During the review of this paper, a correction to the 2019 ARVC risk model was published. 11 Our analysis is based on the corrected risk score but data on the original model are presented in the Supplementary material online to illustrate the impact of the correction on its performance.

Study design and participating centres
This is an international, multicentre retrospective observational cohort study using data on consecutively evaluated patients with ARVC recruited from 17 centres in 7 countries (see Supplementary material online, Table S1). All participating centres specialize in the clinical management of cardiomyopathy patients. The study conforms to the Declaration of Helsinki and all centres have local ethical approval.

Study population
Patients were enrolled according to pre-specified inclusion criteria. Specifically: (i) a definite diagnosis of ARVC according to the 2010 taskforce criteria (TFC) 2 ; (ii) no history of sustained VA before first assessment at the participating centre; (iii) a follow-up period of at least 1 month; and (iv) age of diagnosis of 14 years or more.

Data collection and study variables
Study data were collected independently by each centre and managed using REDCap (Research Electronic Data Capture) electronic data capture tools hosted at University College London. 12 Standard data collection procedures, in accordance with general data protection regulation, were followed (see Supplementary material online, Table S2). Data were collected at each centre following review of medical and death records using variables derived from those used by Cadrin-Tourigny et al. 10 The time of diagnosis was set as the baseline timepoint. The baseline phenotypic data comprised the primary dataset used for most analyses. As not all individuals had cardiac magnetic resonance (CMR) imaging at their baseline timepoint, we collected a second dataset of all the phenotypic data at the time of a first CMR if performed during follow-up (CMR dataset). The latter was used only for data imputation and sensitivity analysis purposes (see Supplementary material online).

Genetic analysis
Clinical genetic testing is performed routinely at all participating centres using next-generation sequencing or direct sequencing of candidate genes associated with ARVC. 3 In all probands, genes that have been shown to have a moderate or strong association with ARVC were analysed 3 : plakophilin-2 (PKP2), desmoplakin (DSP), plakoglobin (JUP), desmoglein-2 (DSG2), desmocollin-2 (DSC2), transmembrane protein 43 (TMEM43), desmin (DES), and phospholamban (PLN). Genetic variants were classified according to the American College of Medical Genetics and Genomics guidelines following independent review (P.S. and A.P.). 13 Where additional evidence of pathogenicity (e.g. segregation data) for novel variants was available from the contributing centre, the variants were re-classified accordingly. A list of all the identified variants and their respective classifications is reported in Supplementary material online, Table S3. Gene-elusive and gene-positive genetic status was defined according to the absence or presence, respectively, of a pathogenic or likely pathogenic variant in any of the genes tested in each patient.
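The definition of genetic status given above can be sketched in a few lines of Python. This is an illustration only, not the code used in the study: the function names, the input format (a list of gene and classification pairs, or `None` when no genetic testing was done), and the handling of carriers of variants in more than one gene are all assumptions introduced here.

```python
# Minimal sketch of the gene-positive / gene-elusive definition.
# All names and the input format are hypothetical, not taken from the study.

OTHER_DESMOSOMAL = {"JUP", "DSG2", "DSC2"}
NON_DESMOSOMAL = {"TMEM43", "DES", "PLN"}
ACTIONABLE = {"pathogenic", "likely pathogenic"}


def genetic_status(variants):
    """Return 'not tested', 'gene-positive' or 'gene-elusive'.

    variants: list of (gene, classification) tuples, or None if untested.
    """
    if variants is None:
        return "not tested"
    if any(cls.lower() in ACTIONABLE for _, cls in variants):
        return "gene-positive"
    return "gene-elusive"


def gene_group(variants):
    """Assign a patient to one of the gene groups used in the analysis."""
    status = genetic_status(variants)
    if status != "gene-positive":
        return status
    genes = {gene for gene, cls in variants if cls.lower() in ACTIONABLE}
    if len(genes) > 1:
        # Assumption: the text does not say how multi-gene carriers were grouped.
        return "multiple genes"
    gene = next(iter(genes))
    if gene in ("PKP2", "DSP"):
        return gene
    if gene in OTHER_DESMOSOMAL:
        return "other desmosomal"
    if gene in NON_DESMOSOMAL:
        return "non-desmosomal"
    return "other"


print(gene_group([("PKP2", "Pathogenic")]))
print(gene_group([("DSG2", "Likely pathogenic")]))
print(gene_group([("DSP", "Uncertain significance")]))
```

A variant of uncertain significance alone leaves the patient in the gene-elusive group, which follows directly from the definition in the text.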

Outcomes
The primary outcome of this study was the first VA during follow-up and was a composite of: (i) spontaneous sustained ventricular tachycardia (VT), defined as VT lasting ≥30 s or with haemodynamic compromise at ≥100 b.p.m. or terminated by electrical cardioversion; (ii) ICD intervention, defined as ICD shock or anti-tachycardia overdrive pacing delivered in response to a ventricular tachyarrhythmia confirmed by intracardiac ECG data; (iii) SCD, defined as death of cardiac origin that occurred unexpectedly within 1 h of the onset of new symptoms or a death that was unwitnessed and unexpected; (iv) aborted sudden cardiac arrest, defined as SCD that is reversed by cardiopulmonary resuscitation and/or defibrillation or cardioversion. Death from any other cause and heart transplantation were also recorded.

Predictors
Potential predictors similar to those used by Cadrin-Tourigny et al. 10 were studied. All predictor variables were determined at the baseline timepoint or within 1 year of baseline but always before the arrhythmic outcome. Recent syncope was defined as cardiac syncope during the last 6 months before baseline.

General statistical methods
All data manipulation and analyses were performed using the Python programming language (Version 3.8, Python Software Foundation, https://www.python.org/). Continuous variables were tested for normality of distribution by visual inspection of histograms and statistical normality tests (Shapiro-Wilk). Normally distributed variables are expressed as mean ± SD and non-normally distributed variables as median (25th, 75th percentiles). Categorical variables are reported as counts and percentages, as appropriate. The TableOne and Scipy libraries were used for the construction of summary statistics tables and for all comparisons. 14 The Seaborn and Matplotlib libraries were utilized for data visualization. 15 The zEpid library was used to calculate incidence rates. 16 Follow-up time was calculated as the difference in age between the baseline (specific to each dataset) and the age when the study endpoint or censoring was reached. The annual event rate was calculated by dividing the number of patients reaching the endpoint by the total follow-up period for that endpoint. The cumulative probability for the occurrence of an outcome was estimated using the Aalen-Johansen estimate in order to take into account competing risks. 17,18 Competing events were defined as the occurrence of heart transplantation or non-arrhythmic death. The Lifelines library was used for all time-to-event analyses. 19 Fine-Gray regression was used to model the impact of clinical predictors on the arrhythmic outcome, in the context of competing risks. 20 Fine-Gray regression was performed through the cmprsk library from the R-project through an R to Python interface. 21 Hazard ratios and 95% confidence intervals (CIs) were reported. Bonferroni correction was used to correct P-values when multiple comparisons were made. P-values < 0.05 were considered significant.
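To make the annual event rate and the competing-risks estimate described above concrete, the following pure-Python sketch computes both on a toy dataset. It is written from scratch for illustration and is not the Lifelines or zEpid code used in the study; the function names and the event coding (0 = censored, 1 = VA, 2 = competing event such as heart transplantation or non-arrhythmic death) are assumptions.

```python
# Illustrative sketch: annual event rate and Aalen-Johansen cumulative incidence.
# Event coding is an assumption: 0 = censored, 1 = VA, 2 = competing event.


def annual_event_rate(times_years, events, cause=1):
    """Number of patients reaching the endpoint divided by total follow-up."""
    n_events = sum(1 for e in events if e == cause)
    return n_events / sum(times_years)


def cumulative_incidence(times_years, events, cause=1):
    """Aalen-Johansen estimate of the cumulative incidence of `cause`.

    Returns a list of (time, cumulative incidence) at each event time.
    """
    data = sorted(zip(times_years, events))
    n_at_risk = len(data)
    event_free = 1.0  # probability of being free of any event just before t
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d_cause = d_any = n_here = 0
        while i < len(data) and data[i][0] == t:
            n_here += 1
            if data[i][1] != 0:
                d_any += 1
            if data[i][1] == cause:
                d_cause += 1
            i += 1
        if d_any:
            # Increment uses event-free survival *before* this time point.
            cif += event_free * d_cause / n_at_risk
            event_free *= 1 - d_any / n_at_risk
            curve.append((t, cif))
        n_at_risk -= n_here
    return curve


# Toy data: four patients followed for 1, 2, 3 and 4 years.
toy_times = [1.0, 2.0, 3.0, 4.0]
toy_events = [1, 2, 0, 1]
print(annual_event_rate(toy_times, toy_events))
print(cumulative_incidence(toy_times, toy_events))
```

Treating competing events as censoring (a plain Kaplan-Meier approach) would overestimate the cumulative incidence of VA, which is why the competing-risks estimator is preferred here.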

Model validation
The corrected 5-year ARVC risk score was calculated according to the proposed formula 10,11 :

$$P(\text{VA within 5 years}) = 1 - S_0(t)^{\exp(\mathrm{LP})}$$

where $S_0(t)$ is the baseline survival probability at the time $t$, which is 0.8396 at the 5-year mark ($t = 5$), and LP is the linear predictor, calculated as 0.488 × sex − 0.022 × age + 0.657 × history of recent cardiac syncope + 0.811 × history of NSVT + 0.170 × ln(24 h PVC count) + 0.113 × sum of anterior and inferior leads with TWI − 0.025 × RVEF, where NSVT is non-sustained ventricular tachycardia; PVC, premature ventricular complex; TWI, T-wave inversion; RVEF, right ventricular ejection fraction.
Binary parameters (male sex, history of recent cardiac syncope, and history of NSVT) were considered as 1 = positive and 0 = negative. The original 5-year ARVC risk score differs in regard to the $S_0(t)$ parameter (see Supplementary material online).
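As a worked illustration of the risk score, the sketch below evaluates the linear predictor with the coefficients quoted above and converts it to a 5-year risk, assuming the usual Cox-model form 1 − S0(t)^exp(LP). The function and argument names are hypothetical, the example patient is invented, and the requirement of a strictly positive PVC count is an assumption made here because ln(0) is undefined; the text does not say how a zero count was handled.

```python
import math

# Baseline 5-year survival of the corrected model, as quoted in the text.
BASELINE_SURVIVAL_5Y = 0.8396


def linear_predictor(male, age_years, recent_syncope, nsvt,
                     pvc_count_24h, twi_leads, rvef_percent):
    """Linear predictor; binary inputs are coded 1 = positive, 0 = negative."""
    if pvc_count_24h <= 0:
        # Assumption: a positive count is required for the logarithm.
        raise ValueError("24 h PVC count must be > 0")
    return (0.488 * male
            - 0.022 * age_years
            + 0.657 * recent_syncope
            + 0.811 * nsvt
            + 0.170 * math.log(pvc_count_24h)
            + 0.113 * twi_leads
            - 0.025 * rvef_percent)


def five_year_va_risk(**patient):
    """5-year VA risk as 1 - S0(5) ** exp(LP), returned as a fraction."""
    lp = linear_predictor(**patient)
    return 1.0 - BASELINE_SURVIVAL_5Y ** math.exp(lp)


# Invented example patient, for illustration only.
example = dict(male=1, age_years=40, recent_syncope=0, nsvt=1,
               pvc_count_24h=1000, twi_leads=3, rvef_percent=45)
print(f"5-year VA risk: {five_year_va_risk(**example):.1%}")
```

Because the baseline survival is the only quantity that differs between the original and corrected models, swapping the constant is enough to reproduce the original score.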
Model validation was performed according to standard practices, assessing both discrimination and calibration. 22 The discriminatory performance of the model was assessed using the Uno concordance index as obtained by the sksurv.metrics.concordance_index_ipcw function. 23 Due to the time dependency of the outcomes, we also performed a time-dependent receiver operating characteristic curve analysis for the 5-year ARVC risk score. 24 The sksurv.metrics.roc_auc_score function was used to calculate the time-dependent area under the curve at 5 years. In order to assess the model's calibration, calibration plots were constructed using the sklearn.calibration.calibration_curve and seaborn.regplot functions. 25 Bins of equal number of patients were created. All the model-validation analyses were repeated for each gene group. 95% CIs were obtained using a bootstrap procedure with 10 000 iterations of random sampling with replacement.
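The calibration assessment described above can be sketched without any third-party library. The example groups patients into bins of equal size, compares mean predicted risk with the observed event fraction in each bin, fits a straight line through those points, and obtains a percentile bootstrap interval for the slope. It is a simplified stand-in, not the study's code: it treats the 5-year outcome as a plain binary label and therefore ignores censoring, and all function names are hypothetical.

```python
import random


def equal_count_bins(predicted, observed, n_bins):
    """Mean predicted risk and observed event fraction per equal-sized bin."""
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    points = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if chunk:
            mean_pred = sum(p for p, _ in chunk) / len(chunk)
            obs_rate = sum(o for _, o in chunk) / len(chunk)
            points.append((mean_pred, obs_rate))
    return points


def fit_line(points):
    """Ordinary least-squares slope and intercept of observed on predicted."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx  # raises ZeroDivisionError if all bins coincide
    return slope, mean_y - slope * mean_x


def bootstrap_slope_ci(predicted, observed, n_bins, n_boot=1000, seed=0):
    """Percentile 95% CI for the calibration slope (resampling patients)."""
    rng = random.Random(seed)
    n = len(predicted)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        pts = equal_count_bins([predicted[i] for i in idx],
                               [observed[i] for i in idx], n_bins)
        try:
            slopes.append(fit_line(pts)[0])
        except ZeroDivisionError:
            continue  # degenerate resample: skip it
    slopes.sort()
    lo = slopes[int(0.025 * len(slopes))]
    hi = slopes[min(len(slopes) - 1, int(0.975 * len(slopes)))]
    return lo, hi


# Toy data: perfectly calibrated predictions in two risk groups.
pred = [0.1] * 10 + [0.5] * 10
obs = [1] + [0] * 9 + [1] * 5 + [0] * 5
print(fit_line(equal_count_bins(pred, obs, n_bins=2)))
print(bootstrap_slope_ci(pred, obs, n_bins=2, n_boot=200))
```

A slope below 1, as reported for the whole cohort, means that observed event fractions rise more slowly than the predicted risks across bins.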

Missing data
Missing data were addressed for the model-validation analyses. The Missingno library was utilized to visualize missing data (Supplementary material online, Figure S1 ). 26 Missing data were assumed to be missing at random and were imputed using the multiple imputation with chained equations method. 27 The Sklearn library (impute.IterativeImputer) was utilized to perform data imputation. 28 A total of 10 imputation rounds were performed before returning the imputations computed during the final round. A round is a single imputation of each feature with missing values. Sensitivity analyses were performed (Supplementary material online, Methods).

Results
A total of 554 patients were enrolled from 17 centres. Demographic, genetic, clinical, and outcome characteristics of patients, together with missing data in either dataset, are reported in Table 1.

Arrhythmogenic right ventricular cardiomyopathy risk score validation
The corrected ARVC risk score was calculated in all 554 individuals. The median calculated corrected 5-year risk score was 17.2% (9.5%-34.3%) (see Supplementary material online, Figure S3). Overlapping cumulative incidence of VA in the estimated-risk strata is shown in Figure 2. When fitting a multivariable Fine-Gray regression model with the same predictors as the 2019 ARVC risk model, sex (P = 0.021), recent syncope (P = 0.001), number of TWIs (P = 0.001), and log value for PVC count (P = 0.004) were found to be significant predictors, whereas age at baseline (P = 0.15), NSVT (P = 0.16), and RVEF from CMR (P = 0.16) were not significant predictors of VA as shown in Figure 2. Non-sustained ventricular tachycardia was identified more commonly in patients with an ICD than in those without (143/234, 61% vs. 120/320, 38%; P < 0.001). Uno's concordance index was 0.75 (95% CI 0.70-0.81). The calibration curve revealed a slope of 0.52 (95% CI 0.37-0.71) and an intercept of −0.01 (95% CI −0.05 to 0.02), suggestive of risk overestimation (Figure 2). As expected, the corrected version of the 2019 ARVC risk model resulted in lower risk estimates and less risk overestimation as compared with the original 2019 ARVC risk model (see Supplementary material online, Figure S6).
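The comparison of NSVT frequency between patients with and without an ICD can be checked with a short calculation on the counts reported above. The text does not state which test produced the P-value; Pearson's chi-square test without continuity correction is assumed here purely for illustration, and the function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 degree of freedom, no continuity correction).

    Table layout: [[a, b], [c, d]]. Returns (statistic, P-value).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts from the text: NSVT in 143 of 234 with ICD and 120 of 320 without.
chi2, p = chi_square_2x2(143, 234 - 143, 120, 320 - 120)
print(f"NSVT with ICD: {143 / 234:.0%}, without ICD: {120 / 320:.0%}")
print(f"chi-square = {chi2:.1f}, P < 0.001: {p < 0.001}")
```

Under this assumed test the difference is well beyond the 0.001 threshold, in line with the reported result.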

Impact of genotype and sex on risk stratification
Genetic analysis was available in 447 (80.7%) of the 554 patients enrolled in the study; a pathogenic or likely pathogenic variant was identified in 290 patients. For the purposes of this analysis, four major gene groups were studied: PKP2 (n = 118, 21.3%), DSP (n = 79, 14%), other desmosomal gene (n = 59, 11%), and gene-elusive patients (n = 160, 28.8%). We did not create a subgroup for non-desmosomal gene carriers due to the limited numbers of patients.
The variable performance of the corrected 2019 ARVC risk model in different genotypes led us to hypothesize that there may be differences in the performance of individual parameters as predictors of risk among the different gene groups. Due to the limited number of events in each gene group we opted not to perform multivariable analyses, but instead conducted multiple univariable analyses of all the available clinical characteristics in each gene group. Significant differences were present among the gene groups (Figure 6). Among the clinical variables that were studied, the PKP2 group had the largest number of significant predictors and the DSP group the smallest (Figure 6). Sex was mainly a PKP2 group-related risk predictor (Figure 6, see Supplementary material online, Figure S5). Left ventricular parameters reached statistical significance almost only within the DSP group. Similarly, the utility of right ventricular parameters was limited in the non-PKP2 groups (Figure 6).

Discussion
In this multicentre cohort of patients with ARVC, we show that the corrected 2019 ARVC risk score 10,11 has a reasonable discriminative ability for VA, but it suffers from risk overestimation for VA. The analysis by genotype shows that the corrected 2019 model performs best in patients with pathogenic or likely pathogenic gene variants, especially PKP2, but is more limited in patients with an elusive genetic status. Analyses of individual potential risk predictors among the gene groups revealed significant variability reflecting differences in clinical phenotypes (Structured Graphical Abstract).

Risk stratification in arrhythmogenic cardiomyopathy
Although the notion that all patients presenting with tolerated sustained VA have a high risk of SCD has been challenged, it remains standard practice to offer them ICDs. 29 There has been less certainty about patient selection for primary prevention ICDs.
A large number of clinical predictors have been suggested as risk markers in ARVC, but supporting evidence is inadequately validated and often based on small heterogeneous cohorts. 30,31 The original 2019 ARVC risk model was the first systematic attempt to develop a validated clinical tool that provides individual risk estimates for sustained VA in patients with definite ARVC and no prior sustained VA. 10 A correction of the baseline survival probability has recently been published. 11 External validation of the corrected 2019 ARVC risk score in our cohort revealed a generally good accuracy, but we observed a much lower annual event rate and a significant overestimation of risk compared with the original paper. 10 Risk overestimation was even more pronounced in the original 2019 ARVC risk model. Other, smaller studies have reported a good performance of the model. [31][32][33] Similarly, when all the risk predictors used in the 2019 ARVC risk score were fitted to a Fine-Gray regression model, only sex, syncope, the number of TWIs, and log PVC count remained independent risk factors, whereas NSVT, age at diagnosis, and RVEF on CMR were non-significant.
To provide a close comparison, we sought to replicate the characteristics of the 2019 development cohort. Nevertheless, there may be some important biases in patient selection. For example, the participating centres in our study were predominantly cardiomyopathy units, whereas the 2019 study drew on electrophysiology-focused units, which could predispose to a more arrhythmia-prone population. 10 This can potentially explain the lower annual incidence rate of VA and thus the risk overestimation that we have observed compared with the 2019 model development study. 10 In addition, the genotype composition of the two studies differs; for example, the prevalence of PKP2 variants in our cohort is 21.7% compared with 48.9% in the 2019 study. 10 The fact that PKP2 variant carriers exhibited a higher cumulative incidence of VA than gene-elusive patients might also affect the corrected 2019 ARVC risk model performance in our cohort.

Utilizing genotype information for precision arrhythmic risk stratification
While ARVC has been uniformly defined using the 2010 TFC diagnostic criteria, it is recognized that specific genotypes associate with different phenotypic features. 2,34 For example, prognostic markers such as TWI and early age of disease onset are more commonly seen in patients with ARVC caused by desmosomal gene variants. 35 Although some rare genotypes associated with particularly arrhythmic profiles exist, 36,37 there has been limited evidence to link specific genes to arrhythmic outcome prediction. [4][5][6]38 In a recent study by the Nordic ARVC Registry, PKP2 mutation carriers showed decreased arrhythmia-free survival. 39 This is consistent with our findings suggesting a higher cumulative incidence for VA in PKP2 vs. gene-elusive patients (Figure 4).
In our cohort, we observed that gene-elusive patients have the lowest incidence of the primary outcome, whereas the incidence was similar between the major gene groups. Applied to the specific gene groups, the corrected 2019 ARVC risk model performed well in gene-positive patients, and particularly in the PKP2 group, where risk overestimation was smaller than in the total cohort and was mostly confined to the lower risk strata. In the gene-elusive group, by contrast, the model had modest performance with significant risk overestimation across all risk strata. Model performance was intermediate for the DSP and other desmosomal groups but did not differ significantly from that of the other subgroups, possibly due to smaller patient numbers. The corrected 2019 ARVC risk model should be used with caution in non-PKP2 patients.
Due to restricted subgroup sizes, we were unable to perform multivariable analyses to study individual risk markers in different genotype clusters, but in a univariable analysis we did observe gene-specific differences in the association of some of the variables used by the 2019 ARVC risk model with VA. For example, significant risk was conferred by right ventricular parameters mainly in the PKP2 group. Interestingly, sex did not reach statistical significance in any of the non-PKP2 groups ( Figure 6, see Supplementary material online, Figure S5). To our knowledge, this is the first demonstration of this phenomenon and as such generates hypotheses for future studies.
The main implication of our findings is that the incorporation of genotype is vital in future iterations of risk models in ARVC. Variables such as variant type and location and the presence of multiple variants have all been shown to affect phenotype and would likely demonstrate unique risk profiles. 4,5,39 Ethnic differences might also add to this complexity. Similarly, forms of ARVC that affect predominantly the left ventricle may require bespoke risk prediction models. 40

Limitations
The limited size of the cohort did not allow for gene-specific analyses for all genes harbouring causative variants. Even in the groups with significant differences in model performance, a limited number of events was observed and the results should be interpreted cautiously. The lack of CMR at the baseline timepoint in a significant part of the cohort challenged the application of the corrected 2019 ARVC risk score, which was addressed using data imputation. Our study population is overwhelmingly Caucasian, and therefore extrapolation to other ethnic backgrounds should be done with caution. Information on ICD programming was not analysed but all centres are expert in ICD implantation and management and offered contemporary programming strategies to all patients. That said, considering the multicentre and retrospective nature of this study, patients are likely to have been exposed to varying ICD programming practices that may have influenced the frequency of the primary arrhythmic outcome. Non-sustained ventricular tachycardia has been defined as an 'any exam diagnosis' and therefore patients with an ICD have a higher likelihood of NSVT detection than those without.

Conclusion
The corrected 2019 ARVC risk score has a reasonable discriminative ability but suffers from risk overestimation. It performs best among gene-positive patients and especially in the PKP2 subgroup, but its utility is limited in gene-elusive patients. The predictive power of individual risk markers also varies by genotype. Future iterations of risk models in ARVC should incorporate genotype data.

Supplementary material
Supplementary material will be available at European Heart Journal online.