SAIL study of stroke, systemic embolism and bleeding outcomes with warfarin anticoagulation in non-valvular atrial fibrillation (S4-BOW-AF)

Abstract
Aims: In patients with non-valvular atrial fibrillation (NVAF) prescribed warfarin, the association between guideline-defined international normalised ratio (INR) control and adverse outcomes is unknown. We aimed to (i) determine stroke and systemic embolism (SSE) and bleeding events in NVAF patients prescribed warfarin; and (ii) estimate the increased risk of these adverse events associated with poor INR control in this population.
Methods and results: Individual-level population-scale linked patient data were used to investigate the association between INR control and both SSE and bleeding events using the National Institute for Health and Care Excellence (NICE) criteria of poor INR control [time in therapeutic range (TTR) <65%, two INRs <1.5 or two INRs >5 in a 6-month period or any INR >8]. A total of 35 891 patients were included for SSE and 35 035 for bleeding outcome analyses. Mean CHA2DS2-VASc score was 3.5 (SD = 1.7), and the mean follow up was 4.3 years for both analyses. Mean TTR was 71.9%, with 34% of time spent in poor INR control according to NICE criteria. SSE and bleeding event rates (per 100 patient years) were 1.01 (95%CI 0.95–1.08) and 3.4 (95%CI 3.3–3.5), respectively, during adequate INR control, rising to 1.82 (95%CI 1.70–1.94) and 4.8 (95%CI 4.6–5.0) during poor INR control. Poor INR control was independently associated with increased risk of both SSE [HR = 1.69 (95%CI 1.54–1.86), P < 0.001] and bleeding [HR = 1.40 (95%CI 1.33–1.48), P < 0.001] in multivariable Cox models.
Conclusion: Guideline-defined poor INR control is associated with significantly higher SSE and bleeding event rates, independent of recognised risk factors for stroke or bleeding.


Introduction
Historically, vitamin K antagonist (VKA) therapy has been the anticoagulant of choice to reduce the risk of stroke in patients with non-valvular atrial fibrillation (NVAF). 1 However, successful VKA therapy has important practical limitations, including regular monitoring of patients' international normalised ratio (INR) due to variability of control. 2,3 The target INR range is 2-3 (unless otherwise indicated) [4][5][6][7][8] with net clinical benefit closely related to the proportion of time that INRs remain in this range [time in therapeutic range (TTR)]. [9][10][11] Subtherapeutic INRs are associated with increased risk of stroke and systemic embolism (SSE), while supratherapeutic INRs increase bleeding risk. 9,12,13 Guidelines stress the importance of assessing INR control, achieving acceptable TTR, and re-evaluating therapy if adequate control cannot be achieved. The UK National Institute for Health and Care Excellence (NICE) defines poor anticoagulation as any of the following: (i) TTR <65%; (ii) two INR values >5 or one >8 within the past 6 months; and (iii) two INR values <1.5 within the past 6 months. 6 The European Society of Cardiology (ESC) 14 and United States (US) 5 guidelines recommend a TTR of ≥70%. Major clinical guidelines now recommend that anticoagulation with direct oral anticoagulants (DOACs) should also be considered where appropriate. 5,6,14 We have previously demonstrated in a large population that a considerable proportion of patients exhibited suboptimal INR control according to NICE and ESC guideline criteria. 15 However, the magnitude of the impact on major adverse outcomes of suboptimal control according to NICE clinical guideline criteria (which include 'High/Low' INR criteria as well as TTR) has not been demonstrated.
We were also interested to see if outcomes differed in those with evidence of poor control according to NICE criteria to those with inadequate control according to ESC and US criteria, which only consider the TTR.
We aimed to (i) determine SSE and bleeding event rates in patients prescribed warfarin for NVAF, and (ii) estimate the incremental risk of these adverse events associated with poor INR control (defined using NICE and ESC/US guideline criteria), accounting for patient clinical and demographic characteristics.

Methods
A population-scale retrospective observational cohort study was conducted using individual-level linked anonymised routine electronic health record data sources for patients prescribed warfarin for NVAF in Wales, United Kingdom, between January 2006 and December 2017 using the Secure Anonymised Information Linkage (SAIL) Databank. [16][17][18]

Cohort selection
Patients eligible for the study had a diagnosis of AF/atrial flutter recorded in their primary care record at any point before or during the study period (2006-2017) and were ≥18 years old at time of diagnosis. Patients were excluded if they had valvular AF (AF in the presence of mitral stenosis, rheumatic mitral valve disease, prior mitral valve surgery, metallic prosthetic heart valve) or venous thromboembolism (VTE) including deep vein thrombosis (DVT) and pulmonary embolism (PE) before, or within 6 months of inclusion, or were pregnant during the study period. Patients who were subsequently diagnosed with a VTE or valvular AF after 6 months from entry into the study were censored at the point of new diagnosis.
The cohort was restricted to those prescribed warfarin and with ≥6 months of recurrent INR tests recorded in their primary care record during the study period (excluding the first 6 weeks after commencing treatment, whilst the warfarin dose is typically being tailored to patient requirements).

Temporal calculation of INR control
Individual TTR was calculated at each INR result within 6-month rolling windows using the modified Rosendaal method. 19 We also stipulated that there be: (i) at least four INR readings within each 6-month period; (ii) no gap >12 weeks between consecutive INR readings, and (iii) a gap of at least 3 months between the first and last INR reading within each 6-month window (supplementary material Temporal INR control, Figure 1 & 2A-D); and (iv) at least one warfarin prescription in any 12-week window. 15 Based on these criteria, an algorithm was developed to allow the temporal calculation of INR control, and assign patients to 'unknown,' 'adequate,' or 'poor' INR control status at each INR reading. Two established guideline criteria for poor INR control were assessed: Firstly, NICE criteria for poor INR control, one (or more) of the following (i) TTR <65%; (ii) two INR values higher than 5 within the prior 6 months; or a single INR value higher than 8; and (iii) two INR values less than 1.5 within the past 6 months. Secondly, ESC/US criteria for poor INR control, defined as periods of TTR <70% (secondary analysis).
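The TTR calculation described above can be illustrated with a minimal sketch of the standard Rosendaal linear-interpolation approach. This is not the authors' algorithm: the modifications they applied are not detailed in this section, and the function name `rosendaal_ttr`, the input layout, and the example readings are assumptions made for illustration only.

```python
from datetime import date, timedelta


def rosendaal_ttr(readings, low=2.0, high=3.0):
    """Percent of days with interpolated INR inside [low, high].

    readings: list of (date, inr) tuples sorted by date (hypothetical layout).
    Returns None when fewer than two dated readings are available.
    """
    total_days = 0
    days_in_range = 0.0
    for (d0, v0), (d1, v1) in zip(readings, readings[1:]):
        span = (d1 - d0).days
        if span <= 0:
            continue  # ignore same-day duplicates
        total_days += span
        if v0 == v1:
            frac = 1.0 if low <= v0 <= high else 0.0
        else:
            # INR is assumed to change linearly between readings, so the
            # share of days in range equals the share of the INR segment
            # that overlaps the target range.
            seg_lo, seg_hi = min(v0, v1), max(v0, v1)
            overlap = max(0.0, min(seg_hi, high) - max(seg_lo, low))
            frac = overlap / (seg_hi - seg_lo)
        days_in_range += frac * span
    if total_days == 0:
        return None
    return 100.0 * days_in_range / total_days


# Illustrative readings only (not study data).
start = date(2017, 1, 1)
example = [
    (start, 2.5),
    (start + timedelta(days=10), 2.5),
    (start + timedelta(days=30), 4.5),
]
print(rosendaal_ttr(example))  # 50.0
```

In the example, the first 10 days are fully in range and one quarter of the following 20 days is in range, giving 15 of 30 days.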
Patients could move between adequate, poor, and unknown control status; assignment of patients to unknown status would result in temporary exclusion from analysis for any period during which there were insufficient INR results or no warfarin prescriptions available. Patients could re-enter the analysis when another six-month window became available with sufficient INR results and warfarin prescriptions for evaluation.
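The assignment to 'unknown', 'adequate', or 'poor' status can be sketched as follows, using the data-sufficiency rules and guideline thresholds stated above. The function name `control_status`, the treatment of "3 months" as 90 days, and the reduction of the prescription rule to a single boolean flag are assumptions for illustration; the authors' algorithm may differ in these details.

```python
from datetime import date, timedelta


def control_status(readings, ttr, prescription_ok=True, criteria="NICE"):
    """Classify one 6-month window of INR readings.

    readings: list of (date, inr) sorted by date within the window.
    ttr: TTR (%) for the window, or None if it could not be calculated.
    prescription_ok: hypothetical flag standing in for the warfarin
        prescription requirement.
    """
    dates = [d for d, _ in readings]
    values = [v for _, v in readings]
    sufficient = (
        prescription_ok
        and ttr is not None
        and len(readings) >= 4                                   # at least four readings
        and all((b - a).days <= 84 for a, b in zip(dates, dates[1:]))  # no gap >12 weeks
        and (dates[-1] - dates[0]).days >= 90                    # assumed: 3 months = 90 days
    )
    if not sufficient:
        return "unknown"
    if criteria == "NICE":
        poor = (
            ttr < 65
            or sum(v > 5 for v in values) >= 2
            or any(v > 8 for v in values)
            or sum(v < 1.5 for v in values) >= 2
        )
    elif criteria == "ESC/US":
        poor = ttr < 70
    else:
        raise ValueError("criteria must be 'NICE' or 'ESC/US'")
    return "poor" if poor else "adequate"


# Illustrative window only (not study data): four readings, 40 days apart.
start = date(2017, 1, 1)
window = [(start + timedelta(days=40 * i), 2.5) for i in range(4)]
print(control_status(window, ttr=68.0, criteria="NICE"))    # adequate
print(control_status(window, ttr=68.0, criteria="ESC/US"))  # poor
```

The example shows how a window with a TTR between 65% and 70% is classified differently under the two sets of criteria, which is why more time is labelled 'poor' under the ESC/US threshold.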
An index date was assigned to each patient when all inclusion criteria were first met. The number of days a patient was in adequate and/or poor control was calculated to the end of 2017. Patients were censored at death or when an adverse event occurred (see below), including those adverse events occurring outside of periods of INR calculation or when lost to follow-up (end of primary care record).

Adverse events
Adverse events comprised (i) SSE and (ii) bleeding events (categorised as gastrointestinal, urinary, respiratory, intracranial, gynaecological, ocular, or miscellaneous bleeds in other organ systems) recorded in either the primary or secondary care records (see supplementary material online, Tables S1A-D for diagnostic codes).
Two cohorts were created to analyse the association between poor INR control and SSE events and bleeding events. Adverse events occurring during periods of INR calculation and within 84 days of the last warfarin prescription were included, and patients were classified as having poor or adequate INR control based on the preceding 6 months of INR data. Finally, events occurring during periods where it was not possible to calculate INR control due to insufficient INR readings were excluded from our analysis.

Medical history, demographic information, and prescriptions
Demographic and clinical data (reflecting standard stroke and bleeding risk classification, 20,21 and comorbidities of major organ systems) prior to the index date for each patient were identified. Age and deprivation quintile 22 were assigned at the index date. Heart failure, hypertension, vascular disease [defined as prior myocardial infarction (MI) or peripheral vascular disease (PVD) including peripheral artery disease and aortic plaque], prior stroke [including transient ischaemic attack (TIA)], diabetes, sex, and age were used to calculate the individual CHA2DS2-VASc score at the index date. 20
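For readers unfamiliar with the score, the following sketch applies the standard published CHA2DS2-VASc point weights to the components listed above. The function name and argument names are illustrative assumptions, not taken from the study code.

```python
def cha2ds2_vasc(age, female, heart_failure, hypertension,
                 diabetes, prior_stroke_or_tia, vascular_disease):
    """Return the CHA2DS2-VASc score (0-9) using the standard weights."""
    score = 0
    score += 1 if heart_failure else 0        # C: congestive heart failure
    score += 1 if hypertension else 0         # H: hypertension
    if age >= 75:                             # A2: age 75 years or older
        score += 2
    elif age >= 65:                           # A: age 65-74 years
        score += 1
    score += 1 if diabetes else 0             # D: diabetes mellitus
    score += 2 if prior_stroke_or_tia else 0  # S2: prior stroke/TIA
    score += 1 if vascular_disease else 0     # V: prior MI, PVD, or aortic plaque
    score += 1 if female else 0               # Sc: sex category (female)
    return score


# Illustrative patient: 80-year-old woman with hypertension and prior stroke.
print(cha2ds2_vasc(80, True, False, True, False, True, False))  # 6
```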

Statistical methods
Baseline characteristics of patients experiencing SSE or bleeding events during periods of INR calculation were compared to those without respective events using chi-squared and ANOVA tests as appropriate.
SSE and bleeding event rates were calculated during periods of adequate or poor INR control using each guideline's thresholds for INR control. Relationships between INR control, SSE and bleeding events were then calculated using NICE criteria. Since our data allowed estimation of times during which patients move between poor and adequate INR control status, we considered INR control as a time-dependent variable and estimated hazard ratios (HRs) representing the risk at any specific time point. Initial multivariable models used Cox regression to estimate the risk of SSE or bleed according to INR control status, adjusting for the baseline individual CHA2DS2-VASc score and deprivation quintile. Secondary analyses were then conducted using ESC/US guideline criteria for poor control, using the same statistical approaches. Further Cox regression models examined relationships between INR control (time-dependent) and SSE and bleeding outcomes, adjusting for relevant individual risk factors, including the components of the CHA2DS2-VASc score [age, sex, and the presence or absence of the following: heart failure, hypertension, diabetes mellitus, stroke (including TIA), and vascular disease (defined as prior MI, PVD, or aortic plaque)]. Analyses were performed using IBM SPSS v26 and R version 3.5.3.
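A time-dependent covariate such as INR control status is commonly supplied to Cox regression software in counting-process form, with one row per patient per interval of constant status. The sketch below shows one way such rows could be built; the function name `to_intervals`, the day-based time scale, and the input layout are assumptions for illustration and do not describe the authors' data pipeline.

```python
def to_intervals(status_changes, end_day, event_day=None):
    """Build (start, stop, poor, event) rows for one patient.

    status_changes: list of (day, status) sorted by day, where status is
        'adequate', 'poor', or 'unknown' (hypothetical layout).
    end_day: day of censoring (end of follow-up).
    event_day: day of the adverse event, or None if no event occurred.
    Periods of 'unknown' status contribute no rows, so an event that
    falls inside such a period is not counted.
    """
    stop_day = end_day if event_day is None else min(event_day, end_day)
    rows = []
    for i, (start, status) in enumerate(status_changes):
        if start >= stop_day:
            break
        if i + 1 < len(status_changes):
            next_start = status_changes[i + 1][0]
        else:
            next_start = stop_day
        stop = min(next_start, stop_day)
        if status == "unknown" or stop <= start:
            continue
        event = int(event_day is not None and stop == event_day)
        rows.append((start, stop, int(status == "poor"), event))
    return rows


# Illustrative patient only (not study data).
changes = [(0, "adequate"), (200, "poor"), (300, "unknown"), (400, "adequate")]
for row in to_intervals(changes, end_day=1000, event_day=450):
    print(row)
# (0, 200, 0, 0)
# (200, 300, 1, 0)
# (400, 450, 0, 1)
```

Rows in this shape correspond to the (start, stop, event) layout accepted by counting-process Cox models, with the `poor` flag entering as the time-dependent covariate.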

Missing data
Comparisons were made between those included in the final cohorts for analysis and (i) those with NVAF prescribed warfarin but with inadequate or no INR test results for analysis, and (ii) those with insufficient INR tests recorded in the primary care dataset to classify INR control prior to either an SSE or bleed. Finally, within the final cohort, comparisons were made between those with and without deprivation quintile data available. Differences in these characteristics between groups were summarised using chi-squared tests for categorical variables and independent t-tests for continuous variables.

Results
Over 4 million patient records were identified in the SAIL Databank during the study period; 124 324 patients had a diagnosis of AF and were aged over 18 at diagnosis, of whom 10 633 had a diagnosis of DVT, PE, or valvular AF prior to or within 6 months of entering the study. A total of 37 638 were prescribed warfarin with ≥6 months of INR data (Figure 1) (see supplementary material online, Table S2).
We excluded from the final analyses 1747 patients who had an SSE and 2603 who bled during the study period but had an inadequate number of INR results in the 6 months prior to the event to allow calculation of INR control. A total of 410 patients were censored during the study period due to receiving a diagnosis of valvular AF and a further 2024 patients were censored due to a DVT or PE.
A total of 35 891 patients had sufficient data to analyse associations between INR control and SSE, and 35 035 patients had sufficient data to be included in the bleeding analysis. Both cohorts had a mean follow up of 4.3 years, a mean TTR of 71.9%, and a mean CHA2DS2-VASc score of 3.5 (SD = 1.7). The percentage of time spent with poor INR control using the NICE criteria was 34.0% and using ESC/US criteria was 40.9%, for both SSE and bleeding outcome analyses.

SSE cohort
Over the study period, a total of 2802 SSE events occurred in 2422 patients. During periods where there were sufficient data to allow calculation of INR control, 1868 SSE events occurred in 1837 patients. Of these SSE events, 1650 were strokes and 218 were non-stroke systemic emboli.
Patients experiencing SSE during the study period tended to be older, have a higher CHA2DS2-VASc score, and have a higher prevalence of hypertension, prior ischaemic stroke and PVD (Table 1) than those who did not. Females were less likely to suffer SSE than males.

Estimates of the effect of INR control on risk of SSE
In univariable Cox analyses, the HR for SSE associated with poor INR control according to NICE criteria was 1.84 [95%CI 1.68-2.02, P < 0.001]; using the ESC/US criteria for poor control, the HR for SSE was 1.84 (95%CI 1.68-2.02).
In the first multivariable model, poor INR control according to NICE criteria was independently associated with an increased risk of SSE after adjustment for CHA2DS2-VASc score and deprivation quintile (Table 2). CHA2DS2-VASc score was also independently associated with SSE, after mutual adjustment, while there was no significant association with deprivation level.
Poor INR control according to NICE criteria was again associated with SSE independently of other individual risk factors in the second set of models (Table 3). After mutual adjustment, increasing age was also associated with SSE, as were diabetes, prior ischaemic stroke, prior bleeding events, hypertension, PVD, and female sex.
Very similar relationships were seen between poor INR control according to ESC/US criteria and SSE events in the multivariable models (see supplementary material online, Tables S3 & S4).

SSE event rate
The SSE event rate (per 100 patient years) was 1.3 (95%CI 1.2-1.4) in the overall population; with a rate of 1.0 (95%CI 0.9-1.1) during periods of adequate INR control rising to 1.8 (95%CI 1.7-1.9) during periods of poor INR control according to NICE criteria. Very similar event rates were seen in those meeting ESC/US criteria for poor INR control (see supplementary material online, Table S5).
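An event rate per 100 patient years is the number of events divided by the total follow-up time, scaled by 100. The sketch below computes such a rate with an approximate 95% confidence interval. The log-scale normal approximation for a Poisson count is an assumption chosen for illustration, since the interval method used in the study is not stated in this section, and the input numbers are hypothetical rather than study data.

```python
import math


def event_rate_per_100py(events, patient_years, z=1.96):
    """Return (rate, lower, upper) per 100 patient years.

    Assumes events follow a Poisson distribution and uses the
    approximation SE(log rate) = 1 / sqrt(events).
    """
    if events <= 0 or patient_years <= 0:
        raise ValueError("events and patient_years must be positive")
    rate = 100.0 * events / patient_years
    factor = math.exp(z / math.sqrt(events))
    return rate, rate / factor, rate * factor


# Hypothetical inputs: 100 events over 10 000 patient years.
rate, lower, upper = event_rate_per_100py(100, 10_000)
print(f"{rate:.2f} ({lower:.2f}-{upper:.2f})")  # 1.00 (0.82-1.22)
```

Because the interval width depends on the number of events rather than on the number of patients, rates based on several thousand events have narrow intervals.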

Bleeding cohort
Across the entire study period, a total of 7220 bleeds occurred in 6304 patients. During periods where there were sufficient readings to allow calculation of INR control, 5766 bleeds occurred in 5039 patients.

Figure 1 Inclusion criteria for study cohort.
Patients who had bled tended to be older, have a higher CHA2DS2-VASc score, and have a higher prevalence of hypertension, prior ischaemic stroke, ischaemic heart disease and a history of prior bleeding events (Table 4). Females were less likely to bleed, and deprivation index was also associated with bleeding risk in univariable analysis.

Estimates of the effect of INR control on risk of bleed
In univariable Cox analyses, the HR for bleeding associated with poor control according to NICE criteria was 1.43 [95%CI 1.35-1.51, P < 0.001]. Using the ESC/US criteria for poor INR control, the univariable HR for bleeding was 1.45 (95%CI 1.38-1.54).
In the first multivariable models, poor INR control (according to NICE criteria) was associated with an increased risk of bleeding (Table 5) after adjustment for CHA2DS2-VASc score and deprivation quintile. CHA2DS2-VASc score was also independently associated with bleeding events after mutual adjustment, but deprivation quintile was no longer associated.
Poor INR control according to NICE criteria was again associated with bleeding independently of other risk factors in the second multivariable model (Table 6). After mutual adjustment, increasing age was also associated with bleeding events, as were diabetes, prior ischaemic stroke, prior bleeding events, ischaemic heart disease, heart failure, and chronic kidney disease (stage 4+). Female sex was associated with fewer bleeds.
Very similar relationships were seen between poor INR control according to ESC/US criteria and bleeds in the multivariable models (see supplementary material online, Tables S3 & S4).

Bleeding event rate
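The bleeding event rate (per 100 patient years) was 3.9 in the overall population, with a rate of 3.4 (95%CI 3.3-3.5) during periods of adequate INR control rising to 4.8 (95%CI 4.6-5.0) during periods of poor INR control according to NICE criteria (see supplementary material online, Table S6).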

Discussion
This is the first real-world study from a national cohort that has assessed the association between INR control according to major clinical guideline criteria and both SSE and bleeding events from the clinical records of individual patients prescribed warfarin for NVAF. Evidence of poor INR control was present in over one-third of the evaluated monitoring period at a population level and independently associated with a significant increased risk of both SSE and bleeding events. Despite a greater proportion of time spent in 'poor' INR control with the ESC/US (40.9%) compared to the NICE criteria (34.0%), the relationships between SSE or bleeding event rates and poor INR control were very similar in magnitude with both guidelines.
The overall SSE and bleed rate in this study was 1.3 and 3.9 per 100 patient years, respectively. This was similar to rates recently reported in another real-world study of INR control in patients with AF, 12 as well as those reported in randomised controlled studies where VKA therapy was included as a comparator against the DOACs (SSE range 1.50-2.2% and bleed range 3.1-3.4% in VKA arms). [23][24][25][26] While the mean TTR in this study was 72% compared to 55-64% in randomized controlled trials (RCTs), the lower bleeding event rate observed in the RCTs may be explained by differences in definition of bleeding events, selective patient enrolment and enhanced observation of patients compared to real-world studies. [23][24][25][26] Differences in methods of calculating TTR, and the absence of reporting INR control assessed by very low or very high individual INRs, limits the comparisons that can be made between studies, and health care systems. Lastly, the comparison in bleeding event rate between studies is further limited by the lack of consistent reporting of bleeding events and severity.
This study evaluated the impact of multiple clinical and demographic factors as well as temporal INR control according to multiple criteria in one of the largest real-world studies of INR control in patients with NVAF. The use of a population-scale, individual-level rich linked anonymised data sources is a particular strength. The linked primary and secondary care data held by SAIL enable the investigation of a very large cohort of individuals longitudinally over a period of years and across multiple data sources, giving a much more complete picture of patient treatment, health, and clinical characteristics in a diverse and representative population than in previous studies.
Increasing stroke risk, assessed by the CHA2DS2-VASc score, was, as expected, associated with an increased risk of both SSE and bleeding, as were many individual characteristics commonly associated with stroke or bleeding. Increasing age (≥75 years) was associated with the highest risk of SSE and bleeding. Prior bleeding events were also strongly associated with further bleeding events and prior ischaemic stroke was associated with SSE, as demonstrated in previous studies. [27][28][29] We found that females had a lower likelihood of bleeds, but a slightly higher likelihood of SSE independently of INR control, in keeping with previous studies. 29,30 Notably, INR control was worse in females than males, as we demonstrated previously in this cohort. 15 It is therefore uncertain whether the sex imbalance in observed SSE outcomes could be diminished by improving INR control in women.
Surprisingly, we observed no independent association between deprivation and INR control, SSE or bleeding in this study. The data for this study were obtained from routine data sources following patient interactions within the Welsh National Health Service, where prescriptions and INR monitoring are free to patients at the point of delivery, potentially mitigating important financial access barriers to healthcare, as seen in more economically-disadvantaged individuals or populations. This should be an important consideration when comparing the findings of our study to other healthcare systems.
Deprivation index data were missing in ∼1.9% of patients, who were therefore excluded from the first multivariable models (see supplementary material online, Table S7A & B). These patients had a slightly lower prevalence of hypertension and a slightly higher prevalence of heart failure, but otherwise had very similar characteristics to the overall cohort. Importantly, all major comorbidities were well represented in the multivariable models and the inclusion of this small group would not be expected to have materially influenced the strong associations between poor INR control and adverse outcomes. We excluded patients who also had a history of valvular AF and/or VTE as they may have had 'individualised' INR targets, which may not have been identifiable in the SAIL Databank and may potentially have biased the study towards a greater number of patients with 'poor INR control'. Furthermore, our clinical experience suggests that these more complex patients are more often managed via specialist secondary care haematology-led anticoagulation services and their INR results may not have been available for analysis in this study.
In this study, we only evaluated patients prescribed warfarin during periods where their INR results were documented in the primary care record. Patients monitored in hospital clinics and those who were undertaking home INR testing (albeit rarely) are unlikely to have their individual INR data entered consistently in their primary care record. It was also not possible to identify periods of temporary discontinuation of warfarin for patients undergoing surgery, which, if recorded in the primary care record, may have resulted in periods of INR control recorded as out of range.
We identified a number of patients prescribed warfarin with at least 6 months of follow-up who had an inadequate number of INR results during the study period and were therefore excluded from the analyses. A further group was excluded because they had an inadequate number of INR results in the 6 months prior to the event to allow calculation of INR control (see supplementary material online, Table S8A & B). It cannot be determined why these patients had insufficient INR readings recorded in the respective period. However, these patients had a significantly higher prevalence of most SSE and bleeding risk factors, and it is likely that most of these patients were managed in secondary care, although it is also possible that they may not have been receiving warfarin nor having regular INR checks.
In this study, we identified the individual components of the CHA2DS2-VASc score as well as risk factors associated with SSE or bleeding at the index date of admission into the study, including at the time of diagnosis of AF for incident patients. It was beyond the scope of this study to recalculate the CHA2DS2-VASc score or identify new risk factors/comorbidities dynamically throughout the follow up period. It is unknown whether this may have added some incremental benefit or improved the accuracy in the associations between these variables and bleeding events. Regardless, the results in this large real-world population study are compelling; poor INR control and increasing stroke risk are independent markers of increased risk of both SSE and bleeding events. The HASBLED score was not calculated in this study for several reasons: poor INR control (a component of the HASBLED score) was measured independently; pathology results, alcohol and illicit drug use are less robustly documented in the WLGP datasets; and aspirin and non-steroidal anti-inflammatory drugs are frequently purchased without a prescription in the UK. Finally, the HASBLED score, unlike CHA2DS2-VASc, is at least partially modifiable, likely to change dynamically throughout the study period and does not provide significantly greater discrimination of bleeding risk than CHA2DS2-VASc at a population level. 31
The temporal calculation of INR control allowed us to assign patients to adequate or poor INR control at each INR result based on the previous 6 months of INR data; periods where there were insufficient INR results were excluded from INR calculation but allowed patients to re-enter the analysis when there were sufficient INR results to recalculate INR control. While this conservative approach had the potential to exclude periods of INR calculation in patients who had planned extended periods between INR tests, fewer than 1.4% of INR tests had an interval of greater than 84 days.
This approach provided greater surety that only periods of warfarin administration and monitoring were included in the calculation of INR control. Furthermore, the recalculation of INR control and assignment to adequate or poor INR control at each INR test allowed us to test the association between adverse events and INR control in the directly preceding period. Our approach provides a methodological improvement over previous studies that have reported the association between bleeding events and mean TTR, which may have been calculated over a period of years prior to an event. 9 We found similar event rates in those with evidence of inadequate control by ESC criteria as those with inadequate control by NICE criteria at the population level. Whilst one could argue that the ESC guidelines could be adopted by the UK due to their greater simplicity, where clinical computer software can reliably identify those with 'high' or 'low' INR levels, this could still help guide individual patient management approaches and we would not advocate to change the UK guidance. However, our findings suggest that where those data are not readily available, following the ESC guidelines is acceptable in UK practice.
This study was conducted prior to the COVID-19 pandemic when a greater proportion of patients were prescribed warfarin. Many healthcare providers have since moved patients to DOAC therapy which requires less intensive monitoring and patient contact. However, we are aware that across many healthcare systems large numbers of patients are still prescribed warfarin. The data in this study may provide valuable insight into selecting patients who are at the highest risk of bleeding with warfarin, who may require the greatest effort in improving INR and could be considered for alternative anticoagulation strategies where clinically appropriate.

Conclusion
Periods of poor INR control, as well as increasing stroke risk and specific comorbidities for stroke and bleeding, were associated with a considerable increase in the risk of SSE and bleeding events in warfarin treated patients with NVAF. The potential to reduce these adverse events through improvement in INR control at a population level should be closely considered to help improve patient outcomes. For individuals, a detailed risk assessment, considering factors leading to poor INR control and comorbidities that increase SSE and bleeding risk, remains essential. However, it is clear that improved measures to optimise effectiveness of anticoagulation are likely to improve clinical outcomes.

Lead author biography
Dr Daniel E. Harris has over 20 years' experience in clinical pharmacy practice.
He has recently been appointed as a clinical academic at Hywel Dda University Health Board, Wales, and as an Honorary Associate Professor in Pharmacoepidemiology at Swansea University. His clinical and scientific interests are focused on cardiovascular risk prevention and management. He is currently delivering a research programme using retrospective population healthcare data to inform and test the delivery of optimal risk factor management of patients with or at high risk of cardiovascular disease.

Data availability
The data used in this study are available in the SAIL Databank at Swansea University, Swansea, UK, but as restrictions apply they are not publicly available. All proposals to use SAIL data are subject to review by an independent Information Governance Review Panel (IGRP). Before any data can be accessed, approval must be given by the IGRP. The IGRP gives careful consideration to each project to ensure proper and appropriate use of SAIL data. When access has been granted, it is gained through a privacy protecting safe haven and remote access system referred to as the SAIL Gateway. SAIL has established an application process to be followed by anyone who would like to access data via SAIL at https://www.saildatabank.com/application-process.

Supplementary material
Supplementary material is available at European Heart Journal Open online.