Case-ascertainment of acute myocardial infarction hospitalizations in cancer patients: a cohort study using English linked electronic health data

Abstract Aims To assess the recording and accuracy of acute myocardial infarction (AMI) hospital admissions between two electronic health record databases within an English cancer population over time and understand the factors that affect case-ascertainment. Methods and results We identified 112 502 hospital admissions for AMI in England 2010–2017 from the Myocardial Ischaemia National Audit Project (MINAP) disease registry and hospital episode statistics (HES) for 95 509 patients with a previous cancer diagnosis up to 15 years prior to admission. Cancer diagnoses were identified from the National Cancer Registration Dataset (NCRD). We calculated the percentage of AMI admissions captured by each source and examined patient characteristics associated with source of ascertainment. Survival analysis assessed whether differences in survival between case-ascertainment sources could be explained by patient characteristics. A total of 57 265 (50.9%) AMI admissions in patients with a prior diagnosis of cancer were captured in both MINAP and HES. Patients captured in both sources were younger, more likely to have ST-segment elevation myocardial infarction and had better prognosis, with lower mortality rates up to 9 years after AMI admission compared with patients captured in only one source. The percentage of admissions captured in both data sources improved over time. Cancer characteristics (site, stage, and grade) had little effect on how AMI was captured. Conclusion MINAP and HES define different populations of patients with AMI. However, cancer characteristics do not substantially impact on case-ascertainment. These findings support a strategy of using multiple linked data sources for observational cardio-oncological research into AMI.


Introduction
Multimorbid patients make up the majority of hospital admissions and account for significant secondary care healthcare costs but are frequently excluded from clinical trials. [1][2][3] For example, clinical trials on hospital management of acute myocardial infarction (AMI) may exclude cancer patients. This is of particular concern because cardiovascular issues, including cardiotoxicity, are common in cancer patients. 4 The identification of AMI outcomes using routinely collected electronic health record databases is a promising approach that provides a scalable means for large 'big data' studies that limits both research costs and patient inconvenience. 5 However, the recording of AMI in databases amongst patients with a previous history of cancer has not been explored.
The Myocardial Ischaemia National Audit Project (MINAP) is a large clinical audit with detailed, patient-level data on patients admitted to hospitals in England, Wales, and Northern Ireland with AMI since 2000. 6,7 It is apparent, however, that some AMI hospitalizations are not captured by the registry. Previous researchers studying MINAP records prior to 2009 found that only around half of patients with at least one record of non-fatal AMI were captured in MINAP when compared with AMI capture in hospital episode statistics (HES) data (routinely collected secondary care data coded by non-clinical coding clerks) and primary care data from the Clinical Practice Research Datalink (CPRD). 8 This may partly be due to incomplete hospital-level case-ascertainment but also likely arises because MINAP is targeted primarily to capture AMI caused by atherothrombotic coronary artery disease or Type I AMI. 9 Because MINAP is an audit predominantly recorded by cardiology services, patients whose AMI care is primarily supported by another specialty (e.g. cancer or palliative care) may not be referred to cardiology services and as a result not be captured by MINAP. This may be more likely for patients with more advanced disease. On the other hand, HES is a clinical coding database and ascertainment of AMI, particularly through recording in the first diagnostic position, could be impacted by the presence of another dominant disease at the time of coding (e.g. cancer). This might be particularly the case for more advanced cancer (higher stage and grade).
In this article, we sought to investigate ascertainment of hospitalized AMI in an English cancer population utilizing data from the Virtual Cardio-Oncology Research Initiative (VICORI), which is a research platform that links existing national English cancer registry and cardiovascular audit data through a unique identifier. 10 Through linkage of the National Cancer Registration Dataset (NCRD) 11 with both MINAP and HES we investigate, firstly, whether individual or multiple data resources are required for the ascertainment of AMI in cancer patients; secondly, whether cancer characteristics (site, stage, and grade) impact on ascertainment of AMI; and thirdly, the differences in characteristics and survival of cancer patients with AMI captured by MINAP, HES, or both data sources.

Study population
AMI hospital admissions were identified from English MINAP and/or HES Admitted Patient Care (APC) datasets. The study population included all AMI admissions to hospitals in England between 1 January 2010 and 31 December 2017 in patients aged ≥40 who had a previous diagnosis of cancer (up to 15 years prior to the AMI admission) recorded in the NCRD held by the National Cancer Registration and Analysis Service (NCRAS). 11 Cancer diagnoses were identified using ICD-10 codes C00-D48. A patient could have more than one AMI admission included in the study if they occurred on different days.
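The inclusion criteria above can be written as a small eligibility check. This is only an illustrative sketch: the function names (`is_cancer_code`, `is_eligible`) and the record layout are hypothetical, and the handling of the boundaries (a cancer diagnosis on the admission day, or exactly 15 years earlier, counts as eligible) is an assumption that the text does not specify.

```python
from datetime import date

STUDY_START = date(2010, 1, 1)
STUDY_END = date(2017, 12, 31)


def is_cancer_code(icd10: str) -> bool:
    """True for ICD-10 codes in the range C00-D48."""
    code = icd10.strip().upper().replace(".", "")
    if len(code) < 3 or not code[1:3].isdigit():
        return False
    letter, number = code[0], int(code[1:3])
    return letter == "C" or (letter == "D" and number <= 48)


def age_in_years(birth: date, on: date) -> int:
    """Completed years of age on a given date."""
    years = on.year - birth.year
    if (on.month, on.day) < (birth.month, birth.day):
        years -= 1
    return years


def is_eligible(birth: date, cancer_diagnosis: date,
                cancer_icd10: str, ami_admission: date) -> bool:
    """Apply the study-period, age, and prior-cancer criteria."""
    in_period = STUDY_START <= ami_admission <= STUDY_END
    old_enough = age_in_years(birth, ami_admission) >= 40
    # Assumed boundary rule: diagnosis no later than the admission and
    # no earlier than the same calendar day 15 years before it.
    earliest = (ami_admission.year - 15, ami_admission.month, ami_admission.day)
    diagnosed = (cancer_diagnosis.year, cancer_diagnosis.month, cancer_diagnosis.day)
    prior_cancer = (is_cancer_code(cancer_icd10)
                    and cancer_diagnosis <= ami_admission
                    and diagnosed >= earliest)
    return in_period and old_enough and prior_cancer
```

Comparing (year, month, day) tuples avoids constructing an invalid date when the admission falls on 29 February.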
Linkage between NCRD and HES APC was performed using matching criteria based on NHS number, date of birth, sex and postcode, with any matches ranked 1-5 included (see Supplementary material online, Appendix A). Linkage between NCRD and MINAP meanwhile required an exact match by both NHS number and date of birth.
AMI hospital admissions in MINAP were identified based on discharge diagnosis, cardiovascular biomarkers, and electrocardiographic findings 8,9 (Supplementary material online, Appendix B). MINAP contains detailed diagnostic information from electrocardiograms and cardiac biomarkers, not available in other routinely collected datasets, which allows accurate phenotyping of acute myocardial infarction (AMI) into ST-segment elevation myocardial infarction (STEMI) and non-ST-segment elevation myocardial infarction (NSTEMI). Only MINAP admissions that could be classified as STEMI or NSTEMI were included in our study. AMI admissions in HES were identified by ICD-10 codes I21-I23 as the primary diagnosis for the first episode (continuous period of care under one consultant) of a spell (admission). In HES, STEMI and NSTEMI admissions were identified using ICD-10 codes (Supplementary material online, Appendix C). Admissions captured in both MINAP and HES were assigned the phenotype indicated in MINAP. We performed a sensitivity analysis to assess the influence on case-ascertainment results of AMI codes in secondary diagnosis positions or in subsequent episodes.
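The HES case definition, and the broader definitions used in the sensitivity analysis, can be sketched as a single function with two switches. The data layout here is an assumption made for illustration: a spell is a list of episodes in order, and each episode is a list of ICD-10 codes in diagnostic-position order. The names `is_ami_code` and `spell_has_ami` are hypothetical.

```python
AMI_PREFIXES = ("I21", "I22", "I23")


def is_ami_code(icd10: str) -> bool:
    """True for ICD-10 codes I21-I23, with or without a fourth character."""
    return icd10.strip().upper().startswith(AMI_PREFIXES)


def spell_has_ami(episodes, any_position=False, any_episode=False):
    """Flag a spell as an AMI admission.

    With both switches off this is the primary definition: an AMI code
    in the primary diagnosis position of the first episode.
    """
    considered = episodes if any_episode else episodes[:1]
    for diagnoses in considered:
        codes = diagnoses if any_position else diagnoses[:1]
        if any(is_ami_code(code) for code in codes):
            return True
    return False
```

For example, a spell recorded as `[["I25.1", "I21.4"], ["I21.4"]]` is not counted under the primary definition, but is counted once secondary positions in the first episode (`any_position=True`) or the primary position of later episodes (`any_episode=True`) are allowed.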
Follow-up for mortality was through linkage with the Office for National Statistics (ONS) Mortality Registry. The last date of follow-up for all patients was 31 December 2018.

Matching AMI admissions from MINAP and HES
AMI admissions were considered to represent a match or part of the same acute episode if the admission date in MINAP corresponded to an AMI admission in HES within 30 days. MINAP admissions were considered the gold standard, therefore, if multiple admissions in HES were within 30 days of a MINAP admission date the HES admission with the closest date was selected as the single matched event, to prevent double counting. MINAP admissions without a HES record within 30 days were classified as 'MINAP only' and HES admissions without a MINAP admission within 30 days were classified as 'HES only'. For matched events, the MINAP admission date was used. We performed a sensitivity analysis to determine case-ascertainment if the match window between HES and MINAP admissions was increased to 60 days, 90 days, and no time restriction.
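The matching rule can be sketched for a single patient as follows. This is an illustration, not the study code. Two details are assumptions that the text leaves open: matching is one-to-one, processed in MINAP date order, and a surplus HES admission that falls within the window of any MINAP admission is treated as part of that acute episode and not counted separately. The function name `match_admissions` is hypothetical.

```python
from datetime import date


def match_admissions(minap_dates, hes_dates, window_days=30):
    """Classify one patient's AMI admissions by ascertainment source.

    Returns (date, source) pairs, where source is 'both', 'MINAP only'
    or 'HES only'. Matched events carry the MINAP admission date.
    window_days=None removes the time restriction.
    """
    def within(h, m):
        return window_days is None or abs((h - m).days) <= window_days

    unused_hes = sorted(hes_dates)
    results = []
    for m in sorted(minap_dates):
        candidates = [h for h in unused_hes if within(h, m)]
        if candidates:
            # Keep only the closest HES admission to prevent double counting.
            closest = min(candidates, key=lambda h: abs((h - m).days))
            unused_hes.remove(closest)
            results.append((m, "both"))
        else:
            results.append((m, "MINAP only"))
    for h in unused_hes:
        # Assumption: HES admissions near a MINAP admission are not
        # counted again; only isolated ones are 'HES only'.
        if not any(within(h, m) for m in minap_dates):
            results.append((h, "HES only"))
    return results
```

Passing `window_days=60`, `window_days=90`, or `window_days=None` corresponds to the wider match windows examined in the sensitivity analysis.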

Covariates
Charlson comorbidity score was calculated from comorbidities in inpatient diagnostic HES fields within 1 month to 15 years before the first AMI admission using previously defined coding algorithms. 12,13 Index of multiple deprivation income quintiles, a relative measure of deprivation at small area level, was assigned based on the postcode of residence closest to the cancer diagnosis date. Ethnicity was categorized as White, Mixed, Asian, Black, other, or unknown. Age at time of AMI admission was categorized as 40-59, 60-69, 70-79, 80-89, 90+ years. Geographical region reflects the English region based on postcode of residence when the cancer was diagnosed, categorized as London, Midlands and East, North, South. Other cancer characteristics were obtained from the NCRD.

Statistical analysis
Patient characteristics for first AMI admission were reported for all patients and stratified by case-ascertainment source. Pearson's Chi-squared tests (categorical) and Kruskal-Wallis tests (continuous) were performed to examine the association between patient characteristics and case-ascertainment source for first admissions. The percentage of AMI admissions captured by each source was displayed in stacked bar charts.
Annual ICD-10 coding trends for AMI admissions in HES were examined over time. Furthermore, we examined how ICD-10 codes for HES admissions correlated with case-ascertainment in MINAP. For AMI admissions captured in MINAP only we investigated whether a HES admission with a non-myocardial infarction (MI) primary diagnosis occurred within 30 days. The non-MI primary diagnosis ICD-10 chapters were reported, and a more detailed examination was performed for chapters that commonly featured in the admission data.
Survival analyses were performed based on the time from the first AMI admission for each patient. Kaplan-Meier survival estimates were presented for all patients, and separately by first AMI phenotype. Flexible parametric survival models were implemented to assess if differences in survival between case-ascertainment sources could be explained by patient characteristics. A restricted cubic spline was used to model the baseline cumulative hazard of mortality, with 4 degrees of freedom selected based on Akaike's information criterion. Case-ascertainment source was included in the model as a three-level categorical covariate interacted with time to allow for non-proportional hazards between the levels. This was achieved by including two additional restricted cubic splines for levels 2 and 3 of the covariate. All other covariates included in the model were considered as main effects only. For the survival analysis, Mixed, Asian, Black, and Other ethnic groups were combined into non-White. Standardized survival curves were obtained by case-ascertainment source, standardizing across the covariate distribution for all patients. Analyses included the following covariates: age group; sex; ethnicity; comorbidity score; AMI phenotype; year of AMI admission; deprivation; geographical region; years between cancer diagnosis and AMI admission; number of previous tumours; and cancer subtype categorized as invasive or non-invasive/non-melanoma skin cancer. Analyses were repeated stratified by AMI phenotype, where standardization was across the covariate distribution for patients in the specified strata. All analyses were conducted on records with no missing data in the relevant covariates (a complete case analysis).
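As a reminder of what the unadjusted Kaplan-Meier estimates represent, the product-limit estimator can be sketched in a few lines. This covers only the descriptive Kaplan-Meier step. It does not implement the flexible parametric models or the standardization, which need dedicated statistical software. The function name `kaplan_meier` and the input layout (follow-up times with a 1/0 death indicator) are assumptions for illustration.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each patient (any consistent unit)
    events : 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival) pairs, one per distinct death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        n_at_risk -= leaving
    return curve
```

Running the function separately on patients from each case-ascertainment source would give one unadjusted curve per source. Differences between such curves can still reflect age, phenotype, and comorbidity, which is why the adjusted, standardized curves are also reported.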

Results
Between 2010 and 2017, we identified 112 502 hospital admissions for AMI across MINAP and HES for 95 509 patients who had had a previous cancer diagnosis up to 15 years prior to the admission. There was an average of nearly 4 years between cancer diagnosis and first AMI admission, with 8364 (7.4%) of cancer diagnoses occurring within 3 months prior to the AMI admission. Just over half of all AMI admissions were captured in both MINAP and HES, 57 265 (50.9%), with a further 26 104 (23.2%) ascertained only in MINAP and 29 133 (25.9%) ascertained only in HES (Figure 1).
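The headline percentages follow directly from the three counts, as the short check below shows. The per-source totals at the end (any capture in MINAP, any capture in HES) are derived here by simple addition. They are not figures quoted in the text.

```python
# Admission counts by case-ascertainment source, as reported above.
counts = {"both": 57265, "MINAP only": 26104, "HES only": 29133}

total = sum(counts.values())
shares = {source: round(100 * n / total, 1) for source, n in counts.items()}

# Derived: share of all admissions seen by each source at all.
minap_any = round(100 * (counts["both"] + counts["MINAP only"]) / total, 1)
hes_any = round(100 * (counts["both"] + counts["HES only"]) / total, 1)

print(total)      # 112502
print(shares)     # {'both': 50.9, 'MINAP only': 23.2, 'HES only': 25.9}
print(minap_any)  # 74.1
print(hes_any)    # 76.8
```

On these counts, neither source alone identifies more than about three-quarters of the admissions found by the two sources combined.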

Factors affecting case-ascertainment
Compared with patients whose first AMI admission was captured only in MINAP, HES only patients had a higher median age at admission (1.8 years older); were more likely to be female, have NSTEMI, reside in London, Midlands and East, or South; and had worse 30-day survival (P < 0.001 for all, Table 1). Patients captured in both MINAP and HES had a longer median time between cancer diagnosis and AMI admission and fewer primary tumours. Other cancer characteristics were similar between all three case-ascertainment sources.
Case-ascertainment in MINAP decreased markedly for patients aged 80 or above, with MINAP failing to capture over a quarter of octogenarians and over a third of nonagenarians (Figure 2, Supplementary material online, Table S1). Only 62% of admissions for nonagenarians with NSTEMI were captured in MINAP (Supplementary material online, Figure S1). MINAP captured 84% of STEMI hospital admissions but only 71% of NSTEMI admissions in the cohort. The percentage of AMI admissions captured in both data sources improved over time, rising from 45% of admissions captured in 2010 to 56% in 2017, whilst the percentage of cases captured in HES only remained stable. This improvement in ascertainment was most evident for STEMI admissions (Supplementary material online, Figure S1). Patients with multiple comorbidities were less likely to be captured in MINAP (Figure 2). Case-ascertainment percentages were generally similar across cancer site, stage, and grade of disease, though the percentage of AMI captured in both MINAP and HES combined decreased slightly for patients with more tumours, late stage disease or those diagnosed more recently (45% ascertainment of admissions with 4 or more previous tumours, 46% of admissions with a previous stage 4 tumour, and 48% of admissions with a cancer diagnosis in the previous year) (Figure 3). Figure S2).

Mortality after AMI admission
Significant differences in survival remained between patients ascertained by the different sources after adjustment. Survival estimates, standardized to the demographics of the full AMI population, were lower for patients whose first AMI admission was captured in HES only, throughout follow-up (Figure 4, Supplementary material online, Figure S3). Table S3). I23 codes, indicating 'complications from MI', were rarely used across all years. AMI admissions captured in HES only were more likely to have the 'site unspecified' code (47.4% coded as I21.9) compared to admissions captured by both MINAP and HES (34.8% coded as I21.9) (Supplementary material online, Table S4).

Further examination of MINAP-only admissions
Of 26 104 admissions captured only in MINAP the majority (25 539, 97.8%) could be matched to a HES admission with a non-MI primary diagnosis within 30 days. The most frequent ICD-10 chapters were for diseases of the circulatory system (I00-I99) (13 122, 51.4% of matched admissions) followed by symptoms not elsewhere classified (R00-R99) (5389, 21.1% of matched admissions) (Supplementary material online, Table S5a). Specifically, chronic ischaemic heart disease (I25) and angina pectoris (I20) were the most common circulatory-related primary diagnoses, while pain in throat and chest (R07) was the most common symptom-related primary diagnosis (Supplementary material online, Table S5b). These accounted for 15.3%, 15.1%, and 14.4% of matched admissions, respectively.
In a separate analysis of 26 104 admissions captured only in MINAP, the majority (15 118, 57.9%) could be matched to a HES admission with an AMI diagnosis in the second diagnosis position or higher and/or the second episode or higher within 30 days (Supplementary material online, Table S6).

Sensitivity analyses
Widening the time window in which admissions were deemed to match between data sources made minimal difference to the case-ascertainment percentages, with admissions captured in both MINAP and HES increasing to 52.1% if a 90-day window either side of the MINAP admission was used (Supplementary material online, Table S7). Additionally, broadening the criteria from which AMI admissions were identified from HES resulted in admissions captured by both MINAP and HES increasing to 54.1% if secondary diagnoses in the first episode were also considered, and to 57.3% if AMI captured in any episode of the spell in the primary diagnosis position was considered (Supplementary material online, Table S8). Using HES records that captured AMI in any diagnostic position of any episode resulted in a greater number of MINAP records being captured in HES, but a lower case-ascertainment percentage overall given the greater number of HES only cases also identified.

Discussion
We present the first investigation of case-ascertainment for AMI in a large national observational linked dataset of cardio-oncology patients from VICORI. Firstly, we describe a large population of more than 95 000 patients hospitalized with AMI who have a prior cancer diagnosis. Secondly, overlap between MINAP and HES capture of AMI is incomplete and both data sources are needed for a full understanding of hospitalized AMI in a cancer population. Thirdly, episodes ascertained in MINAP only, HES only and in both MINAP and HES identified different types of AMI patients with markedly differing prognoses. Finally, cancer characteristics (site, stage, and grade) had little effect on how AMI was captured. These findings support a strategy of using multiple linked data sources for observational cardio-oncological research into AMI.
We report that, whilst the overlap between HES and MINAP coding in a cancer population has improved over time, there remain important AMI populations who can be identified from one or another dataset only. Overall, 51% of AMI admissions in cancer patients were captured in both MINAP and HES, a slight improvement on the 46% captured in the same datasets reported by Herrett et al., 8 who first demonstrated from 2003 to 2009 data that MINAP and HES (and the primary care data source CPRD-Clinical Practice Research Datalink) defined incompletely overlapping AMI populations. We have confirmed that this issue remains pertinent in a more contemporary dataset that is over six times larger, and specifically is of relevance to cancer patients. Furthermore, differences in ascertainment source identified distinct AMI populations. Patients captured in both MINAP and HES were on average the youngest, most likely to be male, most likely to have a STEMI presentation, and had the lowest prevalence of comorbidity, lowest cancer burden, and the lowest 30-day mortality. MINAP only cases were on average older, more likely to be female, more likely to have an NSTEMI presentation, more comorbid and had a higher 30-day mortality, whereas cases ascertained in HES only were the oldest, most likely to be female, most likely to have NSTEMI, most comorbid and had the highest 30-day mortality. Importantly, the large differences in survival persisted even after adjustment for patient characteristics.
The likely explanation lies in the different approach and aims of data collection for MINAP and HES. MINAP is a disease-specific audit whose primary aim is to improve the quality of specialist cardiac care for the management of acute coronary syndromes (Type I AMI 9). Ascertainment in MINAP is therefore at its best for STEMI but will be reduced for patients with AMI cared for exclusively by non-cardiac specialists, such as those being managed palliatively or those who have AMI (particularly NSTEMI) in the context of another significant diagnosis or comorbidity. It will also be lower for Type II AMI, which is not the primary focus of MINAP. HES is derived from the coding of clinical episodes and aims to capture all comorbidity for England's National Health Service administrative and funding purposes. HES is less sensitive to hospital specialty but also less specific to refined definitions. It will therefore potentially capture AMI occurring in non-cardiac care hospital locations but also encompass more Type II AMI. It is interesting, for example, to note that the majority of AMI cases ascertained in only MINAP had a corresponding coded HES episode but often with a less specific or non-AMI cardiovascular disease code. It is possible that variations in the permissiveness of coding for AMI, particularly in HES, also account for the regional differences noted in ascertainment between MINAP and HES. It is interesting that patients captured in both MINAP and HES had the lowest risk profile and the lowest mortality. It may be that this group is the simplest to identify as AMI and therefore the least likely to be differentially coded in HES or not included in MINAP.
Taken together these findings strongly support the use of both HES and MINAP data for the ascertainment of AMI in cancer patients. In addition, key cancer characteristics such as site, stage, and grade had a low impact on AMI ascertainment suggesting audit recording and coding of AMI is similar across the cancer population. This implies that valid comparisons can be made between different cancer patients (across the full spectrum of disease) within this linked dataset. These findings suggest this novel linked data resource VICORI can identify a large and representative cardio-oncological cohort enabling robust statistical power coupled with data on a variety of relevant risk factors and comorbidities. Further investigation of cancer treatment cardiotoxicity leading to AMI using the VICORI dataset is necessary.

Strengths and limitations
This study is subject to a number of limitations. Firstly, we studied only secondary care data sources, and therefore do not have a complete picture of AMI case-ascertainment. Those recorded only in primary care records could make up around one-fifth of non-fatal AMI, 8 and it has been found that about a half of all fatal AMIs, as recorded on a death certificate, do not have a hospital admission within the preceding 28 days. 14 Secondly, we could not compare case-ascertainment directly with a non-cancer population. Whilst the percentage of AMI captured in both MINAP and HES of 51% is similar to a previous population-based estimate, 8 there is no guarantee that these results can be generalized to other diseases, with recent investigations in a population with mild-severe chronic kidney disease or those with known risk factors for kidney dysfunction showing a far lower percentage captured by both data sources (23%, Bidulka et al., manuscript under review). Thirdly, AMI phenotype was determined differently in MINAP and HES. It is possible that this could have resulted in misclassification, particularly in HES as only ICD-10 codes were used to identify phenotype. Finally, the results could be affected by residual confounding. Although we present predominantly descriptive results, we have attempted to understand survival differences by case-ascertainment source through adjustment and standardization. Given that large survival differences remained after adjustment we suspect residual confounding, perhaps via lifestyle,

Data sharing
Patient-level electronic health records obtained through VICORI can only be obtained by successfully applying for access to linked VICORI data by contacting vicori@le.ac.uk. An application for data access is subject to approval of a project proposal, analysis plan, and data request by the VICORI Project Review Panel. If approved, a formal application is made to the Office for Data Release at Public Health England. 15 The authors will share programming code and aggregate statistics if requested.

Transparency statement
The lead authors affirm that this manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned have been explained.

Ethics approval
This study was reviewed and approved by the VICORI Consortium Project Review Panel and the National Cancer Registration and Analysis Service Project Review Panel. The VICORI research programme has received favourable ethical opinion from the North East-Newcastle & North Tyneside 2 Research Ethics Committee (REC reference 18/NE/0123).

Patient and public involvement
Our patient representatives were co-applicants on the British Heart Foundation and Cancer Research UK grant which funds the Virtual Cardio-oncology Research Initiative (VICORI). We have therefore maintained lay oversight from study conception onwards. Our patient representatives have direct experience of heart disease, cancer, or both conditions. Their insights have provided the study team with information on the experience of patients with cancer and heart disease, guiding the framing of the key questions which form the basis of the VICORI programme. The lead patient representatives attend the study management group meetings and provide guidance on study design and prioritization of research questions. They also help us to ensure study information and findings are disseminated, available, and accessible to patients and the public.

Dissemination to participants and related patient and public communities
We will disseminate this work through presentations at scientific meetings and conferences. We will issue a press release on this work and engage with media and social media outlets as relevant. A lay summary of this paper will be prepared and shared on the VICORI website.