Impact of Potential Case Misclassification by Administrative Diagnostic Codes on Outcome Assessment of Observational Study for People Who Inject Drugs

Abstract Introduction Initiation of medications for opioid use disorder (MOUD) within the hospital setting may improve outcomes for people who inject drugs (PWID) hospitalized because of an infection. Many studies used International Classification of Diseases (ICD) codes to identify PWID, although these may be misclassified and thus, inaccurate. We hypothesized that bias from misclassification of PWID using ICD codes may impact analyses of MOUD outcomes. Methods We analyzed a cohort of 36 868 cases of patients diagnosed with Staphylococcus aureus bacteremia at 124 US Veterans Health Administration hospitals between 2003 and 2014. To identify PWID, we implemented an ICD code–based algorithm and a natural language processing (NLP) algorithm for classification of admission notes. We analyzed outcomes of prescribing MOUD as an inpatient using both approaches. Our primary outcome was 365-day all-cause mortality. We fit mixed-effects Cox regression models with receipt or not of MOUD during the index hospitalization as the primary predictor and 365-day mortality as the outcome. Results NLP identified 2389 cases as PWID, whereas ICD codes identified 6804 cases as PWID. In the cohort identified by NLP, receipt of inpatient MOUD was associated with a protective effect on 365-day survival (adjusted hazard ratio, 0.48; 95% confidence interval, .29–.81; P < .01) compared with those not receiving MOUD. There was no significant effect of MOUD receipt in the cohort identified by ICD codes (adjusted hazard ratio, 1.00; 95% confidence interval, .77–1.30; P = .99). Conclusions MOUD was protective of all-cause mortality when NLP was used to identify PWID, but not significant when ICD codes were used to identify the analytic subjects.

Medications for opioid use disorder (MOUDs), formulations of methadone, buprenorphine, and naltrexone are evidence-based interventions to improve health outcomes in people who inject drugs (PWIDs) and have opioid use disorder (OUD) [1].PWIDs may inject multiple substances in isolation or in combination that includes opioids, methamphetamines, and cocaine, with opioids being the most frequently injected drug reported [2,3].PWIDs are at risk for bacterial infections that lead to hospitalizations such as skin and soft-tissue infections, endocarditis, and osteomyelitis.As such, there has been a call to initiate MOUD within hospital settings for PWIDs [4].MOUD is important in treating PWIDs with OUD in hospital as those hospitalized for infectious diseases frequently leave when opioid withdrawal and cravings are not treated [5].Yet, providers and hospitals may delay MOUD initiation until other medical issues are stabilized and the patient is seen in an outpatient setting either at opioid treatment programs or in settings that offer office-based addiction treatment with buprenorphine [6].
Research evaluating outcomes (eg, mortality, readmissions, patient-directed discharges) related to hospital-based initiation of MOUD in retrospective observational cohorts has been mixed [7][8][9][10][11][12][13][14][15][16][17][18][19][20][21].Although multiple studies have reported improved health-related outcomes after starting MOUD within the hospital [11,13,21], multiple others have not.For example, 1 study using International Classification of Diseases, 10th edition (ICD-10), codes to define their analytic cohort showed that the initiation of MOUD in patients hospitalized with endocarditis had no effect on patient-directed discharges (also known as leaving against medical advice) and 30-day readmissions [14].Ray et al found no significant difference in length of stay, readmissions, drug use at readmission, or reoperations between groups with and without MOUD [16].O'Kane et al showed no statistical difference in all-cause 180-day readmission or mortality when prescribing buprenorphine at discharge [12].Traver et al showed no association with the completion of Open Forum Infectious Diseases M A J O R A R T I C L E outpatient parenteral antimicrobial therapy [20].In this study, the authors acknowledged the possible selection bias in the use of ICD codes in potentially missing patients with incomplete documentation as well as the inclusion of patients who use prescription opioids and do not have a substance use disorder.
A major barrier to understanding the impact of MOUD on infectious diseases outcomes are imperfect systems of cohort identification [15,[22][23][24][25][26].Many of these recent studies suggest that there is no viable combination of ICD code algorithms effective in identifying PWIDs.The lack of uniformity in ICD-based algorithms in identification of PWIDs may lead to misclassification of the population and inaccurate conclusions.Misclassification may be due to several factors: there is no specific ICD code for injection drug use; the subjects may not have injection drug use; may not have OUD; codes may be carried over from prior admission; and the disease and use of opioids may no longer be active [27].For example, Marks et al showed that ICD codes misclassified 56.4% of PWIDs in a cohort of 114 patients with infective endocarditis.When analyzing the benefits of MOUD in relation to patientdirected discharges and all-cause mortality, the use of ICD codes suggested no benefit of MOUD.However, when using a manual chart review, there was a protective effect in both outcomes [28].
Recognizing the methodology gaps, our team published an algorithm based on natural language processing (NLP) and machine learning to identify PWIDs in electronic health records [26].In that study, we showed that our NLP/machine learning algorithm, which was validated by manual chart review, had a sensitivity of 92.6% and a specificity of 95.4% in correctly identifying PWIDs, and the ICD-code algorithm with maximum accuracy had a sensitivity of 92.0% and a specificity of 52.4%.NLP, a discipline within computer science, addresses the comprehension, interpretation, and production of oral or written communication.Its applications are diverse across numerous research tasks.In our previous study, we trained an algorithm to categorize unstructured text data, such as admission notes, as pertaining to PWIDs or not related to PWIDs.Building off our initial analysis, our objective was to evaluate the differences in the clinical effectiveness of MOUD on outcomes in PWID admitted to Veterans Health Administration (VHA) hospitals because of Staphylococcus aureus bacteremia when identifying a cohort of PWIDs by ICD codes or by using natural language processing.We hypothesized that the way a cohort of PWID was identified was associated with MOUD effectiveness.

Data
We used a cohort of 36

PWID Identification
Based on our previous work [26], we used the ICD code-based and NLP-based algorithm with the maximum classification accuracies to subset the larger cohort into cases identified as PWIDs.Our ICD code-based algorithm included codes for hepatitis C and substance use disorders (Table 1) because these were shown to have the best diagnostic metrics in our prior analysis [26].Our NLP algorithm used text preprocessing, regular expressions, NegEx to remove negation, term frequency-inverse document frequency for numeric representation, and a random forest model for classification of admission notes.The algorithm was designed to identify cases related to injection drug use regardless of substance of use.

Covariates and Primary Predictor
For each case in the cohorts, we extracted demographic variables (age, gender, race, ethnicity, homelessness), date and hospital of the index hospitalization, discharge status (patient-directed discharge or other), readmission, blood culture data (methicillin-  [30].ICD-9 codes were used to identify chronic comorbidities based on algorithms from the Centers for Medicare & Medicaid Services' Chronic Conditions Data Warehouse [31].ICD-9 codes were used to identify concomitant infectious diseases (human immunodeficiency virus [HIV], hepatitis B and C), psychiatric disorders (anxiety, bipolar disorder, depression, posttraumatic stress disorder, schizophrenia), medical comorbidities (diabetes, chronic kidney disease, chronic liver diseases, cardiovascular disease, ischemic heart disease, heart failure, obesity).ICD-9 codes were used to calculate a weighted Elixhauser score [32].Additionally, we included a series of care quality process measures as variables (appropriate antibiotic treatment, infectious disease consultation, and echocardiography).These process variables were defined in a previous publication [29].
We defined the primary predictor as the prescription or not of inpatient MOUD.Inpatient MOUD was defined as the prescription of methadone or buprenorphine at any time during the index hospitalization for S. aureus bacteremia (this includes during an emergency department visit and inpatient stay).We did not include naltrexone in the analysis because VHA guidelines preferred methadone and buprenorphine as first-line agents [33].We reviewed all cases that were identified as PWIDs that received MOUD by either algorithm.We reviewed notes and dosing for MOUD and classified if MOUD was prescribed for OUD, chronic pain, or palliative care.

Outcomes
The primary outcome was 365-day all-cause mortality from the date of discharge.Date of death was ascertained using the VHA Vital Status file that integrates VHA and non-VHA sources.Survival days were calculated from the discharge day for the index hospitalization for S. aureus bacteremia until the day of death or censored if death did not occur within the mortality time frame.Secondary outcomes were 30-day mortality, 365-day acute hospital readmission status, time to first readmission, and discharge status (patient-directed discharge or other).

Statistical Analyses
All analyses were conducted in R 4.1.2within the VINCI platform.We calculated descriptive statistics and compared covariates by those who were prescribed inpatient MOUD with those who did not by using χ 2 tests or Wilcoxon rank-sum tests.We then performed 2 separate analyses: an epidemiologic analysis and a comparative effectiveness analysis.
In the epidemiologic analysis, we estimated the number of S. aureus bacteremia cases among PWIDs in the original Goto cohort using ICD codes and NLP.We calculated and plotted yearly rates per 100 000 veterans enrolled in the VHA [34].
In the comparative effectiveness analysis, we fit a mixed-effects Cox regression models for both the cohort identified using NLP and the other identified by ICD codes.The main predictor was receipt of MOUD (methadone or buprenorphine) during the index hospitalization for S. aureus bacteremia.The primary outcome was mortality at 365 days from the discharge date of the index hospitalization.Models were controlled for demographic variables, MRSA status, concomitant infectious diseases, psychiatric disorders, medical comorbidities, and process variables.Hospital of admission and year of admission were treated as random effects.We only included cases that were discharged alive from the acute care admission.This was done to address immortal time bias by excluding cases that died during their index admission and mitigate survival differences in time from admission to start of MOUD.Time 0 for both groups (received inpatient MOUD and not received inpatient MOUD) was the day of their discharge.Additionally, we performed bivariate analyses for MOUD on secondary outcomes that included mortality at 30 days, readmission at 365 days, time to first readmission, and patient self-directed discharges during the index hospitalization.Schoenfeld residuals testing was done to test the proportional hazard assumption for the models.

RESULTS
Of the 36 868 cases with S. aureus bacteremia, NLP identified 2389 (6.5%) cases as PWID, and ICD codes identified 6804 (18%) cases.Figure 1 displays epidemiologic trends of S. aureus bacteremia at the national level.The mean national rate per year over the study period of S. aureus bacteremia among PWIDs identified was 1.68 per 100 000 using NLP and 7.04 per 100 000 using ICD codes.Peak rates of S. aureus bacteremia were in 2004 and 2005 for the NLP and ICD methods, respectively.Rates decreased by 60% in the NLP cohort and by 41% in the ICD cohort at the end of the study period (2014).
For the comparative effectiveness analysis, we excluded 242 cases from the NLP cohort (10.1%) and excluded 730 cases from the ICD cohort (10.7%) because these cases died during the index admission.Our final analysis cohorts included 2147 cases in the NLP cohort and 6074 cases in the ICD cohort.There were 1636 cases that were present in both cohorts.MOUD were dispensed to 113 cases (106 methadone [93.8%], 7 buprenorphine [6.2%]) in the NLP cohort and 244 cases (234 methadone [95.6%], 10 buprenorphine [4.1%]) in the ICD code cohort (105 cases that received MOUD appeared in both cohorts).Of the 113 cases identified by NLP, 110 were confirmed to be PWIDs and have OUD (97.3%), 3 were prescribed for chronic pain and were not identified to be PWIDs (2.7%).Of the 244 cases identified by ICD codes, 122 were identified to be PWIDs and have OUD (50%), 95 had chronic pain (38.9%), 14 had palliative care indications (5.7%), and 13 others were identified as having OUD but not identified to be PWID (5.3%).
Table 2 shows demographic and clinical differences by MOUD status for both cohorts.In the NLP cohort, MOUD was less common in cases that identified as Black, had chronic kidney disease, and more common in those identified as Latino, diagnosed with endocarditis, hepatitis B and C, anxiety, and had chronic liver disease compared with those without a respective diagnosis.In the ICD cohort, MOUD was more common in cases identified as Latino, diagnosed with HIV, hepatitis C, endocarditis, osteomyelitis, and posttraumatic stress disorder, and less common in cases with schizophrenia, diabetes, or heart failure than those without a respective diagnosis.
Bivariate associations of MOUD with outcomes are described in Table 3.In the NLP cohort, cases that received MOUD were less statistically less likely to die at 365 days but not at 30 days and had a longer time to acute care hospital readmission compared with cases that did not receive MOUD.In the ICD code cohort, there was no difference in mortality, readmissions, time to readmission, or patient-directed discharges between those receiving MOUD or not.
In bivariate survival analysis, receipt of inpatient MOUD was associated with a protective effect on 365-day survival (hazard ratio [HR], 0.52; 95% confidence interval [CI], .32-.84; P < .01)compared with those not receiving MOUD in the NLP cohort.There was no significant effect of receipt of MOUD in the ICD cohort (HR, 0.85; 95% CI, .66-1.10;P = .02),as depicted in Figure 2. In multivariate survival analysis, receipt of MOUD retained its protective effect (adjusted HR [AHR], 0.48; 95% CI, .29-.81; P < .01)compared with not receiving MOUD after adjustment for covariates in the NLP cohort.Again, receipt of MOUD was not significant in the ICD cohort (AHR, 1.00; 95% CI, .77-1.30;P = .99).Schoenfeld tests were nonsignificant in both the NLP and ICD cohort analysis, suggesting that the proportional hazards assumption was not violated.

DISCUSSION
We showed that the methods used to delineate a cohort of PWIDs in electronic health records had a significant impact on both epidemiologic and comparative effectiveness analyses.In the first, the ICD code-based method overestimated the rate of S. aureus bacteremia cases by more than 4-fold compared with NLP.In the second, the effectiveness of prescribing MOUD-more specifically, methadone-to PWIDs in the hospital was only shown to be beneficial when delineating a precise cohort using NLP, whereas when using an ICD code-based approach, the effect of MOUD was nonsignificant.The bias toward the null is most likely from misclassification.The ICD-based approach likely captured many cases that did not merit MOUD because of its low specificity and many who were receiving methadone for other reasons (eg, palliative care or chronic pain).To this point, 50% of the cases captured by ICD codes and received MOUD were not PWIDs.Receipt of MOUD in the hospital was very low for PWIDs within the cohort.This is similar to what has been reported nationally for people living with OUD, where less than 20% receive any treatment for OUD and even less receive medication-based treatment [35].MOUD prescriptions for buprenorphine or methadone historically are provided to more  than 15% of veterans with OUD, but as of 2020 more than 40% of veterans receive formulations of MOUD (buprenorphine, methadone, and naltrexone) for OUD treatment at any given point [36].Within the VHA in 2017, only 15% of admissions with OUD received MOUD and mostly for withdrawal management [37].Only 2% received linkage to outpatient addiction care.Multiple efforts have been instituted nationwide to improve access to MOUD, and specifically within the VHA [38][39][40].One specific effort in the VHA was the Stepped Care for Opioid Use Disorder, Train the Trainer [39].Early outcomes form this program showed an increase in providers able to prescribe buprenorphine and an increase in MOUD in eligible patients by 4% [38,40].
Our study underscores the importance of refining methods to identify PWIDs within healthcare records and support future efforts to use NLP to identify PWIDs.Nevertheless, NLP has limitations to widespread use, which include concerns about confidentiality when obtaining clinical notes and the need for extensive technical expertise and time are required to train, test, and evaluate models.Next steps include building on this work by enhancing NLP algorithms or further refining code-based approaches to accurately categorize PWID cohorts and developing real-time NLP algorithms to increase quality of care in hospitals for PWIDs.Newer large language models should be evaluated for the purpose of identifying PWIDs and for extracting richer data such as the type of substance used, the duration of use, the routes of usage, and treatment offered.
There are many limitations to this study.First, we only used receipt or not of MOUD during inpatient admissions.Many other factors related to MOUD may mediate the relationship with our analyzed outcomes.These may include the specific medication, dose, duration of treatment, and linkage to outpatient care.The majority of cases that received MOUD were prescribed methadone, and we cannot extrapolate our findings to the small number that received buprenorphine.Second, because PWIDs are a heterogenous group, cases that may have been exclusively using methamphetamines or cocaine may have been included in both groups; this may have biased the association of MOUD with the outcome toward the null.Further models should be developed to identify specific substances that PWIDs are reported to be using.Third, the study population was exclusively from the VHA and the findings may not generalize to other healthcare systems.As well, most cases were men, which may limit extrapolations of the results to other genders.Additionally, coding practices at the VHA may be different to other health systems.Fourth, our analyzed cohort was from before 2015, and this presents multiple limitations regarding comparability and applicability to contemporary cohorts.ICD-9 codes were used because those were current of the study period.Future studies need to compare NLP with ICD-10-based coding (and ICD-11 when it becomes available).During our study period, illicit fentanyl use was not commonly reported by PWIDs.Additionally, since then, there has been much more attention to improving inpatient diagnosis of substance use disorders and initiation of MOUD for people with OUD.These changes in substance use patterns and provision of care may change the relationship of MOUD with the outcome and the diagnostic metrics of both the ICD and NLP algorithms.Further evaluations of NLP to ICD codes with contemporary cohorts are necessary.
Identification of PWID in health services research is paramount to estimate the true benefits of different interventions in this population.The mortality rate of 29% at 1 year highlights the need for better interventions in PWIDs that are hospitalized for an infection, in this case S. aureus bacteremia.Continued efforts are needed to maximize the provision of evidence-based care measures for patients with S. aureus bacteremia such as appropriate antibiotics, infectious diseases consultation, echocardiography, and for PWIDs specifically, MOUD.Future research related to PWIDs should consider using NLP over ICD methods to identify cases to reduce bias resulting from misclassification.

Notes
Author contributions.D. G. M., S. S., A. A. T. B., and M. B. G. conceptualized and designed the study.D. G. M. conducted the statistical analysis.M. G. assisted in obtaining access to the cohort data.All authors had full access to all data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.All authors contributed to the writing and critical revision of the report.
Disclaimer.This material is the result of work supported with resources and the use of facilities at the Greater Los Angeles Veterans Administration.The views expressed in this article are those of the authors and do not necessarily reflect the position or policy of the Department of Veterans Affairs, the United States government, the National Institutes of Health, or the author's affiliated universities.
Financial support.D. G. M. was supported by the US National Institute on Drug Abuse (grant K08 DA048163).S. S. was supported by the US National Institute on Mental Health (grant P30 MH58107), the UCLA Center for AIDS Research (grant 5P30 AI028697), and the UCLA Clinical Translational Science Institute (grant UL1 TR001881).All other authors have no conflicts of interest to declare and received no special funding related to this work.

Figure 1 .
Figure 1.Estimated rates of Staphylococcus aureus bacteremia in people who inject drugs per 100 000 enrolled veterans in 124 hospitals in the Veterans Health Administration nationally from 2003 to 2014.

Figure 2 .
Figure 2. Survival probability over 1 y for people who inject drugs after hospitalization because of Staphylococcus aureus bacteremia by receipt of inpatient medications for opioid use disorders (MOUD).

Table 1 . ICD Codes Used to Identify PWIDs
Race and ethnicity were categorized based on US census categories.Race was categorized as American Indian or Alaska Native, Asian, Black or African American, Native Hawaiian or Other Pacific Islander, White, or unknown.Ethnicity was categorized as Hispanic or Latino, not Hispanic or Latino, or unknown.During the cohort period, the VHA used ICD-9 codes (the VHA transitioned to ICD-10 on 1 October 2015) Abbreviations: HCV, hepatitis C virus; ICD, International Classification of Diseases; PWID, people who inject drugs.resistantS.aureus[MRSA]ornon-MRSA), ICD-9 codes, and medication administration data.

Table 2 . Descriptive Characteristics of People who Inject Drugs in a Cohort of Veterans Admitted for Staphylococcus aureus Bacteremia Between 2003 and 2014 Identified by Natural Language Processing or ICD Codes
American Indian or Alaska Native; CKD, chronic kidney diseases; CVD, cerebrovascular disease; HBV, hepatitis B virus; HCV, hepatitis C virus; ICD, International Classification of Diseases; IHD, ischemic heart disease; MOUD, medications for opioid use disorder; MRSA, methicillin-resistant Staphylococcus aureus; NHPI, Native Hawaiian and Pacific Islander; NLP, natural language processing; PTSD, posttraumatic stress disorder.

Table 3 . Outcomes of People who Inject Drugs in a Cohort of Veterans Admitted for Staphylococcus aureus Bacteremia Between 2003 and 2014 Identified by Natural Language Processing or ICD Codes
: ICD, International Classification of Diseases; IQR, interquartile range; NLP, natural language processing.