W Katherine Yih, Martin Kulldorff, Inna Dashevsky, Judith C Maro, Sequential Data-Mining for Adverse Events After Recombinant Herpes Zoster Vaccination Using the Tree-Based Scan Statistic, American Journal of Epidemiology, Volume 192, Issue 2, February 2023, Pages 276–282, https://doi.org/10.1093/aje/kwac176
Abstract
Tree-based scan statistics have been successfully used to study the safety of several vaccines without prespecifying health outcomes of concern. In this study, the binomial tree-based scan statistic was applied sequentially to detect adverse events in days 1–28 compared with days 29–56 after recombinant herpes zoster (RZV) vaccination, with 5 looks at the data and formal adjustment for the repeated analyses over time. IBM MarketScan data on commercially insured persons ≥50 years of age receiving RZV during January 1, 2018, to May 5, 2020, were used. With 999,876 doses of RZV included, statistically significant signals were detected only for unspecified adverse effects/complications following immunization, with attributable risks as low as 2 excess cases per 100,000 vaccinations. Ninety percent of cases in the signals occurred in the week after vaccination and, based on previous studies, likely represent nonserious events like fever, fatigue, and headache. Strengths of our study include its untargeted nature, self-controlled design, and formal adjustment for repeated testing. Although the method requires prespecification of the risk window of interest and may miss some true signals detectable using the tree-temporal variant of the method, it allows for early detection of potential safety problems through early initiation of ongoing monitoring.
Abbreviations
- GBS: Guillain-Barré syndrome
- ICD-10-CM: International Classification of Diseases, Tenth Revision, Clinical Modification
- RZV: recombinant herpes zoster vaccine
Sequential analysis allows for repeated looks at accumulating data in claims or electronic health record databases to detect increased risks of prespecified medically attended health outcomes following receipt of a vaccine or drug, adjusting for the multiple testing entailed in conducting these repeated analyses (1–3). A major advantage of these methods over more traditional, one-time safety studies is that monitoring can begin soon after approval and initial uptake of the vaccine or drug of interest, meaning true safety problems can theoretically be found earlier than in traditional studies, where researchers typically wait for a certain number of doses administered to accumulate in order to obtain the desired statistical power before conducting analysis. Sequential analysis methods are increasingly being used to monitor the safety of new vaccines, including coronavirus disease 2019 (COVID-19) vaccines (4–8).
Over the same period that sequential analysis for targeted vaccine safety surveillance was becoming established, a data-mining approach using the tree-based scan statistic was developed to detect signals of potential adverse reactions after exposure to a vaccine or drug (9–11). This method looks for statistically unusual clustering of cases of adverse events in a “tree” or hierarchical structure of diagnoses without requiring prespecification of health outcomes of interest and in that sense provides an untargeted, broad safety assessment. Scanning counts of diagnoses organized in a tree structure, such as the International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM), hierarchy of codes, allows clusters of cases of related diagnoses to be detected, in the event that the exposure of interest is associated with a spectrum of disease rather than a highly specific health outcome. The method adjusts for the multiple overlapping diagnoses considered during the construction of the composite null hypothesis that there is no unusual clustering of cases in the tree. The tree-temporal variant of the method has been used to study several vaccines and detect unusual clustering of cases in the hierarchy of ICD-10-CM diagnosis codes and simultaneously in the postexposure follow-up period (12–16). However, until now, none of the variants of the method has allowed for repeated adjusted analyses as data accumulate over the course of time.
In this study, we applied the binomial tree-based scan statistic sequentially to detect adverse events in 2 intervals of potentially increased risk after recombinant herpes zoster vaccination, taking 5 total looks at the data and adjusting not only for the multiple overlapping diagnoses scanned but also for the multiple testing over time. Recombinant herpes zoster vaccine (RZV) (Shingrix; GlaxoSmithKline, Brentford, United Kingdom) was approved by the Food and Drug Administration (FDA) in October 2017 for use in people aged 50 years or older for the prevention of herpes zoster, which in the United States affects about 1 out of every 3 people in their lifetime (17). RZV is administered as a 2-dose series, with 2–6 months between doses (18). It has proven highly effective in preventing herpes zoster and post-herpetic neuralgia (19, 20), which are common, often debilitating ailments.
METHODS
Study population, enrollment criteria, and exposure
We used the IBM MarketScan Research Databases (MarketScan; International Business Machines Corporation, Armonk, New York). These are among the largest proprietary US claims databases available for health-care research and are likely highly representative of the commercially insured population. The databases capture person-specific clinical utilization, expenditures, and enrollment across inpatient, outpatient, prescription drug, and carve-out services. Paid claims and encounter data are linked to detailed patient information across sites and types of providers collected from approximately 350 payers (mainly large employers and health plans, predominantly fee-for-service data).
We extracted data on persons at least 50 years of age who were vaccinated during 5 nonoverlapping time periods: January 1 through November 5, 2018; November 6, 2018, through February 3, 2019; February 4 through August 5, 2019; August 6, 2019, through February 4, 2020; and February 5 through May 5, 2020. Data extraction for the first 4 periods was conducted at one time point, while data extraction for the fifth period was carried out later, after more data had accumulated. Thus, we were simulating sequential analysis as opposed to analyzing the data at 5 different time points. (We did it this way because we had already collected approximately 90% of the intended cohort by the time we started this exploratory sequential analysis.) To be included in analysis, an individual had to have been enrolled from 400 days before through 56 days after RZV vaccination. RZV was identified using Current Procedural Terminology code 90750 and National Drug Codes 58160081912, 58160082311, 58160082801, 58160082803, 58160082901, and 58160082903. More than 1 dose per person was allowed, but RZV doses received within 42 days of a prior dose were excluded. Analyses were of all eligible doses rather than of dose 1 or dose 2 specifically.
Hierarchical diagnosis tree
Outcomes were identified using ICD-10-CM codes. ICD-10-CM codes have a hierarchical tree-like structure, starting with 21 broad categories of diagnoses (e.g., diseases of the circulatory system), which progressively branch into more and more specific sets of diagnoses, culminating in a highly specific diagnosis code. The ICD-10-CM tree we used has 6 levels. Table 1 presents an example of the hierarchical classification scheme; this example diagnosis does not use the 6th level.
Table 1. Example of Hierarchical Organization of International Classification of Diseases, Tenth Revision, Coding System, Showing Levels of Tree Employed
Level | Node | Description |
---|---|---|
1 | I00–I99 | Diseases of the circulatory system |
2 | I63 | Cerebral infarction |
3 | I63.0 | Cerebral infarction due to thrombosis of precerebral arteries |
4 | I63.03 | Cerebral infarction due to thrombosis of carotid artery |
5 | I63.031 | Cerebral infarction due to thrombosis of right carotid artery |
The composite null hypothesis was constructed to consider clusters in levels 2–5, which contain 88,156 groupings of similar clinical diagnosis codes. (We refer to these groupings as “nodes” of the tree.) We did not look for clusters in the first or sixth level because these groupings are not clinically meaningful—the former are too general, and the latter are too specific, often only for the purpose of specifying anatomic laterality of a health outcome or distinguishing between initial and subsequent encounters.
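As a rough illustration of how a single diagnosis code maps onto the scanned levels, the sketch below derives the level 2–5 nodes from an ICD-10-CM code by truncating it. The function name `scanned_nodes` is hypothetical, and the rule that level 2 corresponds to the first 3 characters and each further level adds 1 character is an assumption inferred from Table 1, not a description of the tree file actually used in the study.

```python
def scanned_nodes(icd10cm_code):
    """Return the level 2-5 tree nodes that contain an ICD-10-CM code.

    Assumption (inferred from Table 1): level 2 is the first 3 characters,
    and levels 3-5 each add 1 character after the decimal point.
    Level 1 (chapter) and level 6 (7-character codes) are not returned,
    because the text says they were not scanned.
    """
    compact = icd10cm_code.replace(".", "").upper()
    nodes = []
    # 3 characters -> level 2, ..., 6 characters -> level 5
    for length in range(3, min(len(compact), 6) + 1):
        prefix = compact[:length]
        nodes.append(prefix if length == 3 else prefix[:3] + "." + prefix[3:])
    return nodes


print(scanned_nodes("I63.031"))
print(scanned_nodes("T50.Z95A"))
```

A case coded at the most specific level contributes to every node returned, which is why related diagnoses can accumulate into a cluster higher in the tree.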
Incident diagnoses
The study examined “incident” diagnoses observed in the inpatient or emergency department setting during the follow-up period of 56 days. To be counted as an incident case, the patient must not have been assigned another ICD-10-CM diagnosis code having the same first 3 characters (i.e., in the same second level of the tree) in any setting during the prior 400 days. (We chose 400 days in order to enable ascertainment of preexisting conditions that might have been recorded at a visit roughly 1 year prior, considering that some patients have preventive care visits on an approximately annual basis.) Because incidence was determined using the second level of the tree, above which no analysis of clustering was carried out, no patient could have contributed more than 1 case count to any detected cluster.
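The incidence rule can be sketched as a simple lookback check. This is a minimal illustration, not the Sentinel query logic: the function name `is_incident`, the representation of prior diagnoses as `(code, date)` pairs, and the exact handling of the window boundaries are all assumptions.

```python
from datetime import date, timedelta


def is_incident(event_code, event_date, prior_diagnoses, lookback_days=400):
    """Return True if no diagnosis sharing the first 3 characters was
    recorded (in any setting) during the lookback period.

    prior_diagnoses: iterable of (icd10cm_code, date) pairs (hypothetical layout).
    Boundary handling (inclusive start, exclusive event date) is an assumption.
    """
    family = event_code.replace(".", "")[:3]
    window_start = event_date - timedelta(days=lookback_days)
    return not any(
        code.replace(".", "")[:3] == family and window_start <= seen < event_date
        for code, seen in prior_diagnoses
    )


history = [("I63.9", date(2018, 6, 1)), ("I10", date(2018, 9, 15))]
print(is_incident("I63.031", date(2019, 3, 1), history))  # same 3-character family seen recently
print(is_incident("G61.0", date(2019, 3, 1), history))    # no prior G61 diagnosis
```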
Risk and comparison windows
The primary risk interval evaluated was the period of days 1–28 after vaccination, with a comparison period of days 29–56. The secondary risk interval was the period of days 8–21 after vaccination, with a comparison period consisting of days 1–7 and days 22–56. In this self-controlled design, the comparison is between risk and comparison intervals within persons. The question being asked is whether there is an elevated occurrence of cases of a particular kind of adverse event during the postexposure risk window as compared with the comparison window.
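The two window definitions can be expressed compactly. The sketch below (function names are hypothetical) assigns a postvaccination day to the risk or comparison window and computes the probability that a case falls in the risk window under the null hypothesis, assuming that probability is simply the risk window's share of the 56 follow-up days.

```python
def classify_day(day, risk_window=(1, 28), follow_up=(1, 56)):
    """Label a postvaccination day as 'risk', 'comparison', or None (outside follow-up)."""
    if not follow_up[0] <= day <= follow_up[1]:
        return None
    return "risk" if risk_window[0] <= day <= risk_window[1] else "comparison"


def null_risk_probability(risk_window, follow_up=(1, 56)):
    """Share of follow-up days inside the risk window (assumed null probability)."""
    risk_days = risk_window[1] - risk_window[0] + 1
    total_days = follow_up[1] - follow_up[0] + 1
    return risk_days / total_days


print(classify_day(10), null_risk_probability((1, 28)))           # primary analysis
print(classify_day(10, (8, 21)), null_risk_probability((8, 21)))  # secondary analysis
```

Under these assumptions the primary analysis compares windows of equal length, whereas the secondary analysis compares 14 risk days with 42 comparison days.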
Sequential binomial tree-based scan statistics
Scan statistics have been used in infectious disease, cancer, and genomic surveillance (21–23). Most recently, scan statistics have been adapted for vaccine and drug safety surveillance (10, 12–16, 24, 25). A scan statistic detects a cluster of nonrandom activity in a data set by moving a window of evaluation (e.g., temporal or geospatial) across it and calculating its likelihood under expected conditions. Composite null hypothesis-testing procedures use a generalized likelihood ratio test that examines many potential models or combinations of data and maximizes the likelihood ratio function over the multiple potential combinations of data analyzed.
With our self-controlled design, we scanned cases within the ICD-10-CM hierarchical structure, looking for any clustering of cases within this tree-like structure in the risk window compared with the comparison window. Under the composite null hypothesis, there would be no unusual clustering of events within the risk window in any part of the tree. Under the alternative hypothesis, there would be at least 1 leaf or branch of the tree with an unusual cluster of events. We used Monte Carlo simulation to generate 9,999 replications of a null data set (i.e., a data set without any association between the exposure and the outcome other than due to chance alone) using a data-permutation strategy, holding constant the number of cases at each node of the tree (in risk and comparison windows combined) in the original data set. The test statistic was calculated for each replication data set plus the real data set and was the maximum of the individual log-likelihood ratios calculated over the 88,156 clinical outcome groups. This technique allows us to rank each potential cluster against the test statistic distribution. Given the 9,999 replications, the lowest possible P value is 0.0001.
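The procedure in this paragraph can be sketched in miniature. The code below is an illustrative toy, not TreeScan's implementation: the tree, the counts, and all function names are hypothetical, and the null replications are generated by redistributing each leaf's fixed case total between the windows with probability 0.5, which is one plausible reading of the data-permutation strategy described above.

```python
import math
import random


def binomial_llr(risk_cases, total_cases, p=0.5):
    """One-sided binomial log-likelihood ratio for an excess in the risk window."""
    if total_cases == 0 or risk_cases / total_cases <= p:
        return 0.0
    observed = risk_cases / total_cases
    llr = risk_cases * math.log(observed / p)
    comparison_cases = total_cases - risk_cases
    if comparison_cases > 0:
        llr += comparison_cases * math.log((1 - observed) / (1 - p))
    return llr


def max_llr(leaf_counts, paths, p=0.5):
    """Aggregate leaf counts up the tree and return the largest node LLR."""
    risk, total = {}, {}
    for leaf, (leaf_risk, leaf_total) in leaf_counts.items():
        for node in paths[leaf]:
            risk[node] = risk.get(node, 0) + leaf_risk
            total[node] = total.get(node, 0) + leaf_total
    return max(binomial_llr(risk[n], total[n], p) for n in total)


def monte_carlo_p(leaf_counts, paths, replications=999, p=0.5, seed=0):
    """Rank the observed maximum LLR among null replications."""
    rng = random.Random(seed)
    observed = max_llr(leaf_counts, paths, p)
    rank = 1  # the real data set is counted alongside the replications
    for _ in range(replications):
        simulated = {}
        for leaf, (_unused, leaf_total) in leaf_counts.items():
            # Hold each leaf's total fixed; reassign cases to windows at random.
            draws = sum(rng.random() < p for _draw in range(leaf_total))
            simulated[leaf] = (draws, leaf_total)
        if max_llr(simulated, paths, p) >= observed:
            rank += 1
    return observed, rank / (replications + 1)


# Toy data: {leaf: (cases in risk window, cases in both windows)}
toy_leaves = {"T50.Z95": (16, 16), "I63.031": (3, 6), "I63.032": (2, 5)}
toy_paths = {
    "T50.Z95": ["T50", "T50.Z", "T50.Z9", "T50.Z95"],
    "I63.031": ["I63", "I63.0", "I63.03", "I63.031"],
    "I63.032": ["I63", "I63.0", "I63.03", "I63.032"],
}
observed, p_value = monte_carlo_p(toy_leaves, toy_paths)
print(round(observed, 2), p_value)
```

Because the real data set is ranked together with the replications, the smallest attainable P value is 1/(replications + 1), which gives 0.0001 for 9,999 replications.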
In a nonsequential application of binomial tree-based scan statistics, this distribution of maximum log-likelihood ratios is realized once, and a prespecified P value is selected as the threshold for statistical significance. In the novel sequential version that we demonstrate here, we preselect a number of interim hypothesis tests, an alpha-spending plan, and a maximum value of alpha spent. Here, we used a 5-look plan and distributed the prespecified total alpha level of 0.05 across those 5 looks according to an alpha spending function that approximates a Wald boundary (Table 2, penultimate column). A Wald boundary uses a larger amount of alpha in early looks to prioritize the ability to signal early if any cluster is detected. At each interim hypothesis test, the boundary for signaling is prespecified by the alpha-spending plan shown in Table 2. Data on subsequent looks are incrementally added to each node, the data-permutation procedure is repeated, and new maximal log-likelihood ratio tests are calculated. Nodes where cluster(s) emerged earlier in the surveillance are removed from subsequent calculations of the maximum log-likelihood ratio. That is, we no longer include an identified cluster in follow-on hypothesis testing. This sequential adaptation of data-mining has the advantage of finding potential clusters earlier in time, allowing for earlier follow-up while surveillance continues on remaining outcomes.
Table 2. Alpha-Spending and Sample Sizes in the 5 Sequential Tree-Based Data-Mining Looks for Adverse Events After Vaccination With Recombinant Herpes Zoster Vaccine Among Adults ≥50 Years of Age in IBM MarketScan Research Databases, United States, January 1, 2018, Through May 5, 2020 (Range of Eligible Vaccination Dates)
Look | No. of Persons | No. of Doses | Cumulative Doses | Proportion of Doses Accumulated | Start of Dose Accrual | End of Dose Accrual | Alpha Spent^a | Cumulative Alpha Spent |
---|---|---|---|---|---|---|---|---|
1 | 176,133 | 245,020 | 245,020 | 0.25 | January 1, 2018 | November 5, 2018 | 0.0247 | 0.0247 |
2 | 68,355 | 70,228 | 315,248 | 0.32 | November 6, 2018 | February 3, 2019 | 0.0034 | 0.0281 |
3 | 185,829 | 228,986 | 544,234 | 0.54 | February 4, 2019 | August 5, 2019 | 0.0088 | 0.0369 |
4 | 283,549 | 364,448 | 908,682 | 0.91 | August 6, 2019 | February 4, 2020 | 0.0108 | 0.0477 |
5 | 89,341 | 91,194 | 999,876 | 1.00 | February 5, 2020 | May 5, 2020 | 0.0023 | 0.0500 |
^a Alpha spending based on an approximation of a Wald-style boundary, with 1 million doses as the final stopping point.
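The bookkeeping behind Table 2 can be checked with a few lines of arithmetic. The sketch below only accumulates the per-look values reported in the table; it does not reproduce the spending function itself, whose exact form is not given in the text. Variable names are illustrative.

```python
from itertools import accumulate

# Per-look values copied from Table 2
alpha_spent = [0.0247, 0.0034, 0.0088, 0.0108, 0.0023]
doses_added = [245_020, 70_228, 228_986, 364_448, 91_194]
PLANNED_MAX_DOSES = 1_000_000  # final stopping point stated in the table footnote

cumulative_alpha = [round(a, 4) for a in accumulate(alpha_spent)]
cumulative_doses = list(accumulate(doses_added))
proportion_accrued = [round(d / PLANNED_MAX_DOSES, 2) for d in cumulative_doses]

for look, (alpha, doses, share) in enumerate(
    zip(cumulative_alpha, cumulative_doses, proportion_accrued), start=1
):
    print(look, doses, share, alpha)
```

Nearly half of the total alpha of 0.05 is spent at look 1, when about a quarter of the planned doses have accrued, consistent with a boundary that favors early signaling.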
Software used
We used Sentinel Query Request Package, version 10.3.2 (26), created at the Harvard Pilgrim Health Care Institute, and TreeScan, version 1.5 (27).
Institutional review board approval
The study was approved by the Harvard Pilgrim Health Care Institutional Review Board.
RESULTS
The series of analyses included 999,876 total doses of RZV. These were distributed as shown in Table 2, with 245,020 doses in look 1; 70,228 added in look 2; 228,986 added in look 3; 364,448 added in look 4; and 91,194 added in look 5.
In the primary analysis, which used the risk window of days 1–28, 4 statistically significant clusters appeared (Table 3). In look 1, there was a signal for T50.Z95 (adverse effect of other vaccines and biological substances). By the end of sequential analysis, there were 37 cases of this diagnosis in the risk window days 1–28 compared with 0 in the days 29–56 comparison window and an attributable risk of 3.7 per 100,000 doses of vaccine. In look 2, a signal emerged for T88.1 (other complications following immunization, not elsewhere classified). By the end of analysis, there were 27 cases in the risk window and 1 in the comparison window and an attributable risk of 2.6 per 100,000 doses. No new signals appeared at look 3. Then at look 4, there was a signal for T50.B (poisoning by, adverse effect of, and underdosing of viral vaccines). By the end of analysis, there were 24 cases of T50.B in the risk window and 0 in the comparison window and an attributable risk of 2.4 per 100,000 doses. T50.B95 (adverse effect of other viral vaccines), nested within T50.B, also signaled at look 4 and was clearly driving the T50.B signal, contributing 23 of T50.B’s 24 cases in the risk window, with attributable risk of 2.3 per 100,000 doses, by the end of analysis (Figure 1).
Table 3. Details of Statistically Significant Signals Detected During Simulated Sequential Tree-Based Data-Mining for Adverse Events After Vaccination With Recombinant Herpes Zoster Vaccine Among Adults ≥50 Years of Age in IBM MarketScan Research Databases, United States, January 1, 2018, Through May 5, 2020 (Range of Eligible Vaccination Dates)^a
Diagnosis Code and Look Number | Diagnosis Description | Cases in Days 1–28 | Cases in Days 29–56 | Expected Cases in Days 1–28 | Excess Cases in Days 1–28 | Attributable Risk per 100,000 Doses | Log Likelihood Ratio | Signaled? |
---|---|---|---|---|---|---|---|---|
Look 1 | | | | | | | | |
T50.Z95 | Adverse effect of other vaccines and biological substances | 16 | 0 | 8.0 | 8.0 | 6.5 | 11.09 | Yes |
T88.1 | Other complications following immunization, not elsewhere classified | 11 | 0 | 5.5 | 5.5 | 4.5 | 7.62 | No |
T50.B | Poisoning by, adverse effect of, and underdosing of viral vaccines | 5 | 0 | 2.5 | 2.5 | 2.0 | 3.47 | No |
T50.B95 | Adverse effect of other viral vaccines | 5 | 0 | 2.5 | 2.5 | 2.0 | 3.47 | No |
Look 2 | | | | | | | | |
T50.Z95 | Adverse effect of other vaccines and biological substances | 18 | 0 | 9.0 | 9.0 | 5.7 | 12.48 | (In look 1) |
T88.1 | Other complications following immunization, not elsewhere classified | 15 | 0 | 7.5 | 7.5 | 4.8 | 10.40 | Yes |
T50.B | Poisoning by, adverse effect of, and underdosing of viral vaccines | 5 | 0 | 2.5 | 2.5 | 1.6 | 3.47 | No |
T50.B95 | Adverse effect of other viral vaccines | 5 | 0 | 2.5 | 2.5 | 1.6 | 3.47 | No |
Look 3 | | | | | | | | |
T50.Z95 | Adverse effect of other vaccines and biological substances | 26 | 0 | 13.0 | 13.0 | 4.8 | 18.02 | (In look 1) |
T88.1 | Other complications following immunization, not elsewhere classified | 18 | 0 | 9.0 | 9.0 | 3.3 | 12.48 | (In look 2) |
T50.B | Poisoning by, adverse effect of, and underdosing of viral vaccines | 11 | 0 | 5.5 | 5.5 | 2.0 | 7.62 | No |
T50.B95 | Adverse effect of other viral vaccines | 11 | 0 | 5.5 | 5.5 | 2.0 | 7.62 | No |
Look 4 | | | | | | | | |
T50.Z95 | Adverse effect of other vaccines and biological substances | 37 | 0 | 18.5 | 18.5 | 4.1 | 25.65 | (In look 1) |
T88.1 | Other complications following immunization, not elsewhere classified | 27 | 1 | 14.0 | 13.0 | 2.9 | 15.09 | (In look 2) |
T50.B | Poisoning by, adverse effect of, and underdosing of viral vaccines | 23 | 0 | 11.5 | 11.5 | 2.5 | 15.94 | Yes |
T50.B95 | Adverse effect of other viral vaccines | 22 | 0 | 11.0 | 11.0 | 2.4 | 15.25 | Yes |
Look 5 | | | | | | | | |
T50.Z95 | Adverse effect of other vaccines and biological substances | 37 | 0 | 18.5 | 18.5 | 3.7 | 25.65 | (In look 1) |
T88.1 | Other complications following immunization, not elsewhere classified | 27 | 1 | 14.0 | 13.0 | 2.6 | 15.09 | (In look 2) |
T50.B | Poisoning by, adverse effect of, and underdosing of viral vaccines | 24 | 0 | 12.0 | 12.0 | 2.4 | 16.64 | (In look 4) |
T50.B95 | Adverse effect of other viral vaccines | 23 | 0 | 11.5 | 11.5 | 2.3 | 15.94 | (In look 4) |
^a A binomial model with risk window of days 1–28 and comparison window of days 29–56 after vaccination, 5 sequential analyses, and an alpha-spending function that approximates a Wald boundary were used. Diagnosis codes come from International Classification of Diseases, Tenth Revision, Clinical Modification, and are shown in the order in which the signal arose and then in descending order of the test statistic.
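The derived columns of Table 3 can be approximated from the case counts and the cumulative doses in Table 2. The formulas below are inferred from the tabulated values rather than stated in the text, so they should be read as assumptions: expected cases are taken as half of all cases (equal-length windows), excess cases as observed minus expected, and attributable risk as the difference between risk-window and comparison-window cases per 100,000 cumulative doses. The function name `signal_summary` is hypothetical.

```python
def signal_summary(risk_cases, comparison_cases, cumulative_doses, p_risk=0.5):
    """Return (expected, excess, attributable risk per 100,000 doses).

    Assumed formulas, chosen because they match the tabulated values:
      expected = (risk + comparison) * p_risk
      excess   = risk - expected
      AR       = (risk - comparison) / doses * 100,000
    """
    total_cases = risk_cases + comparison_cases
    expected = total_cases * p_risk
    excess = risk_cases - expected
    attributable_risk = (risk_cases - comparison_cases) / cumulative_doses * 100_000
    return expected, excess, attributable_risk


# Look 5 counts with 999,876 cumulative doses
for code, risk, comparison in [("T50.Z95", 37, 0), ("T88.1", 27, 1), ("T50.B", 24, 0)]:
    expected, excess, ar = signal_summary(risk, comparison, 999_876)
    print(code, expected, excess, round(ar, 1))
```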

One signal detected among adults ≥50 years of age in IBM MarketScan Research Databases, United States, January 1, 2018, through May 5, 2020 (range of eligible vaccination dates). Cumulative number of cases of International Classification of Diseases, Tenth Revision, Clinical Modification, code T50.B95 (adverse effect of other viral vaccines) in days 1–28 after vaccination, log-likelihood ratio, attributable risk, and cumulative number of recombinant herpes zoster vaccine doses over the 5 sequential analyses.
No signals were found in the secondary analyses, which used a risk window of days 8–21.
DISCUSSION
In this study of nearly 1 million RZV vaccinations, we found statistically significant signals only for unspecified adverse effects or complications following immunization, all in the sequential analyses using the primary, days 1–28 risk window. All of these signals may represent true vaccine-associated adverse events. Ninety percent of the unique cases in the 4 signals occurred in the 7 days after vaccination, according to the postvaccination time-to-event information in the data set. Previous case-by-case investigations of similarly nonspecific postvaccination signals detected in the first few days after vaccination using the tree-temporal variant of this method found that the large majority of the cases had conditions such as injection site reactions, fever, fatigue, and headache (13).
Attributable risks as low as 2 excess cases per 100,000 vaccinations were seen, indicating good statistical power to detect possible adverse events. In general, the power of this method depends on the number of exposed persons (in this case, vaccinees) and the background rate of the outcome in the affected group (e.g., elderly persons or women), as well as on the specific features and parameter settings selected for the data extraction and analysis, including the length of the follow-up period, the size and nature of the tree, and the number of risk intervals evaluated.
In multiple cohort and self-controlled case-series analyses conducted in a population of Medicare beneficiaries aged 65 years or older, Goud et al. (28) found an increased risk of Guillain-Barré syndrome (GBS) after RZV vaccination. In their self-controlled case-series analysis restricted to medical record-confirmed cases, symptom onset for all 7 confirmed cases in the days 1–42 risk window occurred during the second or third week after vaccination. Our secondary analyses used a days 8–21 risk window, but no signals for GBS or any other health outcome emerged. By the end of the analysis period, there were 8 cases with the GBS ICD-10-CM code in our data, 4 of which occurred during the days 8–21 secondary risk window and 4 outside of that window. Under the null hypothesis of no increased risk, one would have expected 2 cases in the risk window and 6 in the comparison window, given the 1:3 ratio of risk window length to comparison window length. The observed 4/4 split did not differ enough from the expected 2/6 split to produce a signal. Insufficient sample size, particularly of older people, who are at greater risk of GBS, may explain why we did not see a signal for GBS. The Medicare-based study included more than 1.3 million eligible RZV vaccinations among persons ≥65 years of age, and Goud et al. (28) reported an attributable risk of 3 per million doses of RZV (95% confidence interval: 0.62–5.64). In our study of commercially insured people, by contrast, only about one-quarter of the approximately 1 million doses were received by people ≥65 years of age, too few to detect such a low attributable risk.
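The GBS arithmetic above can be made concrete with a short calculation. This is an illustrative sketch only. It takes the null probability of a case falling in the risk window to be 2/8 = 0.25, as implied by the expected 2-versus-6 split stated in the text. It then computes the binomial log-likelihood ratio for the observed 4/4 split and an unadjusted one-sided binomial tail probability. The tail probability is a single-test quantity for intuition and is not the multiplicity-adjusted P value that the tree-based scan statistic produces.

```python
import math

total_cases, risk_cases = 8, 4
comparison_cases = total_cases - risk_cases
p_null = 2 / 8  # assumption: implied by the expected 2-versus-6 split in the text

expected_risk = total_cases * p_null                 # 2.0
expected_comparison = total_cases - expected_risk    # 6.0

# Binomial log-likelihood ratio for the observed 4/4 split versus the null
llr = (risk_cases * math.log(risk_cases / expected_risk)
       + comparison_cases * math.log(comparison_cases / expected_comparison))

# Unadjusted one-sided tail: P(X >= 4) for X ~ Binomial(8, 0.25)
tail = sum(
    math.comb(total_cases, k) * p_null**k * (1 - p_null) ** (total_cases - k)
    for k in range(risk_cases, total_cases + 1)
)

print(expected_risk, expected_comparison)  # 2.0 6.0
print(round(llr, 2), round(tail, 3))       # 1.15 0.114
```

A log-likelihood ratio of about 1.15 is far below the values of roughly 15 at which T50.B and T50.B95 first signaled in the primary analysis, and even without any adjustment for the many codes scanned, a 4/4 split would arise by chance about 11% of the time under this null.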
Addressing the question of power more generally, in simulation studies of the nonsequential binomial tree-based scan statistic, we showed that for common outcomes (with background rates of approximately 3 per 10,000 vaccinees), 1 million vaccinees, as we had in the current study, allowed detection of 100 excess cases per million doses with 90% power; for rarer outcomes (with background rates of 8 per million vaccinees), 1 million vaccinees allowed detection of 20 excess cases per million doses with 67% power (29). In the current sequential study of RZV, power would be somewhat less because of the multiple looks at the data.
Strengths of our study include: 1) its untargeted nature, with no prespecification of health outcomes of interest; 2) its self-controlled design, which eliminates confounding by fixed patient characteristics such as chronic disease status; and 3) its formal control for multiple analyses of the accumulating data over the course of time.
A notable limitation of the binomial variant of the tree-based scan statistic is that the risk interval must be prespecified, in contrast with the tree-temporal variant, which scans for temporal clustering within the follow-up period in addition to within the ICD-10-CM code tree. The signals found in the current study were also found in an earlier study that applied the tree-temporal variant to the same data source and a similar amount of RZV data (16). That earlier study, however, found signals not only for unspecified adverse effects, complications, or reactions to immunization or other medical substances/care but also for fever, unspecified allergy, syncope/collapse, cellulitis, myalgia, and dizziness/giddiness, all within a few days of vaccination and all of which accord well with the known safety profile of this and other injected vaccines.
However, no sequential version of the tree-temporal variant of the method (or of any variant other than the binomial) has yet been developed. The ability to conduct sequential analysis with the binomial variant, with formal adjustment for the multiple testing over time, is an advantage: it allows monitoring to begin soon after approval of a vaccine or other medical product and permits early detection of signals of potential adverse reactions that occur in the prespecified risk window.
In conclusion, we have applied a new sequential tree-based data-mining method to the study of vaccine safety, with reasonably good statistical power, findings of plausible adverse effects, and no false signals. Although the method requires the risk window of interest to be prespecified and may miss true signals detectable using the tree-temporal variant of the method, it allows for early detection of potential safety problems by means of early initiation of monitoring and repeated adjusted analysis as exposures accumulate.
ACKNOWLEDGMENTS
Author affiliations: Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, Massachusetts (W. Katherine Yih, Inna Dashevsky, Judith C. Maro); and Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Harvard Medical School and Brigham and Women’s Hospital, Boston, Massachusetts (Martin Kulldorff).
This work was supported by a Harvard Pilgrim Health Care Institute Robert H. Ebert Career Development Award.
Aggregated data are available upon request.
The Harvard Pilgrim Health Care Institute had no role in the design or conduct of the study.
W.K.Y. has received some research funding from GlaxoSmithKline in the past. The other authors report no conflicts.
REFERENCES
Shingrix (package insert): https://www.fda.gov/media/108597/download.