Abstract

Tree-based scan statistics have been successfully used to study the safety of several vaccines without prespecifying health outcomes of concern. In this study, the binomial tree-based scan statistic was applied sequentially to detect adverse events in days 1–28 compared with days 29–56 after recombinant herpes zoster (RZV) vaccination, with 5 looks at the data and formal adjustment for the repeated analyses over time. IBM MarketScan data on commercially insured persons ≥50 years of age receiving RZV during January 1, 2018, to May 5, 2020, were used. With 999,876 doses of RZV included, statistically significant signals were detected only for unspecified adverse effects/complications following immunization, with attributable risks as low as 2 excess cases per 100,000 vaccinations. Ninety percent of cases in the signals occurred in the week after vaccination and, based on previous studies, likely represent nonserious events like fever, fatigue, and headache. Strengths of our study include its untargeted nature, self-controlled design, and formal adjustment for repeated testing. Although the method requires prespecification of the risk window of interest and may miss some true signals detectable using the tree-temporal variant of the method, it allows for early detection of potential safety problems through early initiation of ongoing monitoring.

Abbreviations

     
  • GBS: Guillain-Barré syndrome
  • ICD-10-CM: International Classification of Diseases, Tenth Revision, Clinical Modification
  • RZV: recombinant herpes zoster vaccine

Sequential analysis allows for repeated looks at accumulating data in claims or electronic health record databases to detect increased risks of prespecified medically attended health outcomes following receipt of a vaccine or drug, adjusting for the multiple testing entailed in conducting these repeated analyses (1–3). A major advantage of these methods over more traditional, one-time safety studies is that monitoring can begin soon after approval and initial uptake of the vaccine or drug of interest, meaning true safety problems can theoretically be found earlier than in traditional studies, where researchers typically wait for enough administered doses to accumulate to obtain the desired statistical power before conducting analysis. Sequential analysis methods are increasingly being used to monitor the safety of new vaccines, including coronavirus disease 2019 (COVID-19) vaccines (4–8).

Over the same period that sequential analysis for targeted vaccine safety surveillance was becoming established, a data-mining approach using the tree-based scan statistic was developed to detect signals of potential adverse reactions after exposure to a vaccine or drug (9–11). This method looks for statistically unusual clustering of cases of adverse events in a “tree” or hierarchical structure of diagnoses without requiring prespecification of health outcomes of interest and in that sense provides an untargeted, broad safety assessment. Scanning counts of diagnoses organized in a tree structure, such as the International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM), hierarchy of codes, allows clusters of cases of related diagnoses to be detected, in the event that the exposure of interest is associated with a spectrum of disease rather than a highly specific health outcome. The method adjusts for the multiple overlapping diagnoses considered during the construction of the composite null hypothesis that there is no unusual clustering of cases in the tree. The tree-temporal variant of the method has been used to study several vaccines and detect unusual clustering of cases in the hierarchy of ICD-10-CM diagnosis codes and simultaneously in the postexposure follow-up period (12–16). However, until now, none of the variants of the method has allowed for repeated adjusted analyses as data accumulate over the course of time.

In this study, we applied the binomial tree-based scan statistic sequentially to detect adverse events in 2 intervals of potentially increased risk after recombinant herpes zoster vaccination, taking 5 total looks at the data and adjusting not only for the multiple overlapping diagnoses scanned but also for the multiple testing over time. Recombinant herpes zoster vaccine (RZV) (Shingrix; GlaxoSmithKline, Brentford, United Kingdom) was approved by the Food and Drug Administration (FDA) in October 2017 for use in people aged 50 years or older for the prevention of herpes zoster, which in the United States affects about 1 out of every 3 people in their lifetime (17). RZV is administered as a 2-dose series, with 2–6 months between doses (18). It has proven highly effective in preventing herpes zoster and post-herpetic neuralgia (19, 20), which are common, often debilitating ailments.

METHODS

Study population, enrollment criteria, and exposure

We used the IBM MarketScan Research Databases (MarketScan; International Business Machines Corporation, Armonk, New York). These are among the largest proprietary US claims databases available for health-care research and are likely highly representative of the commercially insured population. The databases capture person-specific clinical utilization, expenditures, and enrollment across inpatient, outpatient, prescription drug, and carve-out services. Paid claims and encounter data are linked to detailed patient information across sites and types of providers collected from approximately 350 payers (mainly large employers and health plans, predominantly fee-for-service data).

We extracted data on persons at least 50 years of age who were vaccinated during 5 nonoverlapping time periods: January 1 through November 5, 2018; November 6, 2018, through February 3, 2019; February 4 through August 5, 2019; August 6, 2019, through February 4, 2020; and February 5 through May 5, 2020. Data extraction for the first 4 periods was conducted at one time point, while data extraction for the fifth period was carried out later, after more data had accumulated. Thus, we were simulating sequential analysis as opposed to analyzing the data at 5 different time points. (We did it this way because we had already collected approximately 90% of the intended cohort by the time we started this exploratory sequential analysis.) To be included in analysis, an individual had to have been enrolled from 400 days before through 56 days after RZV vaccination. RZV was identified using Current Procedural Terminology code 90750 and National Drug Codes 58160081912, 58160082311, 58160082801, 58160082803, 58160082901, and 58160082903. More than 1 dose per person was allowed, but RZV doses received within 42 days of a prior dose were excluded. Analyses were of all eligible doses rather than of dose 1 or dose 2 specifically.

Hierarchical diagnosis tree

Outcomes were identified using ICD-10-CM codes. ICD-10-CM codes have a hierarchical tree-like structure, starting with 21 broad categories of diagnoses (e.g., diseases of the circulatory system), which progressively branch into more and more specific sets of diagnoses, culminating in a highly specific diagnosis code. The ICD-10-CM tree we used has 6 levels. Table 1 presents an example of the hierarchical classification scheme; this example diagnosis does not use the 6th level.

Table 1

Example of Hierarchical Organization of International Classification of Diseases, Tenth Revision, Coding System, Showing Levels of Tree Employed

| Level | Node | Description |
| 1 | I00–I99 | Diseases of the circulatory system |
| 2 | I63 | Cerebral infarction |
| 3 | I63.0 | Cerebral infarction due to thrombosis of precerebral arteries |
| 4 | I63.03 | Cerebral infarction due to thrombosis of carotid artery |
| 5 | I63.031 | Cerebral infarction due to thrombosis of right carotid artery |

The composite null hypothesis was constructed to consider clusters in levels 2–5, which contain 88,156 groupings of similar clinical diagnosis codes. (We refer to these groupings as “nodes” of the tree.) We did not look for clusters in the first or sixth level because these groupings are not clinically meaningful—the former are too general, and the latter are too specific, often only for the purpose of specifying anatomic laterality of a health outcome or distinguishing between initial and subsequent encounters.
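The mapping from a diagnosis code to the nodes that contain it can be illustrated with a minimal sketch. It assumes that level 2 is the 3-character category and that each deeper level adds one character, as in Table 1; the function name `tree_nodes` is hypothetical and is not part of any software used in the study.

```python
def tree_nodes(icd10cm_code: str) -> list[str]:
    """Return the level 2-5 nodes (coarsest first) that contain an ICD-10-CM code.

    Assumption: level 2 = first 3 characters, level 5 = first 6 characters.
    Level 1 (chapter ranges such as I00-I99) and level 6 are not scanned.
    """
    chars = icd10cm_code.replace(".", "").upper()
    nodes = []
    for length in range(3, min(len(chars), 6) + 1):
        prefix = chars[:length]
        # Re-insert the decimal point after the third character.
        nodes.append(prefix if length == 3 else f"{prefix[:3]}.{prefix[3:]}")
    return nodes


print(tree_nodes("I63.031"))
```

A case coded I63.031 therefore adds one count to each of I63, I63.0, I63.03, and I63.031, which is why a cluster can surface at a coarser node even when no single specific code stands out.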

Incident diagnoses

The study examined “incident” diagnoses observed in the inpatient or emergency department setting during the follow-up period of 56 days. To be counted as an incident case, the patient must not have been assigned another ICD-10-CM diagnosis code having the same first 3 characters (i.e., in the same second level of the tree) in any setting during the prior 400 days. (We chose 400 days in order to enable ascertainment of preexisting conditions that might have been recorded at a visit roughly 1 year prior, considering that some patients have preventive care visits on an approximately annual basis.) Because incidence was determined using the second level of the tree, above which no analysis of clustering was carried out, no patient could have contributed more than 1 case count to any detected cluster.
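The incidence rule can be sketched as a lookback check. This is an illustration only: the function name, the `history` structure, and the exact handling of the boundary days (day 400 included, same-day diagnoses excluded) are assumptions, not details given in the text.

```python
from datetime import date, timedelta

LOOKBACK_DAYS = 400  # washout period stated in the text


def is_incident(code: str, event_date: date, history: list[tuple[str, date]]) -> bool:
    """True if no diagnosis with the same first 3 characters occurred in the prior 400 days.

    `history` holds (ICD-10-CM code, service date) pairs from any care setting.
    """
    category = code[:3]  # second level of the tree
    window_start = event_date - timedelta(days=LOOKBACK_DAYS)
    return not any(
        prior_code[:3] == category and window_start <= prior_date < event_date
        for prior_code, prior_date in history
    )


history = [("T88.7", date(2018, 1, 10))]
print(is_incident("T88.1", date(2018, 6, 1), history))    # same category seen recently
print(is_incident("T50.B95", date(2018, 6, 1), history))  # different category
```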

Risk and comparison windows

The primary risk interval evaluated was the period of days 1–28 after vaccination, with a comparison period of days 29–56. The secondary risk interval was the period of days 8–21 after vaccination, with a comparison period consisting of days 1–7 and days 22–56. In this self-controlled design, the comparison is between risk and comparison intervals within persons. The question being asked is whether there is an elevated occurrence of cases of a particular kind of adverse event during the postexposure risk window as compared with the comparison window.

Sequential binomial tree-based scan statistics

Scan statistics have been used in infectious disease, cancer, and genomic surveillance (21–23). Most recently, scan statistics have been adapted for vaccine and drug safety surveillance (10, 12–16, 24, 25). A scan statistic detects a cluster of nonrandom activity in a data set by moving a window of evaluation (e.g., temporal or geospatial) across it and calculating its likelihood under expected conditions. Composite null hypothesis-testing procedures use a generalized likelihood ratio test that examines many potential models or combinations of data and maximizes the likelihood ratio function over the multiple potential combinations of data analyzed.

With our self-controlled design, we scanned cases within the ICD-10-CM hierarchical structure, looking for any clustering of cases within this tree-like structure in the risk window compared with the comparison window. Under the composite null hypothesis, there would be no unusual clustering of events within the risk window in any part of the tree. Under the alternative hypothesis, there would be at least 1 leaf or branch of the tree with an unusual cluster of events. We used Monte Carlo simulation to generate 9,999 replications of a null data set (i.e., a data set without any association between the exposure and the outcome other than due to chance alone) using a data-permutation strategy, holding constant the number of cases at each node of the tree (in risk and comparison windows combined) in the original data set. The test statistic was calculated for each replication data set plus the real data set and was the maximum of the individual log-likelihood ratios calculated over the 88,156 clinical outcome groups. This technique allows us to rank each potential cluster against the test statistic distribution. Given the 9,999 replications, the lowest possible P value is 0.0001.
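The node-level statistic is not written out in the text. The sketch below assumes the standard one-sided binomial log-likelihood ratio, which reproduces the log-likelihood ratios in Table 3 for the primary design (p = 0.5), together with the usual rank-based Monte Carlo P value. Function names are hypothetical, and the permutation step that generates the null maxima is not shown.

```python
import math


def _xlogy(x: float, y: float) -> float:
    """x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def binomial_llr(c: int, n: int, p: float) -> float:
    """One-sided log-likelihood ratio for c of n cases falling in the risk window."""
    if n == 0 or c / n <= p:
        return 0.0  # no excess in the risk window
    return _xlogy(c, (c / n) / p) + _xlogy(n - c, ((n - c) / n) / (1 - p))


def monte_carlo_p(observed_max: float, null_maxima: list[float]) -> float:
    """Rank of the observed maximum among the null maxima plus itself."""
    rank = 1 + sum(m >= observed_max for m in null_maxima)
    return rank / (len(null_maxima) + 1)


print(round(binomial_llr(16, 16, 0.5), 2))  # T50.Z95 at look 1
print(round(binomial_llr(27, 28, 0.5), 2))  # T88.1 at look 5
```

With 9,999 null replications, an observed maximum that exceeds every simulated maximum has rank 1 of 10,000, which gives the minimum attainable P value of 0.0001 noted above.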

In a nonsequential application of binomial tree-based scan statistics, this distribution of maximum log-likelihood ratios is realized once, and a prespecified P value is selected as the threshold for statistical significance. In the novel sequential version that we demonstrate here, we preselect a number of interim hypothesis tests, an alpha-spending plan, and a maximum value of alpha spent. Here, we used a 5-look plan and distributed the prespecified total alpha level of 0.05 across those 5 looks according to an alpha-spending function that approximates a Wald boundary (Table 2, penultimate column). A Wald boundary uses a larger amount of alpha in early looks to prioritize the ability to signal early if any cluster is detected. At each interim hypothesis test, the boundary for signaling is prespecified by the alpha-spending plan shown in Table 2. At each subsequent look, newly accrued data are added to each node, the data-permutation procedure is repeated, and new maximum log-likelihood ratio test statistics are calculated. Nodes where clusters emerged earlier in the surveillance are removed from subsequent calculations of the maximum log-likelihood ratio. That is, we no longer include an identified cluster in follow-on hypothesis testing. This sequential adaptation of data-mining has the advantage of finding potential clusters earlier in time, allowing for earlier follow-up while surveillance continues on remaining outcomes.

Table 2

Alpha-Spending and Sample Sizes in the 5 Sequential Tree-Based Data-Mining Looks for Adverse Events After Vaccination With Recombinant Herpes Zoster Vaccine Among Adults ≥50 Years of Age in IBM MarketScan Research Databases, United States, January 1, 2018, Through May 5, 2020 (Range of Eligible Vaccination Dates)

| Look | No. of Persons | No. of Doses | Cumulative Doses | Proportion of Doses Accumulated | Start of Dose Accrual | End of Dose Accrual | Alpha Spent^a | Cumulative Alpha Spent |
| 1 | 176,133 | 245,020 | 245,020 | 0.25 | January 1, 2018 | November 5, 2018 | 0.0247 | 0.0247 |
| 2 | 68,355 | 70,228 | 315,248 | 0.32 | November 6, 2018 | February 3, 2019 | 0.0034 | 0.0281 |
| 3 | 185,829 | 228,986 | 544,234 | 0.54 | February 4, 2019 | August 5, 2019 | 0.0088 | 0.0369 |
| 4 | 283,549 | 364,448 | 908,682 | 0.91 | August 6, 2019 | February 4, 2020 | 0.0108 | 0.0477 |
| 5 | 89,341 | 91,194 | 999,876 | 1.00 | February 5, 2020 | May 5, 2020 | 0.0023 | 0.0500 |

a Alpha spending based on approximation of Wald style boundary with 1 million doses as final stopping point.
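The bookkeeping behind Table 2 can be checked with a few lines. The sketch only accumulates the per-look doses and alpha increments listed in the table; it does not derive the increments from a spending function, and the variable names are illustrative.

```python
from itertools import accumulate

# Per-look values copied from Table 2.
doses_added = [245_020, 70_228, 228_986, 364_448, 91_194]
alpha_spent = [0.0247, 0.0034, 0.0088, 0.0108, 0.0023]

cumulative_doses = list(accumulate(doses_added))
cumulative_alpha = [round(a, 4) for a in accumulate(alpha_spent)]
proportion_accrued = [round(d / cumulative_doses[-1], 2) for d in cumulative_doses]

for look, (d, frac, a) in enumerate(
    zip(cumulative_doses, proportion_accrued, cumulative_alpha), start=1
):
    print(look, d, frac, a)
```

Roughly half of the total alpha of 0.05 is spent at the first look, when only about one-quarter of the doses have accrued, which reflects the early-signaling priority of the Wald-style boundary.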

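The formula for attributable risk is as follows:

$$\text{attributable risk} = \frac{(c - n \times p)/(1 - p)}{\text{number of doses}},$$

where c = the number of cases in the risk window, n = the total number of cases in the outcome category, and p = the length of the risk window divided by the length of the follow-up period. Table 3 reports this quantity per 100,000 doses.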
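As a worked check, the sketch below uses one reading of the attributable risk that reproduces every row of Table 3: the excess cases c − n × p are rescaled by 1/(1 − p) and divided by the cumulative number of doses. This reading is inferred from the tabulated values rather than quoted from the source, and the function names are hypothetical.

```python
def excess_cases(c: int, n: int, p: float) -> float:
    """Observed minus expected cases in the risk window (Table 3, 'Excess Cases')."""
    return c - n * p


def attributable_risk_per_100k(c: int, n: int, p: float, doses: int) -> float:
    """Assumed form: excess cases rescaled by 1/(1 - p), per 100,000 doses."""
    return excess_cases(c, n, p) / (1 - p) / doses * 100_000


# T50.Z95 at look 1: 16 of 16 cases in days 1-28, 245,020 doses.
print(round(attributable_risk_per_100k(16, 16, 0.5, 245_020), 1))
# T88.1 at look 5: 27 of 28 cases in days 1-28, 999,876 doses.
print(round(attributable_risk_per_100k(27, 28, 0.5, 999_876), 1))
```

With equal-length risk and comparison windows (p = 0.5), this reduces to the difference between the risk-window and comparison-window case counts divided by the number of doses.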

Software used

We used Sentinel Query Request Package, version 10.3.2 (26), created at the Harvard Pilgrim Health Care Institute, and TreeScan, version 1.5 (27).

Institutional review board approval

The study was approved by the Harvard Pilgrim Health Care Institutional Review Board.

RESULTS

The series of analyses included 999,876 total doses of RZV. These were distributed as shown in Table 2, with 245,020 doses in look 1; 70,228 added in look 2; 228,986 added in look 3; 364,448 added in look 4; and 91,194 added in look 5.

In the primary analysis, which used the risk window of days 1–28, 4 statistically significant clusters appeared (Table 3). In look 1, there was a signal for T50.Z95 (adverse effect of other vaccines and biological substances). By the end of sequential analysis, there were 37 cases of this diagnosis in the risk window days 1–28 compared with 0 in the days 29–56 comparison window and an attributable risk of 3.7 per 100,000 doses of vaccine. In look 2, a signal emerged for T88.1 (other complications following immunization, not elsewhere classified). By the end of analysis, there were 27 cases in the risk window and 1 in the comparison window and an attributable risk of 2.6 per 100,000 doses. No new signals appeared at look 3. Then at look 4, there was a signal for T50.B (poisoning by, adverse effect of, and underdosing of viral vaccines). By the end of analysis, there were 24 cases of T50.B in the risk window and 0 in the comparison window and an attributable risk of 2.4 per 100,000 doses. T50.B95 (adverse effect of other viral vaccines), nested within T50.B, also signaled at look 4 and was clearly driving the T50.B signal, contributing 23 of T50.B’s 24 cases in the risk window, with attributable risk of 2.3 per 100,000 doses, by the end of analysis (Figure 1).

Table 3

Details of Statistically Significant Signals Detected During Simulated Sequential Tree-Based Data-Mining for Adverse Events After Vaccination With Recombinant Herpes Zoster Vaccine Among Adults ≥50 Years of Age in IBM MarketScan Research Databases, United States, January 1, 2018, Through May 5, 2020 (Range of Eligible Vaccination Dates)^a

| Look | Diagnosis Code | Diagnosis Description | Cases in Days 1–28 | Cases in Days 29–56 | Expected Cases in Days 1–28 | Excess Cases in Days 1–28 | Attributable Risk per 100,000 Doses | Log Likelihood Ratio | Signaled? |
| 1 | T50.Z95 | Adverse effect of other vaccines and biological substances | 16 | 0 | 8.0 | 8.0 | 6.5 | 11.09 | Yes |
| 1 | T88.1 | Other complications following immunization, not elsewhere classified | 11 | 0 | 5.5 | 5.5 | 4.5 | 7.62 | No |
| 1 | T50.B | Poisoning by, adverse effect of, and underdosing of viral vaccines | 5 | 0 | 2.5 | 2.5 | 2.0 | 3.47 | No |
| 1 | T50.B95 | Adverse effect of other viral vaccines | 5 | 0 | 2.5 | 2.5 | 2.0 | 3.47 | No |
| 2 | T50.Z95 | Adverse effect of other vaccines and biological substances | 18 | 0 | 9.0 | 9.0 | 5.7 | 12.48 | (In look 1) |
| 2 | T88.1 | Other complications following immunization, not elsewhere classified | 15 | 0 | 7.5 | 7.5 | 4.8 | 10.40 | Yes |
| 2 | T50.B | Poisoning by, adverse effect of, and underdosing of viral vaccines | 5 | 0 | 2.5 | 2.5 | 1.6 | 3.47 | No |
| 2 | T50.B95 | Adverse effect of other viral vaccines | 5 | 0 | 2.5 | 2.5 | 1.6 | 3.47 | No |
| 3 | T50.Z95 | Adverse effect of other vaccines and biological substances | 26 | 0 | 13.0 | 13.0 | 4.8 | 18.02 | (In look 1) |
| 3 | T88.1 | Other complications following immunization, not elsewhere classified | 18 | 0 | 9.0 | 9.0 | 3.3 | 12.48 | (In look 2) |
| 3 | T50.B | Poisoning by, adverse effect of, and underdosing of viral vaccines | 11 | 0 | 5.5 | 5.5 | 2.0 | 7.62 | No |
| 3 | T50.B95 | Adverse effect of other viral vaccines | 11 | 0 | 5.5 | 5.5 | 2.0 | 7.62 | No |
| 4 | T50.Z95 | Adverse effect of other vaccines and biological substances | 37 | 0 | 18.5 | 18.5 | 4.1 | 25.65 | (In look 1) |
| 4 | T88.1 | Other complications following immunization, not elsewhere classified | 27 | 1 | 14.0 | 13.0 | 2.9 | 15.09 | (In look 2) |
| 4 | T50.B | Poisoning by, adverse effect of, and underdosing of viral vaccines | 23 | 0 | 11.5 | 11.5 | 2.5 | 15.94 | Yes |
| 4 | T50.B95 | Adverse effect of other viral vaccines | 22 | 0 | 11.0 | 11.0 | 2.4 | 15.25 | Yes |
| 5 | T50.Z95 | Adverse effect of other vaccines and biological substances | 37 | 0 | 18.5 | 18.5 | 3.7 | 25.65 | (In look 1) |
| 5 | T88.1 | Other complications following immunization, not elsewhere classified | 27 | 1 | 14.0 | 13.0 | 2.6 | 15.09 | (In look 2) |
| 5 | T50.B | Poisoning by, adverse effect of, and underdosing of viral vaccines | 24 | 0 | 12.0 | 12.0 | 2.4 | 16.64 | (In look 4) |
| 5 | T50.B95 | Adverse effect of other viral vaccines | 23 | 0 | 11.5 | 11.5 | 2.3 | 15.94 | (In look 4) |

a A binomial model with risk window of days 1–28 and comparison window of days 29–56 after vaccination, 5 sequential analyses, and an alpha-spending function that approximates a Wald boundary were used. Diagnosis codes come from International Classification of Diseases, Tenth Revision, Clinical Modification, and are shown in the order in which the signal arose and then in descending order of the test statistic.

Figure 1

One signal detected among adults ≥50 years of age in IBM MarketScan Research Databases, United States, January 1, 2018, through May 5, 2020 (range of eligible vaccination dates). Cumulative number of cases of International Classification of Diseases, Tenth Revision, Clinical Modification, code T50.B95 (adverse effect of other viral vaccines) in days 1–28 after vaccination, log-likelihood ratio, attributable risk, and cumulative number of recombinant herpes zoster vaccine doses over the 5 sequential analyses.

No signals were found in the secondary analyses, which used a risk window of days 8–21.

DISCUSSION

In this study of nearly 1 million RZV vaccinations, we found statistically significant signals only for unspecified adverse effects or complications following immunization, all in the sequential analyses using the primary, days 1–28 risk window. All of these signals may represent true vaccine-associated adverse events. Ninety percent of the unique cases in the 4 signals occurred in the 7 days after vaccination, according to the postvaccination time-to-event information in the data set. Previous case-by-case investigations of similarly nonspecific postvaccination signals detected in the first few days after vaccination using the tree-temporal variant of this method found that the large majority of the cases had conditions such as injection site reactions, fever, fatigue, and headache (13).

Attributable risks as low as 2 excess cases per 100,000 vaccinations were seen, indicating good statistical power to detect possible adverse events. In general, the power of this method depends on the number of exposed persons—in this case, vaccinees—and the background rate of the outcome in the affected group (e.g., elderly, women, etc.), as well as the specific features and parameter settings selected for the data extraction and analysis, including the length of the follow-up period, the size and nature of the tree, and the number of risk intervals evaluated.

In multiple cohort and self-controlled case-series analyses conducted in a population of Medicare beneficiaries aged 65 years or older, Goud et al. (28) found an increased risk of Guillain-Barré syndrome (GBS) after RZV vaccination. In their self-controlled case-series analysis using medical record-confirmed cases, symptom onset of all 7 of the chart-confirmed cases in the days 1–42 risk window occurred during the second and third weeks after vaccination. Our secondary analyses used a days 8–21 risk window, but no signals for GBS or any other health outcome emerged. By the end of the analysis period, there were 8 cases with the GBS ICD-10-CM code in our data, 4 of which occurred during the days 8–21 secondary risk window and 4 outside of that window. Under the null hypothesis of no increased risk, one would have expected 2 cases in the risk window and 6 in the comparison window, given the ratio of risk window length to comparison window length. The 4/4 split was not different enough from 2/6 to produce a signal. Insufficient sample size—particularly of older people, who are at greater risk of GBS—may have been the reason we did not see a signal for GBS. There were more than 1.3 million eligible RZV vaccinations among the ≥65-year-olds in the Medicare-based study, and Goud et al. (28) reported an attributable risk of 3 per million doses of RZV (95% confidence interval: 0.62–5.64), whereas in our study of commercially insured people, only about one-quarter of our approximately 1 million doses were received by people ≥65 years of age, too few to detect such a low attributable risk.
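The arithmetic behind the GBS comparison can be reproduced with a short sketch. It assumes the standard binomial log-likelihood ratio (not written out in the text) and uses only the counts and window lengths given above; variable names are illustrative.

```python
import math

risk_days = 14        # days 8-21
comparison_days = 42  # days 1-7 plus days 22-56
p = risk_days / (risk_days + comparison_days)  # null probability of a risk-window case

n_cases = 8  # GBS cases during follow-up
c_risk = 4   # cases in days 8-21

expected_risk = n_cases * p
expected_comparison = n_cases * (1 - p)

# Binomial log-likelihood ratio for the observed 4/4 split against the null 2/6 split.
llr = c_risk * math.log((c_risk / n_cases) / p) + (n_cases - c_risk) * math.log(
    ((n_cases - c_risk) / n_cases) / (1 - p)
)

print(expected_risk, expected_comparison, round(llr, 2))
```

Under these assumptions the log-likelihood ratio for GBS is a little above 1, far below the values of roughly 10 or more at which nodes signaled in the primary analysis (Table 3), which is consistent with the absence of a GBS signal.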

Addressing the question of power more generally, in simulation studies of the nonsequential binomial tree-based scan statistic, we showed that for common outcomes (with background rates of approximately 3 per 10,000 vaccinees), 1 million vaccinees, as we had in the current study, allowed detection of 100 excess cases per million doses with 90% power; for rarer outcomes (with background rates of 8 per million vaccinees), 1 million vaccinees allowed detection of 20 excess cases per million doses with 67% power (29). In the current sequential study of RZV, power would be somewhat less because of the multiple looks at the data.
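
One way to see why the rarer outcome has lower power despite a larger relative excess is to compare the evidence that would accumulate at a single node under each scenario. The Python sketch below is a rough illustration, not a reproduction of the simulations in reference 29: it assumes that the quoted background rates apply to the whole follow-up period, that the risk and comparison windows are of equal length (as in the days 1–28 versus days 29–56 design), and that all excess cases fall in the risk window. It evaluates a Bernoulli log-likelihood ratio at the expected counts and ignores the multiplicity adjustment across the tree; the function names are hypothetical.

```python
from math import log


def risk_window_share(background_cases, excess_cases, p0=0.5):
    """Expected share of all cases that fall in the risk window."""
    total = background_cases + excess_cases
    return (background_cases * p0 + excess_cases) / total


def llr_at_expected_counts(background_cases, excess_cases, p0=0.5):
    """Bernoulli log-likelihood ratio evaluated at the expected counts."""
    total = background_cases + excess_cases
    p1 = risk_window_share(background_cases, excess_cases, p0)
    value = p1 * log(p1 / p0)
    if p1 < 1:  # skip the second term when every case is in the risk window
        value += (1 - p1) * log((1 - p1) / (1 - p0))
    return total * value


# Expected counts per 1 million vaccinees (assumed interpretation of the text):
# common outcome: 3 per 10,000 background = 300 cases, plus 100 excess cases
# rare outcome:   8 per million background = 8 cases, plus 20 excess cases
scenarios = {"common": (300, 100), "rare": (8, 20)}

for name, (background, excess) in scenarios.items():
    share = risk_window_share(background, excess)
    llr = llr_at_expected_counts(background, excess)
    print(f"{name}: total cases={background + excess}, "
          f"risk-window share={share:.3f}, LLR={llr:.2f}")
```

Under these assumptions the common-outcome scenario places 62.5% of 400 expected cases in the risk window, whereas the rare-outcome scenario places about 86% of only 28 expected cases there, and the log-likelihood ratio at the expected counts is larger for the common outcome.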

Strengths of our study include: 1) its untargeted nature, with no prespecification of health outcomes of interest; 2) its self-controlled design, which eliminates confounding by fixed patient characteristics such as chronic disease status; and 3) its formal control for multiple analyses of the accumulating data over the course of time.

A notable limitation of the binomial variant of the tree-based scan statistic is that the risk interval must be prespecified. The tree-temporal variant, in contrast, scans for temporal clustering within the follow-up period in addition to clustering within the ICD-10-CM code tree. The signals found in the current study were also found using the tree-temporal variant applied to the same data source and a similar amount of RZV data; however, that earlier study (16) found signals not only for unspecified adverse effects, complications, or reactions to immunization or other medical substances/care but also for fever, unspecified allergy, syncope/collapse, cellulitis, myalgia, and dizziness/giddiness, all within a few days of vaccination and all of which accord well with the known safety profile of this and other injected vaccines.

However, no sequential version of the tree-temporal variant of the method (or of any variant other than the binomial) has yet been developed. The ability to conduct sequential analysis with the binomial variant, with formal adjustment for the multiple testing over time, is an advantage: it allows monitoring to begin soon after approval of a vaccine or other medical product and permits early detection of signals of potential adverse reactions that occur in the prespecified risk window.

In conclusion, we have applied a new sequential tree-based data-mining method to the study of vaccine safety, with reasonably good statistical power, findings of plausible adverse effects, and no false signals. Although the method requires the risk window of interest to be prespecified and may miss true signals detectable using the tree-temporal variant of the method, it allows for early detection of potential safety problems by means of early initiation of monitoring and repeated adjusted analysis as exposures accumulate.

ACKNOWLEDGMENTS

Author affiliations: Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, Massachusetts (W. Katherine Yih, Inna Dashevsky, Judith C. Maro); and Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Harvard Medical School and Brigham and Women’s Hospital, Boston, Massachusetts (Martin Kulldorff).

This work was supported by a Harvard Pilgrim Health Care Institute Robert H. Ebert Career Development Award.

Aggregated data are available upon request.

The Harvard Pilgrim Health Care Institute had no role in the design or conduct of the study.

W.K.Y. has received some research funding from GlaxoSmithKline in the past. The other authors report no conflicts.

REFERENCES

1. Davis RL, Kolczak M, Lewis E, et al. Active surveillance of vaccine safety: a system to detect early signs of adverse events. Epidemiology. 2005;16(3):336–341.

2. Kulldorff M, Davis RL, Kolczak M, et al. A maximized sequential probability ratio test for drug and vaccine safety surveillance. Seq Anal. 2011;30(1):58–78.

3. Li L, Kulldorff M. A conditional maximized sequential probability ratio test for pharmacovigilance. Stat Med. 2010;29(2):284–295.

4. Lieu TA, Kulldorff M, Davis RL, et al. Real-time vaccine safety surveillance for the early detection of adverse events. Med Care. 2007;45(Suppl 2):S89–S95.

5. Lee GM, Greene SK, Weintraub ES, et al. H1N1 and seasonal influenza vaccine safety in the Vaccine Safety Datalink project. Am J Prev Med. 2011;41(2):121–128.

6. Daley MF, Yih WK, Glanz JM, et al. Safety of diphtheria, tetanus, acellular pertussis and inactivated poliovirus (DTaP-IPV) vaccine. Vaccine. 2014;32(25):3019–3024.

7. Li R, Stewart B, McNeil MM, et al. Post licensure surveillance of influenza vaccines in the Vaccine Safety Datalink in the 2013–2014 and 2014–2015 seasons. Pharmacoepidemiol Drug Saf. 2016;25(8):928–934.

8. Klein NP, Lewis N, Goddard K, et al. Surveillance for adverse events after COVID-19 mRNA vaccination. JAMA. 2021;326(14):1390–1399.

9. Kulldorff M, Fang Z, Walsh SJ. A tree-based scan statistic for database disease surveillance. Biometrics. 2003;59(2):323–331.

10. Kulldorff M, Dashevsky I, Avery TR, et al. Drug safety data mining with a tree-based scan statistic. Pharmacoepidemiol Drug Saf. 2013;22(5):517–523.

11. Brown JS, Petronis KR, Bate A, et al. Drug adverse event detection in health plan data using the Gamma Poisson Shrinker and comparison to the tree-based scan statistic. Pharmaceutics. 2013;5(1):179–200.

12. Li R, Weintraub E, McNeil MM, et al. Meningococcal conjugate vaccine safety surveillance in the Vaccine Safety Datalink using a tree-temporal scan data mining method. Pharmacoepidemiol Drug Saf. 2018;27(4):391–397.

13. Yih WK, Maro JC, Nguyen M, et al. Assessment of quadrivalent human papillomavirus vaccine safety using the self-controlled tree-temporal scan statistic signal-detection method in the Sentinel System. Am J Epidemiol. 2018;187(6):1269–1276.

14. Yih WK, Kulldorff M, Dashevsky I, et al. Using the self-controlled tree-temporal scan statistic to assess the safety of live attenuated herpes zoster vaccine. Am J Epidemiol. 2019;188(7):1383–1388.

15. Yih WK, Kulldorff M, Dashevsky I, et al. A broad safety assessment of the 9-valent human papillomavirus vaccine. Am J Epidemiol. 2021;190(7):1253–1259.

16. Yih WK, Kulldorff M, Dashevsky I, et al. A broad safety assessment of the recombinant herpes zoster vaccine. Am J Epidemiol. 2022;191(5):957–964.

17. Centers for Disease Control and Prevention. About shingles. https://www.cdc.gov/shingles/about/index.html. Accessed February 18, 2022.

18. Shingrix (package insert): https://www.fda.gov/media/108597/download. Accessed February 18, 2022.

19. Lal H, Cunningham AL, Godeaux O, et al. Efficacy of an adjuvanted herpes zoster subunit vaccine in older adults. N Engl J Med. 2015;372(22):2087–2096.

20. Izurieta HS, Wu X, Forshee R, et al. Recombinant zoster vaccine (Shingrix) real-world effectiveness in the first two years post-licensure. Clin Infect Dis. 2021;73(6):941–948.

21. Naus JL. Clustering of random points in two dimensions. Biometrika. 1965;52(1–2):263–266.

22. Glaz J, Koutras MV. Handbook of Scan Statistics. New York, NY: Springer New York; 2020.

23. Abolhassani A, Prates MO. An up-to-date review of scan statistics. Statist Surv. 2021;15:111–153.

24. Fralick M, Kulldorff M, Redelmeier D, et al. A novel data mining application to detect safety signals for newly approved medications in routine care of patients with diabetes. Endocrinol Diabetes Metab. 2021;4(3):e00237.

25. Wang SV, Maro JC, Gagne JJ, et al. A general propensity score for signal identification using tree-based scan statistics. Am J Epidemiol. 2021;190(7):1424–1433.

26. Sentinel System. Sentinel Routine Querying System Reporting Tool. https://sentinelinitiative.org/methods-data-tools/routine-querying-tools/sentinel-routine-querying-system-reporting-tool. Accessed February 23, 2022.

27. Kulldorff M, Information Management Services, Inc. TreeScan: Software for the Tree-Based Scan Statistic, version 1.5. Calverton, MD: Information Management Services, Inc.; 2021.

28. Goud R, Lufkin B, Duffy J, et al. Risk of Guillain-Barré syndrome following recombinant zoster vaccine in Medicare beneficiaries. JAMA Intern Med. 2021;181(12):1623–1630.

29. Maro JC, Dashevsky I, Kulldorff M. Postlicensure medical product safety data-mining: power calculations for Bernoulli data. Sentinel Methods Report, December 22, 2017. https://www.sentinelinitiative.org/sites/default/files/vaccines-blood-biologics/assessments/TreeScanPower_FinalReport.pdf. Accessed May 11, 2022.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)