Kimberly E Hanson, Angela M Caliendo, Cesar A Arias, Janet A Englund, Mary K Hayden, Mark J Lee, Mark Loeb, Robin Patel, Osama Altayar, Abdallah El Alayli, Shahnaz Sultan, Yngve Falck-Ytter, Valéry Lavergne, Rebecca L Morgan, M Hassan Murad, Adarsh Bhimraj, Reem A Mustafa, Infectious Diseases Society of America Guidelines on the Diagnosis of Coronavirus Disease 2019 (COVID-19): Serologic Testing, Clinical Infectious Diseases, ciaa1343, https://doi.org/10.1093/cid/ciaa1343
Abstract
The availability of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) serologic testing has rapidly increased. Current assays use a variety of technologies, measure different classes of immunoglobulin or immunoglobulin combinations, and detect antibodies directed against different portions of the virus. The overall accuracy of these tests, however, has not been well defined. The Infectious Diseases Society of America (IDSA) convened an expert panel to perform a systematic review of the coronavirus disease 2019 (COVID-19) serology literature and construct best-practice guidance related to SARS-CoV-2 serologic testing. This guideline is the fourth in a series of rapid, frequently updated COVID-19 guidelines developed by IDSA.
IDSA’s goal was to develop evidence-based recommendations that assist clinicians, clinical laboratories, patients, and policymakers in decisions related to the optimal use of SARS-CoV-2 serologic tests in a variety of settings. We also highlight important unmet research needs pertaining to the use of anti–SARS-CoV-2 antibody tests for diagnosis, public health surveillance, vaccine development, and the selection of convalescent plasma donors.
A multidisciplinary panel of infectious diseases clinicians, clinical microbiologists, and experts in systematic literature review identified and prioritized clinical questions related to the use of SARS-CoV-2 serologic tests. Grading of Recommendations Assessment, Development, and Evaluation (GRADE) methodology was used to assess the certainty of evidence and make testing recommendations.
The panel agreed on 8 diagnostic recommendations.
Information on the clinical performance and utility of SARS-CoV-2 serologic tests is rapidly emerging. Based on available evidence, detection of anti–SARS-CoV-2 antibodies may be useful for confirming the presence of current or past infection in selected situations. The panel identified 3 potential indications for serologic testing, including (1) evaluation of patients with a high clinical suspicion for COVID-19 when molecular diagnostic testing is negative and ≥2 weeks have passed since symptom onset, (2) assessment of multisystem inflammatory syndrome in children, and (3) conducting serosurveillance studies. The certainty of available evidence supporting the use of serology for either diagnosis or epidemiology was, however, graded as very low to moderate. For the most updated version of these guidelines, please go to https://www.idsociety.org/covid19guidelines.
INFECTIOUS DISEASES SOCIETY OF AMERICA LEGAL DISCLAIMER
It is important to realize that guidelines cannot account for individual variation among patients. They are not intended to supplant physician judgment with respect to particular patients or special clinical situations. The Infectious Diseases Society of America (IDSA) considers adherence to these guidelines to be voluntary, with the ultimate determination regarding their application to be made by the physician in light of a patient’s individual circumstances. While IDSA makes every effort to present accurate and reliable information, the information provided in these guidelines is “as is” without any warranty of accuracy, reliability, or otherwise, either express or implied. Neither IDSA nor its officers, directors, members, employees, or agents will be liable for any loss, damage, or claim with respect to any liabilities, including direct, special, indirect, or consequential damages, incurred in connection with implementation of these guidelines or reliance on the information presented.
The guidelines represent the proprietary and copyrighted property of IDSA. Copyright 2020 Infectious Diseases Society of America. All rights reserved. No part of these guidelines may be reproduced, distributed, or transmitted in any form or by any means, including photocopying, recording, or other electronic or mechanical methods, without the prior written permission of IDSA. Permission is granted to physicians and healthcare providers solely to copy and use the guidelines in their professional practices and clinical decision making. No license or permission is granted to any person or entity, and prior written authorization by IDSA is required, to sell, distribute, or modify the guidelines, or to make derivative works of or incorporate the guidelines into any product, including but not limited to clinical decision-support software or any other software product. Except for the permission granted above, any person or entity desiring to use the guidelines in any way must contact IDSA for approval in accordance with the terms and conditions of third-party use, in particular any use of the guidelines in any software product.
EXECUTIVE SUMMARY
Serologic tests for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) are now widely available. Unlike nucleic acid amplification tests (NAATs), which detect viral RNA, antibody-based assays measure the host’s humoral immune response to current or past infection. Anti–SARS-CoV-2 antibodies typically become detectable more than 2 weeks after the onset of symptoms (Figure 1). As a result, SARS-CoV-2 serology lacks sufficient sensitivity to confidently exclude the diagnosis of coronavirus disease 2019 (COVID-19) when antibodies are not detected in the acute phase of illness. Nucleic acid amplification tests remain the diagnostic modality of choice for acute infection. Antibody testing, however, may be useful as an adjunct to NAAT at later time points following infection. In general, immunoglobulin (Ig) M (IgM) tests tend to have lower sensitivity to detect past infection than IgG or total antibody tests. Assays designed to detect and differentiate IgM and IgG in combination, where the detection of either IgM or IgG is used to define a positive test result, and IgA tests tend to have lower specificity to detect past infection compared with IgG only or total antibody tests. Test specificity is especially important for large serosurveillance studies when the prevalence of prior infection in the community is expected to be low. To be of value, anti–SARS-CoV-2 antibody tests are required to have high clinical sensitivity and specificity (ie, ≥99.5%).
Figure 1. Antibody sensitivity over time. This figure summarizes the pooled sensitivity with 95% confidence intervals of antibody classes per week post–symptom onset. The estimates were derived from the 24 studies and 7 package inserts informing recommendations 1 through 5. Abbreviations: IgA, immunoglobulin A; IgG, immunoglobulin G; IgM, immunoglobulin M; Total, total antibody.
In addition to use in epidemiologic studies, the panel identified 2 clinical scenarios where antibody testing was felt to have potential utility for diagnosis. Serologic testing may be helpful in the evaluation of individual patients with a high clinical suspicion for COVID-19 when the results of molecular diagnostic testing are repeatedly negative or such testing was not performed. The sensitivity and specificity of IgG and total antibody are optimal 3 to 4 weeks after the onset of symptoms. At the current time, few data exist in the fifth week post–symptom onset to judge serologic test performance at later periods after infection. Detection of anti–SARS-CoV-2 antibodies is also useful for assessments of suspected multisystem inflammatory syndrome in children. For symptomatic patients, optimal serology result interpretation requires careful determination of the timing of testing relative to symptom onset combined with assessments of disease severity. Based on the available evidence at this time, serologic tests should not be used to determine immunity or risk of reinfection. Thus, anti–SARS-CoV-2 antibody detection cannot inform decisions to discontinue physical distancing or lessen the use of personal protective equipment.
Summarized below are specific recommendations and comments related to the use of SARS-CoV-2 serologic testing in clinical practice and public health. A detailed description of background, methods, evidence summary, and rationales that support each recommendation can be found online in the full text.
Recommendation 1: The IDSA panel suggests against using serologic testing to diagnose SARS-CoV-2 infection during the first 2 weeks (14 days) following symptom onset (conditional recommendation, very low certainty of evidence).
Recommendation 2: When SARS-CoV-2 infection requires laboratory confirmation for clinical or epidemiological purposes, the IDSA panel suggests testing for SARS-CoV-2 IgG or total antibody 3 to 4 weeks after symptom onset to detect evidence of past SARS-CoV-2 infection (conditional recommendation, very low certainty of evidence).
Remark—When serology is being considered as an adjunct to NAAT for diagnosis, testing 3 to 4 weeks post–symptom onset maximizes the sensitivity and specificity to detect past infection.
Remark—Serosurveillance studies should use assays with high specificity (ie, ≥99.5%), especially when the prevalence of SARS-CoV-2 in the community is expected to be low.
Recommendation 3: The IDSA panel makes no recommendation either for or against using IgM antibodies to detect evidence of past SARS-CoV-2 infection (conditional recommendation, very low certainty of evidence).
Recommendation 4: The IDSA panel suggests against using IgA antibodies to detect evidence of past SARS-CoV-2 infection (conditional recommendation, very low certainty of evidence).
Recommendation 5: The IDSA panel suggests against using IgM or IgG antibody combination tests to detect evidence of past SARS-CoV-2 infection (conditional recommendation, very low certainty of evidence).
Remark—IgM or IgG combination tests are those where detecting either antibody class is used to define a positive result.
Recommendation 6: The IDSA panel suggests using IgG antibody to provide evidence of COVID-19 infection in symptomatic patients with a high clinical suspicion and repeatedly negative NAAT testing (weak recommendation, very low certainty of evidence).
Remark—When serology is being considered as an adjunct to NAAT for diagnosis, testing 3 to 4 weeks post–symptom onset maximizes the sensitivity and specificity to detect past infection.
Recommendation 7: In pediatric patients with multisystem inflammatory syndrome, the IDSA panel suggests using both IgG antibody and NAAT to provide evidence of current or past COVID-19 infection (strong recommendation, very low certainty of evidence).
Recommendation 8: The IDSA panel makes no recommendation for or against using capillary versus venous blood for serologic testing to detect SARS-CoV-2 antibodies (knowledge gap).
BACKGROUND
Since its emergence in December 2019, the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused over 21 million known infections and nearly 770 000 deaths worldwide [1]. Definitive diagnosis of coronavirus disease 2019 (COVID-19), the illness caused by SARS-CoV-2 infection, relies on the direct detection of virus-specific RNA or virus-specific glycoprotein antigens in respiratory tract specimens. Serologic tests that detect the host antibody response to SARS-CoV-2 may also help to confirm the presence of current or past infection using blood samples.
Coronavirus genomes encode 4 major structural proteins: spike (S), envelope (E), membrane (M), and nucleocapsid (N). Both the S and N proteins of SARS-CoV-2 have been shown to be immunogenic in humans, and current serologic tests target antibodies directed against these antigens [2]. The S protein is the most exposed viral protein and is responsible for viral attachment and entry into the host cell via binding to the angiotensin-converting enzyme 2 (ACE-2) receptor [3]. The S protein is composed of an N-terminal S1 subunit, involved in virus-receptor binding, and a C-terminal S2 subunit that is involved in fusion to the host cell membrane. The S1 subunit is further divided into the N-terminal domain (NTD) and a receptor binding domain (RBD). There has been particular focus on the SARS-CoV-2 RBD for vaccine development and targeted antibody therapies because neutralizing antibodies against this region effectively block viral entry [4, 5]. The N protein is an RNA-binding protein that is abundantly expressed during infection and plays an important role in RNA transcription and replication [6].
There are 2 general types of antibodies, neutralizing antibodies (nAbs) and non-neutralizing antibodies (also known as binding antibodies) [7]. Neutralization is defined as the loss of infectivity that occurs when an nAb binds to a viral particle. Virus-specific or vaccine-induced nAbs can play a crucial role in controlling viral infection, but definitive data are lacking to know whether individuals with detectable anti–SARS-CoV-2 nAbs are protected against reinfection. In comparison, binding antibodies are characterized by their inability to prevent viral infection of permissive cells. Regardless of their function, both types of virus-specific antibodies are potentially useful as diagnostic indicators of current or past infection.
Commercially available anti–SARS-CoV-2 antibody tests use different technologies to qualitatively measure single immunoglobulin (Ig) classes (IgM, IgG, or IgA) or total antibody but do not differentiate nAbs from binding antibodies. IgM antibodies directed against microorganisms are typically produced first after infection and are used as a measure of recent infection. IgG antibodies generally develop later than IgM and remain elevated for months to years after infection. Although IgM antibodies can be detected within the first 2 weeks of symptoms in some patients, SARS-CoV-2 infection appears unusual in that IgM and IgG more commonly increase together, more than 2 weeks after the onset of symptoms [8]. Secretory IgA is important for mucosal immunity. IgA can also be detected systemically in certain types of infection including SARS-CoV-2, but comparatively little is known about the kinetics of IgA in blood. The components of “total antibody” presumably include IgM and IgG and theoretically other antigen-specific immunoglobulins as well.
Given that the majority of the population has previously been exposed to seasonal human coronaviruses (HCoVs), and these viruses may share similar structure with SARS-CoV-2, an essential part of serologic test development and validation is to ensure that the anti–SARS-CoV-2 antibodies detected by a given assay do not cross-react with other coronaviruses (eg, HCoV-229E, HCoV-NL63, HCoV-OC43, or HCoV-HKU1). Specificity studies typically involve analyzing archived sera obtained before the identification of COVID-19 as a clinical entity, as well as assessing for potential interfering substances such as auto-antibodies or heterophile antibodies.
The most common clinical diagnostic platforms utilized for SARS-CoV-2 include lateral flow (LF) devices, enzyme-linked immunosorbent assays (ELISAs), and chemiluminescent immunoassays (CIAs). Lateral flow assays typically require a drop of blood (or serum or plasma) applied to a test strip, with results read in approximately 15–30 minutes. These devices are suitable for point-of-care testing and have potential to be deployed in the field as a part of large serological surveys. ELISA comes in a variety of different formats. Typically, a bound antigen–antibody complex is detected using a type-specific secondary antibody linked to a substrate that generates a colorimetric or fluorescent signal. CIA methods are similar to ELISA but use chemical probes that emit light instead of enzymatic substrates. Both ELISA and CIA are clinical laboratory–based methods amenable to high-throughput testing using serum, plasma, or potentially dried blood spots. At this time, neutralization assays are mainly used in research settings or offered as laboratory-developed tests by reference laboratories.
In the United States, the Food and Drug Administration (FDA) currently requires Emergency Use Authorization (EUA) to market a SARS-CoV-2 antibody test. This means that commercial manufacturers and clinical laboratories with laboratory-developed tests must submit performance data to the FDA for review. Early in the pandemic, however, official EUA review was voluntary. Test developers were only expected to internally validate their tests and notify the FDA of their intent to market. As a result, the market was flooded with poorly performing assays. In response, the FDA subsequently issued a “removed” test list that includes tests where significant performance problems were identified, assays for which official EUA review was not appropriately submitted, or assays voluntarily withdrawn by the developer.
Many different serologic tests for SARS-CoV-2 have become commercially available in a short amount of time. The incredible speed of development has significantly outpaced rigorous assessments of test performance. Therefore, the Infectious Diseases Society of America (IDSA) convened an expert panel to systematically review the available serologic literature, compare pooled estimates of test accuracy, and make evidence-based recommendations for informed use in clinical practice.
METHODS
This guideline was developed using the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach for evidence assessment. In addition, given the need for a rapid response to an urgent public health crisis, the methodological approach was modified according to the GIN/McMaster checklist for the development of rapid recommendations [9]. To assess the positive- and negative-predictive value of serologic testing we considered prevalence of 1% to represent communities with low levels of circulating SARS-CoV-2 infections, 10% [10–12] to represent “hot spots,” and 40% to represent patients meeting the clinical definition for COVID-19 who were hospitalized or in the investigation of outbreaks in congregate settings or factories [13, 14].
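The relationship between pretest probability and predictive values described above can be sketched with Bayes' rule. This is a minimal illustration, not the panel's own computation: the function name `predictive_values` is hypothetical, the sensitivity of 0.90 is an assumed value chosen only for the example, and the specificity of 0.995 is taken from the threshold named in the guideline's remarks.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given pretest probability."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumed assay characteristics (illustrative only, not pooled estimates).
ASSUMED_SENSITIVITY = 0.90
ASSUMED_SPECIFICITY = 0.995

# The three prevalence scenarios considered by the panel.
for prevalence in (0.01, 0.10, 0.40):
    ppv, npv = predictive_values(ASSUMED_SENSITIVITY, ASSUMED_SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.3f}, NPV {npv:.3f}")
```

Under these assumed values, the positive predictive value at 1% prevalence is roughly 0.65 even with 99.5% specificity, which is why specificity matters most in low-prevalence serosurveillance.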
Panel Composition
The panel was composed of clinicians and clinical microbiologists who are members of the IDSA, the American Society for Microbiology (ASM), the Society for Healthcare Epidemiology of America (SHEA), and the Pediatric Infectious Diseases Society (PIDS). They represented the disciplines of infectious diseases, pediatrics, clinical microbiology, hepatology, nephrology, and gastroenterology. The Evidence Foundation provided technical support and guideline methodologists for the development of this guideline.
Disclosure and Management of Potential Conflicts of Interest
The conflict of interest (COI) review group included 2 representatives from IDSA who were responsible for reviewing, evaluating, and approving all disclosures. All members of the expert panel complied with the COI process for reviewing and managing conflicts of interest, which required disclosure of any financial, intellectual, or other interest that might be construed as constituting an actual, potential, or apparent conflict, regardless of relevancy to the guideline topic. The assessment of disclosed relationships for possible COI was based on the relative weight of the financial relationship (ie, monetary amount) and the relevance of the relationship (ie, the degree to which an association might reasonably be interpreted by an independent observer as related to the topic or recommendation of consideration). The COI review group ensured that the majority of the panel and chair was without potential relevant (related to the topic) conflicts. The chair and all members of the technical team were determined to have no COIs relevant to the guidelines.
Question Generation
Clinical questions were developed into a PICO format (Population, Intervention, Comparison, Outcomes) [15] prior to the first panel meeting. Panel members prioritized questions with available evidence that met the minimum acceptable criteria (ie, diagnostic test accuracy reported on at least a case-series design; case reports were excluded) (Supplementary Figure 1). Panel members prioritized patient-important outcomes such as the role of serologic testing in diagnosing acute or recent infections, the role of point-of-care serologic tests and the role of serologic tests in the evaluation of pediatric patients with inflammatory multisystem syndromes.
Search Strategy
The search by the National Institute for Health and Care Excellence and the Centers for Disease Control and Prevention (CDC) was reviewed by the methodologists in consultation with an experienced information specialist and was determined to have high sensitivity. Terms identified in the PICO questions and the term “COVID” were added to the search strategy. We searched Ovid Medline and Embase from 2019 through 19 June 2020. We also performed horizon scans periodically during the evidence assessment and recommendation process to locate additional gray literature and manuscript preprints from LitCovid, medRxiv/bioRxiv, and SSRN. Reference lists and literature suggested by the panelists were reviewed for inclusion as well. We also manually searched the manufacturers’ package inserts of the serologic tests that received EUA by the FDA.
Screening and Study Selection
Two independent reviewers screened the titles and abstracts of the references identified by the search strategy and then reviewed the full texts of the studies under consideration for inclusion. Disagreements were resolved by discussion to reach consensus in consultation with a third reviewer when needed. Studies were reviewed if they reported data on the diagnostic test accuracy of anti–SARS-CoV-2 IgM, IgG, IgA, and/or total antibody tests compared with the nucleic acid amplification test (NAAT) as the reference standard. We only included studies that evaluated 1 of the 3 most commonly used serology platforms (ie, LF, ELISA, and CIA). To be included, cohort, cross-sectional, and case-control studies had to include at least 30 samples, describe the time of testing relative to symptom onset, and provide information on the reference standard for comparison. A sample size of 30 specimens was set to mirror the minimal numbers of specimens required by the FDA for EUA, but assays were included in our review regardless of FDA authorization status. Assays that required both IgM and IgG to be detected to define a positive result were excluded. We also excluded studies that reported sensitivity only or specificity only and studies that did not provide enough information to extract the true positives, true negatives, false positives, and false negatives.
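The extraction requirement above (true positives, true negatives, false positives, and false negatives for each study) is what makes both sensitivity and specificity computable. The sketch below illustrates this under stated assumptions: the `TwoByTwo` class, the `meets_minimum_sample_size` helper, and the example counts are hypothetical, and the 30-sample minimum is assumed here to apply to the total number of samples in a study.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts for an index serologic test compared against NAAT as the reference."""
    tp: int  # antibody positive, NAAT positive
    fp: int  # antibody positive, NAAT negative
    fn: int  # antibody negative, NAAT positive
    tn: int  # antibody negative, NAAT negative

    @property
    def n_samples(self) -> int:
        return self.tp + self.fp + self.fn + self.tn

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)


def meets_minimum_sample_size(table: TwoByTwo, minimum: int = 30) -> bool:
    """Assumed reading of the inclusion rule: at least `minimum` samples overall."""
    return table.n_samples >= minimum


# Hypothetical study counts, for illustration only.
study = TwoByTwo(tp=45, fp=2, fn=15, tn=98)
print(study.sensitivity(), study.specificity(), meets_minimum_sample_size(study))
```

A study reporting only sensitivity supplies `tp` and `fn` but not `fp` and `tn`, so the table cannot be completed; this is consistent with the exclusion of sensitivity-only and specificity-only reports.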
Data Collection and Analysis
The data extraction was completed by 2 independent reviewers in duplicate. Disagreements were resolved by reaching a consensus and consulting with an expert clinician scientist member of the panel. We extracted baseline characteristics (authors, publication year, country, study design, inclusion criteria, age, gender), index test information (timing from onset of symptoms, sample type, target antigen, platform, immunoglobulin class, FDA EUA status, and European Economic Area Conformité Européenne (CE) marking status), reference test (name of test, sample type), and diagnostic test accuracy raw data (true and false positives and negatives).
We used the bivariate random-effects model to pool the sensitivity and specificity using the logit transformation when there were enough studies [16]. When the number of studies did not allow the use of the bivariate model, we pooled the sensitivity and specificity separately using random-effects generalized linear mixed models [17, 18]. To evaluate the between-study heterogeneity, we examined the forest plots for each pooled estimate rather than relying on the I² statistic, which does not take into account the variability resulting from different positivity thresholds. We evaluated the graphs visually for factors that could explain the heterogeneity, including the platform used, FDA EUA or European Conformity (CE) mark statuses, and target antigen. To evaluate the effects of our decision to exclude studies that included fewer than 30 samples, we performed sensitivity analyses by including all the studies that reported both sensitivity and specificity. We also performed sensitivity analyses by excluding studies that included fewer than 100 samples in the specificity group and studies that evaluated tests that were removed from the FDA EUA list. The analyses were performed using the packages mada 0.5.10 and meta 4.11.0 in R 3.6.3 [19–21].
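To give a feel for pooling on the logit scale, the sketch below uses a deliberately simpler technique than the one described above: a univariate DerSimonian–Laird random-effects model applied to one proportion at a time. It is not the bivariate model or the generalized linear mixed models used in the guideline, and it is written in Python rather than R. The function name, the 0.5 continuity correction added to every study, and the example counts are all assumptions made for illustration.

```python
import math


def pool_logit_random_effects(events, totals):
    """Pool proportions on the logit scale (DerSimonian-Laird); return (estimate, lower, upper)."""
    logits, variances = [], []
    for e, n in zip(events, totals):
        a, b = e + 0.5, n - e + 0.5          # assumed continuity correction
        logits.append(math.log(a / b))       # logit of the corrected proportion
        variances.append(1 / a + 1 / b)      # approximate variance of that logit

    # Fixed-effect weights and estimate, needed for the heterogeneity statistic Q.
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, logits)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, logits))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)

    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (len(logits) - 1)) / c) if c > 0 else 0.0

    # Random-effects weights, pooled logit, and its standard error.
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, logits)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))

    def inv_logit(x):
        return 1 / (1 + math.exp(-x))

    return inv_logit(pooled), inv_logit(pooled - 1.96 * se), inv_logit(pooled + 1.96 * se)


# Hypothetical per-study counts: antibody-positive samples among NAAT-positive samples.
estimate, lower, upper = pool_logit_random_effects(events=[10, 35, 22], totals=[40, 40, 40])
print(f"pooled sensitivity {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Working on the logit scale keeps the back-transformed confidence limits inside the 0 to 1 range. A bivariate model additionally accounts for the correlation between sensitivity and specificity across studies, which this univariate sketch ignores.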
Risk of Bias and Certainty of Evidence
We used the revised Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool to assess the risk of bias in the included studies [22]. We used the GRADE framework to assess the overall certainty by evaluating the body of evidence for each outcome on the following domains: risk of bias, imprecision, inconsistency, indirectness, and publication bias [23, 24]. We developed GRADE summary-of-findings tables using the GRADEpro Guideline Development Tool [25].
Evidence to Recommendations
The panel considered the core elements of GRADE evidence in the decision process, including certainty of evidence and balance between desirable and undesirable effects. Additional domains were acknowledged where applicable (eg, feasibility, resource use, acceptability). For all recommendations, the expert panelists reached consensus. Voting rules were agreed on prior to the panel meeting for situations when consensus could not be reached.
As per GRADE methodology, recommendations were labeled as “strong” or “conditional.” The words “we recommend” indicate strong recommendations and “we suggest” indicate conditional recommendations. Figure 2 provides the suggested interpretation of strong and weak recommendations for patients, clinicians, and healthcare policymakers. For recommendations where the comparators are not formally stated, the comparison of interest was implicitly referred to as “not using the test.” Some recommendations acknowledge a current “knowledge gap” and aim at avoiding premature favorable recommendations for test use and to avoid encouraging the rapid diffusion of potentially nonuseful tests.
Figure 2. Approach and implications to rating the quality of evidence and strength of recommendations using the GRADE methodology (unrestricted use of the figure granted by the US GRADE Network). Abbreviation: GRADE, Grading of Recommendations Assessment, Development, and Evaluation.
Revision Process
The draft guideline underwent a rapid review for approval by the IDSA Board of Directors Executive Committee external to the guideline development panel. The guidelines were reviewed and endorsed by ASM, PIDS, and SHEA. The IDSA Board of Directors Executive Committee reviewed and approved the guideline prior to dissemination.
Updating Process
Regular, frequent screening of the literature will take place to determine the need for revisions based on the likelihood that any new data will have an impact on the recommendations. If necessary, the entire expert panel will be reconvened to discuss potential changes.
RESULTS
Systematic review and horizon scan of the literature identified 9468 references, of which 47 informed the evidence base for these recommendations (Supplementary Figure 2 [PRISMA flow diagram]). Characteristics of the included studies can be found in Supplementary Tables 1–5.
Recommendation 1: The IDSA panel suggests against using serologic testing to diagnose SARS-CoV-2 infection during the first 2 weeks (14 days) following symptom onset (conditional recommendation, very low certainty of evidence).
Summary of Evidence
Using our search strategy, we identified 24 studies [26–49] and 7 package inserts [50–56] that assessed the diagnostic test accuracy of serologic tests compared with SARS-CoV-2 reverse transcription–polymerase chain reaction (RT-PCR) (Supplementary Table 2). Studies used different controls including samples collected from healthy individuals, patients who had other respiratory or nonrespiratory infections, patients with autoimmune diseases whose blood had been collected prior to the COVID-19 pandemic, symptomatic patients with negative NAAT for SARS-CoV-2, patients hospitalized for other reasons, and asymptomatic individuals including pregnant women evaluated during the COVID-19 pandemic. All of the package-insert information [50–56] and 22 studies [26–47] were case-control studies. Two additional studies used a cohort design [48, 49]. Included studies reported on 3 different testing methodologies, namely LF assays, ELISA, and CIA. Studies also assessed different antibody classes including IgM alone, IgG alone, IgA alone, or total antibody, depending on the kit used. For the assays that detected and differentiated IgM and IgG in the same platform, results interpretation included assessments of “IgM or IgG” approaches, where the presence of either antibody class qualified as a positive test.
The total number of samples included ranged from 91 to 2708 and 721 to 11 887 for the sensitivity and specificity analyses, respectively. The pooled sensitivity at week 1 after symptom onset ranged from 0.23 to 0.63 and at week 2 ranged from 0.68 to 0.96, while the pooled specificity ranged from 0.96 to 1 (Tables 1–5). The quality of evidence that informed sensitivity determinations ranged from very low to low, while specificity was low to moderate. Quality was rated down for serious risk of bias (case-control study design), imprecision (assuming the upper and lower limits of the confidence interval would lead to different decisions), and unexplained inconsistency.
Antibody Performance, Weeks 1 and 2: Immunoglobulin M
| IgM | Week 1 | | | Week 2 | | |
|---|---|---|---|---|---|---|
| Sensitivity | .33 (95% CI: .25 to .41) | | | .73 (95% CI: .66 to .78) | | |
| Specificity | .98 (95% CI: .97 to .99) | | | | | |
| Outcome (effect per 1000 patients tested) | Pretest Probability of 1%ᵃ | Pretest Probability of 10%ᵇ | Pretest Probability of 40%ᶜ | Pretest Probability of 1%ᵃ | Pretest Probability of 10%ᵇ | Pretest Probability of 40%ᶜ |
| True positives (patients with COVID-19) | 3 (3 to 4) | 33 (25 to 41) | 132 (100 to 164) | 7 (7 to 8) | 73 (66 to 78) | 292 (264 to 312) |
| False negatives (patients incorrectly classified as not having COVID-19) | 7 (6 to 7) | 67 (59 to 75) | 268 (236 to 300) | 3 (2 to 3) | 27 (22 to 34) | 108 (88 to 136) |
| Quality of the evidence | 12 studies, 919 patients ⨁⨁◯◯ Lowᵈ ᵉ | | | 16 studies, 2309 patients ⨁⨁◯◯ Lowᵈ ᵉ | | |
| Outcome (effect per 1000 patients tested) | Pretest Probability of 1%ᵃ | Pretest Probability of 10%ᵇ | Pretest Probability of 40%ᶜ | | | |
| True negatives (patients without COVID-19) | 970 (960 to 980) | 882 (873 to 891) | 588 (582 to 594) | | | |
| False positives (patients incorrectly classified as having COVID-19) | 20 (10 to 30) | 18 (9 to 27) | 12 (6 to 18) | | | |
| Quality of the evidence | 21 studies, 7165 patients ⨁⨁⨁◯ Moderateᵈ | | | | | |
Abbreviations: CI, confidence interval; COVID-19, coronavirus disease 2019; IgM, immunoglobulin M. For confidence ratings (ie, very low, low, moderate, or high), see Table 2.
ᵃ Typically seen in the general population in areas that are not hotspots.
ᵇ Typically seen in the general population in high-risk populations.
ᶜ Typically seen in the general population in SARS-CoV-2 exposed and nursing homes.
ᵈ The case-control design leads to a serious risk of bias.
ᵉ Unexplained inconsistency observed with considerably variable sensitivity. Sensitivity ranges: W1, 0.06–0.62; W2, 0.33–1.00.
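The "effect per 1000 patients tested" rows follow arithmetically from the pooled sensitivity, the pooled specificity, and the assumed pretest probability. The Python sketch below shows that arithmetic; the function name `effects_per_1000` and the rounding to whole patients are illustrative assumptions, not part of the guideline's methods.

```python
def effects_per_1000(sensitivity, specificity, pretest_probability, cohort=1000):
    """Expected outcomes in a hypothetical cohort (illustrative helper)."""
    infected = cohort * pretest_probability      # people who truly have COVID-19
    uninfected = cohort - infected               # people who truly do not
    true_pos = sensitivity * infected            # infected and test positive
    true_neg = specificity * uninfected          # uninfected and test negative
    return {
        "true_positives": round(true_pos),
        "false_negatives": round(infected - true_pos),
        "true_negatives": round(true_neg),
        "false_positives": round(uninfected - true_neg),
    }


# Pooled IgM estimates at week 1 (sensitivity .33, specificity .98), 10% pretest probability
igm_week1_10pct = effects_per_1000(0.33, 0.98, 0.10)
print(igm_week1_10pct)
```

With these inputs the sketch returns 33 true positives, 67 false negatives, 882 true negatives, and 18 false positives, matching the 10% columns of the IgM table.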
Antibody Performance, Weeks 1 and 2: Immunoglobulin G
| IgG | Week 1 | | | Week 2 | | |
|---|---|---|---|---|---|---|
| Sensitivity | .23 (95% CI: .16 to .32) | | | .68 (95% CI: .62 to .73) | | |
| Specificity | .99 (95% CI: .99 to .99) | | | | | |
| Outcome | Effect per 1000 Patients Tested | | | | | |
| | Pretest Probability of 1%ᵃ | Pretest Probability of 10%ᵇ | Pretest Probability of 40%ᶜ | Pretest Probability of 1%ᵃ | Pretest Probability of 10%ᵇ | Pretest Probability of 40%ᶜ |
| True positives (patients with COVID-19) | 2 (2 to 3) | 23 (16 to 32) | 92 (64 to 128) | 7 (6 to 7) | 68 (62 to 73) | 272 (248 to 292) |
| False negatives (patients incorrectly classified as not having COVID-19) | 8 (7 to 8) | 77 (68 to 84) | 308 (272 to 336) | 3 (3 to 4) | 32 (27 to 38) | 128 (108 to 152) |
| Quality of the evidence | 13 studies, 1343 patients ⨁◯◯◯ Very lowᵈ,ᵉ | | | 16 studies, 2708 patients ⨁⨁◯◯ Lowᵈ,ᵉ | | |
| | Pretest Probability of 1%ᵃ | Pretest Probability of 10%ᵇ | Pretest Probability of 40%ᶜ | | | |
| True negatives (patients without COVID-19) | 980 (980 to 980) | 891 (891 to 891) | 594 (594 to 594) | | | |
| False positives (patients incorrectly classified as having COVID-19) | 10 (10 to 10) | 9 (9 to 9) | 6 (6 to 6) | | | |
| Quality of the evidence | 25 studies, 11 887 patients ⨁⨁⨁◯ Moderateᵈ | | | | | |
Abbreviations: CI, confidence interval; COVID-19, coronavirus disease 2019; IgG, immunoglobulin G.
ᵃ Typically seen in the general population in areas that are not hotspots.
ᵇ Typically seen in the general population in high-risk populations.
ᶜ Typically seen in the general population in SARS-CoV-2 exposed and nursing homes.
ᵈ The case-control design leads to a serious risk of bias.
ᵉ Unexplained inconsistency observed with considerably variable sensitivity. Sensitivity ranges: W1, 0.00–0.69; W2, 0.27–0.91.
Antibody Performance, Weeks 1 and 2: Immunoglobulin A
| IgA | Week 1 | | | Week 2 | | |
|---|---|---|---|---|---|---|
| Sensitivity | .63 (95% CI: .52 to .72) | | | .96 (95% CI: .51 to 1.00) | | |
| Specificity | .96 (95% CI: .91 to .99) | | | | | |
| Outcome | Effect per 1000 Patients Tested | | | | | |
| | Pretest Probability of 1%ᵃ | Pretest Probability of 10%ᵇ | Pretest Probability of 40%ᶜ | Pretest Probability of 1%ᵃ | Pretest Probability of 10%ᵇ | Pretest Probability of 40%ᶜ |
| True positives (patients with COVID-19) | 6 (5 to 7) | 63 (52 to 72) | 252 (208 to 288) | 10 (5 to 10) | 96 (51 to 100) | 384 (204 to 400) |
| False negatives (patients incorrectly classified as not having COVID-19) | 4 (3 to 5) | 37 (28 to 48) | 148 (112 to 192) | 0 (0 to 5) | 4 (0 to 49) | 16 (0 to 196) |
| Quality of the evidence | 2 studies, 91 patients ⨁⨁◯◯ Lowᵈ,ᵉ | | | 2 studies, 102 patients ⨁◯◯◯ Very lowᵈ,ᵉ | | |
| | Pretest Probability of 1%ᵃ | Pretest Probability of 10%ᵇ | Pretest Probability of 40%ᶜ | | | |
| True negatives (patients without COVID-19) | 950 (901 to 980) | 864 (819 to 891) | 576 (546 to 594) | | | |
| False positives (patients incorrectly classified as having COVID-19) | 40 (10 to 89) | 36 (9 to 81) | 24 (6 to 54) | | | |
| Quality of the evidence | 4 studies, 760 patients ⨁⨁◯◯ Lowᵈ,ᵉ | | | | | |
Abbreviations: CI, confidence interval; COVID-19, coronavirus disease 2019; IgA, immunoglobulin A.
ᵃ Typically seen in the general population in areas that are not hotspots.
ᵇ Typically seen in the general population in high-risk populations.
ᶜ Typically seen in the general population in SARS-CoV-2 exposed and nursing homes.
ᵈ The case-control design leads to a serious risk of bias.
ᵉ Considering the upper vs lower limits of the sensitivity’s CI would lead to different clinical decisions.
Antibody Performance, Weeks 1 and 2: Total Antibodies
| Total Antibodies | Week 1 | | | Week 2 | | |
|---|---|---|---|---|---|---|
| Sensitivity | .50 (95% CI: .32 to .69) | | | .94 (95% CI: .84 to .98) | | |
| Specificity | 1.00 (95% CI: .99 to 1.00) | | | | | |
| Outcome | Effect per 1000 Patients Tested | | | | | |
| | Pretest Probability of 1%ᵃ | Pretest Probability of 10%ᵇ | Pretest Probability of 40%ᶜ | Pretest Probability of 1%ᵃ | Pretest Probability of 10%ᵇ | Pretest Probability of 40%ᶜ |
| True positives (patients with COVID-19) | 5 (3 to 7) | 50 (32 to 69) | 200 (128 to 276) | 9 (8 to 10) | 94 (84 to 98) | 376 (336 to 392) |
| False negatives (patients incorrectly classified as not having COVID-19) | 5 (3 to 7) | 50 (31 to 68) | 200 (124 to 272) | 1 (0 to 2) | 6 (2 to 16) | 24 (8 to 64) |
| Quality of the evidence | 7 studies, 418 patients ⨁◯◯◯ Very lowᵈ,ᵉ,ᶠ | | | 6 studies, 359 patients ⨁⨁◯◯ Lowᵈ,ᵉ | | |
| | Pretest Probability of 1%ᵃ | Pretest Probability of 10%ᵇ | Pretest Probability of 40%ᶜ | | | |
| True negatives (patients without COVID-19) | 990 (980 to 990) | 900 (891 to 900) | 600 (594 to 600) | | | |
| False positives (patients incorrectly classified as having COVID-19) | 0 (0 to 10) | 0 (0 to 9) | 0 (0 to 6) | | | |
| Quality of the evidence | 8 studies, 4521 patients ⨁⨁⨁◯ Moderateᵈ | | | | | |
Abbreviations: CI, confidence interval; COVID-19, coronavirus disease 2019.
ᵃ Typically seen in the general population in areas that are not hotspots.
ᵇ Typically seen in the general population in high-risk populations.
ᶜ Typically seen in the general population in SARS-CoV-2 exposed and nursing homes.
ᵈ The case-control design leads to a serious risk of bias.
ᵉ Unexplained inconsistency observed with considerably variable sensitivity. Sensitivity ranges: W1, 0.03–0.75; W2, 0.74–1.00.
ᶠ Considering the upper vs lower limits of the sensitivity’s CI would lead to different clinical decisions.
Antibody Performance, Weeks 1 and 2: Immunoglobulin M or Immunoglobulin G
| IgM or IgG | Week 1 | | | Week 2 | | |
|---|---|---|---|---|---|---|
| Sensitivity | .51 (95% CI: .42 to .59) | | | .81 (95% CI: .77 to .84) | | |
| Specificity | .97 (95% CI: .95 to .98) | | | | | |
| Outcome | Effect per 1000 Patients Tested | | | | | |
| | Pretest Probability of 1%ᵃ | Pretest Probability of 10%ᵇ | Pretest Probability of 40%ᶜ | Pretest Probability of 1%ᵃ | Pretest Probability of 10%ᵇ | Pretest Probability of 40%ᶜ |
| True positives (patients with COVID-19) | 5 (4 to 6) | 51 (42 to 59) | 204 (168 to 236) | 8 (8 to 8) | 81 (77 to 84) | 324 (308 to 336) |
| False negatives (patients incorrectly classified as not having COVID-19) | 5 (4 to 6) | 49 (41 to 58) | 196 (164 to 232) | 2 (2 to 2) | 19 (16 to 23) | 76 (64 to 92) |
| Quality of the evidence | 7 studies, 830 patients ⨁⨁◯◯ Lowᵈ,ᵉ | | | 7 studies, 1996 patients ⨁⨁◯◯ Lowᵈ,ᵉ | | |
| | Pretest Probability of 1%ᵃ | Pretest Probability of 10%ᵇ | Pretest Probability of 40%ᶜ | | | |
| True negatives (patients without COVID-19) | 960 (941 to 970) | 873 (855 to 882) | 582 (570 to 588) | | | |
| False positives (patients incorrectly classified as having COVID-19) | 30 (20 to 50) | 27 (18 to 45) | 18 (12 to 30) | | | |
| Quality of the evidence | 11 studies, 5660 patients ⨁⨁◯◯ Lowᵈ,ᵉ | | | | | |
Abbreviations: CI, confidence interval; COVID-19, coronavirus disease 2019; IgG, immunoglobulin G; IgM, immunoglobulin M.
ᵃ Typically seen in the general population in areas that are not hotspots.
ᵇ Typically seen in the general population in high-risk populations.
ᶜ Typically seen in the general population in SARS-CoV-2 exposed and nursing homes.
ᵈ The case-control design leads to a serious risk of bias.
ᵉ Unexplained inconsistency observed with considerably variable sensitivity. Sensitivity ranges: W1, 0.09–0.79; W2, 0.46–0.94.
Benefits and Harms
The panel placed a high value on reducing false-negative results. During the first 2 weeks following infection, the sensitivity of all serology tests, regardless of the platform and immunoglobulin detected, was inadequate to avoid a large number of false-negative results. The concern with low sensitivity is that individuals who test negative would be classified as uninfected when, in fact, they had been infected but had not yet fully developed an antibody response. Likewise, when assessing the seropositivity rate of a population, tests with poor sensitivity will underestimate the percentage of the population that is or was infected. Tests with low specificity can produce false-positive results at any time point of testing. A false-positive result can lead to the incorrect conclusion that an individual has been infected and thereby end the search for the true etiology of their symptoms. When assessing the seroprevalence of a population, false-positive results can lead to an overestimation of the percentage of the population that has been infected.
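The two opposing biases described above can be made concrete with the standard relationship between true and apparent (test-positive) prevalence. The Python sketch below is illustrative only: the function name `apparent_prevalence` is hypothetical, and using the pooled week 2 IgG estimates (sensitivity .68, specificity .99) as inputs for a population survey is an assumption made for the example.

```python
def apparent_prevalence(true_prevalence, sensitivity, specificity):
    """Expected fraction of a population that tests positive."""
    detected_infections = true_prevalence * sensitivity            # true positives
    false_alarms = (1.0 - true_prevalence) * (1.0 - specificity)   # false positives
    return detected_infections + false_alarms


# Illustrative inputs: pooled IgG week 2 estimates
for true_prev in (0.10, 0.01):
    observed = apparent_prevalence(true_prev, sensitivity=0.68, specificity=0.99)
    print(f"true {true_prev:.1%} -> apparent {observed:.2%}")
```

Under these assumptions a true prevalence of 10% appears as about 7.7% (missed infections dominate, so the survey underestimates), whereas a true prevalence of 1% appears as about 1.67% (false positives dominate, so the survey overestimates).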
Waiting beyond 2 weeks after the onset of symptoms to test for an antibody response may delay the confirmation of infection in some cases. A portion of infected individuals (23% to 63%, depending on the test) will develop antibodies within the first week after the onset of symptoms, such that early testing could be actionable if positive. However, depending on the specificity of the test, a significant number of those results could be false positives; this is particularly concerning when the pretest probability of infection is low. While this may not be as serious an issue for surveillance studies, it may be important in the limited clinical situations where serologic tests are used for the diagnosis of COVID-19 (see recommendations 6 and 7).
Other Considerations
In most microbial infections, IgM antibody responses typically occur earlier after the onset of infection than IgG antibody responses. In contrast, with SARS-CoV-2 infection, there does not appear to be a significant difference in the sensitivity of tests that detect IgM antibody compared with those that detect IgG antibodies in the first weeks following infection. At 1 week after onset of symptoms, the pooled sensitivity for IgM tests was 33% compared with 23% for IgG tests, and at 2 weeks after onset of symptoms, the pooled sensitivity of IgM tests was 73% compared with 68% for IgG tests (Tables 1 and 2). The specificity for IgM tests was slightly lower (98%) compared with IgG tests (99%), but the confidence intervals overlapped. Based on available evidence, there does not appear to be a substantially increased diagnostic accuracy when using IgM tests compared with IgG tests early in the course of illness. IgA tests, while more sensitive than IgG and IgM tests within the first 2 weeks of symptom onset, have a lower specificity (96%), and so their use would greatly increase false-positive results. In a low-prevalence population (eg, 1%), 87% of the positive IgA results would be false positives (Table 3); when the prevalence is 10% (for example, in “hot spots”), 39% of the positive results would be false positives. Of note, far fewer studies evaluated IgA tests than IgM and IgG tests.
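The share of positive results that are false positives can be checked directly from the per-1000 counts in the IgA table. The Python sketch below is a minimal illustration; the helper name `false_positive_share` is hypothetical, and the inputs are the rounded week 1 counts from that table.

```python
def false_positive_share(true_positives, false_positives):
    """Fraction of all positive results that are false positives (1 - PPV)."""
    total_positives = true_positives + false_positives
    if total_positives == 0:
        raise ValueError("no positive results to evaluate")
    return false_positives / total_positives


# IgA, week 1, pretest probability 1%: 6 true positives and 40 false positives per 1000
share_low_prevalence = false_positive_share(6, 40)
print(f"{share_low_prevalence:.0%}")
```

At a 1% pretest probability this gives 40/46, or roughly 87%. With the week 1 counts at a 10% pretest probability (63 true positives, 36 false positives), the same ratio comes out near 36%, so the 39% quoted in the text presumably rests on slightly different inputs.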
There are LF assays that detect and distinguish between IgM and IgG. These platforms require either IgM or IgG to be detected for a positive result (referred to as IgM or IgG tests). This “either/or” interpretation increases test sensitivity but also slightly reduces specificity from 98% to 97%. Overall, IgM or IgG detection does not improve the diagnostic accuracy relative to detection of either antibody class alone. In contrast, detection of total antibody within the first 2 weeks after infection does increase sensitivity while maintaining a high specificity. These differences may arise because IgM or IgG tests tend to be LF assays, which show more inconsistency than total antibody assays performed by ELISA or CIA (Supplementary Figure 3 [forest plots]). Total antibody platforms may have the highest diagnostic utility during the early time period. However, substantially fewer studies evaluated total antibody tests than IgM or IgG tests.
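Why an “either/or” rule raises sensitivity and lowers specificity can be illustrated with a minimal sketch. It assumes, purely for illustration, that the two antibody results err independently; the guideline's pooled estimates come from the included studies, not from this formula, and the function name is hypothetical.

```python
def either_or_accuracy(sens_a, spec_a, sens_b, spec_b):
    """Accuracy of an 'either test positive' rule.

    ASSUMPTION: the two tests err independently of each other.
    """
    # An infection is missed only if both tests miss it.
    combined_sensitivity = 1 - (1 - sens_a) * (1 - sens_b)
    # An uninfected person is called negative only if both tests are negative.
    combined_specificity = spec_a * spec_b
    return combined_sensitivity, combined_specificity


# Pooled week-2 sensitivity and overall specificity quoted in this section:
# IgM 73% / 98%, IgG 68% / 99%.
sens, spec = either_or_accuracy(0.73, 0.98, 0.68, 0.99)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Under the independence assumption the combined specificity comes out at about 97%. The combined sensitivity is likely optimistic, because IgM and IgG results from the same patient tend to be correlated.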
Conclusions and Research Needs for This Recommendation
There is substantial variability in performance across tests and testing platforms (ie, LF vs CIA vs ELISA). Overall, there is more inconsistency across LF platforms and with IgM compared with IgG tests. Regardless of the immunoglobulin detected or the testing platform used, anti–SARS-CoV-2 antibody tests generally lack adequate sensitivity to rule out infection during the first 2 weeks of symptoms. Additional cohort studies are needed to understand the performance of serology tests, including studies with larger numbers of well-characterized patients in whom the timing of symptom onset and the severity of illness are clearly defined. Ideally, cohort studies should assess multiple tests using the same well-characterized specimens. The antibody response in special populations such as children, immunocompromised patients, and patients with autoimmune or rheumatologic disease also needs to be studied. Last, correlation of viral RNA shedding and culture positivity (or other surrogate for infectivity) should be studied relative to immunoglobulin titers over time.
Recommendation 2: When SARS-CoV-2 infection requires laboratory confirmation for clinical or epidemiological purposes, the IDSA panel suggests testing for SARS-CoV-2 IgG or total antibody 3 to 4 weeks after symptom onset to detect evidence of past SARS-CoV-2 infection (conditional recommendation, very low certainty of evidence).
- Remark: When serology is being considered as an adjunct to NAAT for diagnosis, testing 3 to 4 weeks post–symptom onset maximizes the sensitivity and specificity to detect past infection.
- Remark: Serosurveillance studies should use assays with high specificity (ie, ≥99.5%), especially when the prevalence of SARS-CoV-2 in the community is expected to be low.
Recommendation 3: The IDSA panel makes no recommendation either for or against using IgM antibodies to detect evidence of past SARS-CoV-2 infection (conditional recommendation, very low certainty of evidence).
Recommendation 4: The IDSA panel suggests against using IgA antibodies to detect evidence of past SARS-CoV-2 infection (conditional recommendation, very low certainty of evidence).
Recommendation 5: The IDSA panel suggests against using IgM or IgG antibody combination tests to detect evidence of past SARS-CoV-2 infection (conditional recommendation, very low certainty of evidence).
- Remark: IgM or IgG combination tests are those where detecting either antibody class is used to define a positive result.
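The emphasis on very high specificity for serosurveillance can be illustrated numerically. The sketch below computes the positive predictive value at a 1% prevalence for several specificities; the sensitivity of 0.95 is an assumed illustrative input, and the function name is hypothetical.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a positive result reflects true infection."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Assumed inputs: 1% prevalence and a sensitivity of 0.95 (illustrative only).
for spec in (0.96, 0.99, 0.995):
    ppv = positive_predictive_value(0.95, spec, 0.01)
    print(f"specificity={spec:.3f} -> PPV={ppv:.2f}")
```

Under these assumptions, fewer than half of positive results are true positives at a specificity of 0.99 or below, while roughly two thirds are at a specificity of 0.995.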
Summary of the Evidence
The same studies [26–49] and package inserts [50–56] that were used to inform recommendation 1 were examined to assess test performance at weeks 3, 4, and 5 following the onset of COVID-19 signs or symptoms (Supplementary Table 2). The total number of samples included in the sensitivity analyses ranged from 163 to 2298 samples, and the number included in the specificity analyses ranged from 721 to 11 887 samples. The pooled sensitivity at week 3 after symptom onset ranged from 0.89 to 0.98, at week 4 from 0.84 to 0.95, and at week 5 from 0.78 to 0.95, while the pooled specificity ranged from 0.96 to 1.00 (Tables 6–10). The quality of evidence that informed the sensitivity analyses ranged from very low to moderate; the quality of evidence that informed specificity ranged from low to moderate. There was a serious risk of bias (case-control study design), imprecision (assuming the upper and lower limits of the confidence interval would lead to different decisions), and unexplained inconsistency.
Antibody Performance, Weeks 3 to 5: Immunoglobulin M
| IgM | Week 3 | | | Week 4 | | | Week 5 | | |
|---|---|---|---|---|---|---|---|---|---|
| Sensitivity | .89 (95% CI: .82 to .93) | | | .84 (95% CI: .67 to .93) | | | .78 (95% CI: .73 to .83) | | |
| Specificity | .98 (95% CI: .97 to .99) | | | | | | | | |
| Outcome (effect per 1000 patients tested), by pretest probability | 1% (a) | 10% (b) | 40% (c) | 1% (a) | 10% (b) | 40% (c) | 1% (a) | 10% (b) | 40% (c) |
| True positives (patients with COVID-19) | 9 (8 to 9) | 89 (82 to 93) | 356 (328 to 372) | 8 (7 to 9) | 84 (67 to 93) | 336 (268 to 372) | 8 (7 to 8) | 78 (73 to 83) | 312 (292 to 332) |
| False negatives (patients incorrectly classified as not having COVID-19) | 1 (1 to 2) | 11 (7 to 18) | 44 (28 to 72) | 2 (1 to 3) | 16 (7 to 33) | 64 (28 to 132) | 2 (2 to 3) | 22 (17 to 27) | 88 (68 to 108) |
| Quality of the evidence | 14 studies, 1730 patients; ⨁◯◯◯ Very low (d, e, f) | | | 6 studies, 619 patients; ⨁◯◯◯ Very low (d, e, f) | | | 2 studies, 260 patients; ⨁⨁◯◯ Low (d, e) | | |

| Outcome (effect per 1000 patients tested) | Pretest probability of 1% (a) | Pretest probability of 10% (b) | Pretest probability of 40% (c) |
|---|---|---|---|
| True negatives (patients without COVID-19) | 970 (960 to 980) | 882 (873 to 891) | 588 (582 to 594) |
| False positives (patients incorrectly classified as having COVID-19) | 20 (10 to 30) | 18 (9 to 27) | 12 (6 to 18) |
| Quality of the evidence | 21 studies, 7165 patients; ⨁⨁⨁◯ Moderate (d) | | |
Abbreviations: CI, confidence interval; COVID-19, coronavirus disease 2019; IgM, immunoglobulin M.
a. Typically seen in the general population in areas that are not hotspots.
b. Typically seen in the general population in high-risk populations.
c. Typically seen in the general population in SARS-CoV-2 exposed and nursing homes.
d. The case-control design leads to a serious risk of bias.
e. Unexplained inconsistency observed with considerably variable sensitivity. Sensitivity ranges: W3, 0.55–1.00; W4, 0.36–1.00; W5, 0.76–0.92.
f. Considering the upper vs lower limits of the sensitivity’s CI would lead to different clinical decisions.
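The “effect per 1000 patients tested” rows in the table above follow arithmetically from the pooled sensitivity, the pooled specificity, and the pretest probability. A minimal sketch of that calculation is given below; the function name and the choice to round each cell independently are assumptions made here, not a description of the panel's software.

```python
def effects_per_1000(sensitivity, specificity, pretest_probability, n=1000):
    """Expected test outcomes among n patients tested.

    Each count is rounded independently (an assumption of this sketch),
    so the four counts may not always sum exactly to n.
    """
    infected = n * pretest_probability
    uninfected = n - infected
    return {
        "true_positives": round(infected * sensitivity),
        "false_negatives": round(infected * (1 - sensitivity)),
        "true_negatives": round(uninfected * specificity),
        "false_positives": round(uninfected * (1 - specificity)),
    }


# IgM at week 3: pooled sensitivity .89, pooled specificity .98.
for p in (0.01, 0.10, 0.40):
    print(p, effects_per_1000(0.89, 0.98, p))
```

Passing the lower and upper confidence limits in place of the point estimates gives the ranges shown in parentheses in the table.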
Antibody Performance, Weeks 3 to 5: Immunoglobulin G
| IgG | Week 3 | | | Week 4 | | | Week 5 | | |
|---|---|---|---|---|---|---|---|---|---|
| Sensitivity | .95 (95% CI: .92 to .96) | | | .88 (95% CI: .83 to .92) | | | .94 (95% CI: .88 to .97) | | |
| Specificity | .99 (95% CI: .99 to .99) | | | | | | | | |
| Outcome (effect per 1000 patients tested), by pretest probability | 1% (a) | 10% (b) | 40% (c) | 1% (a) | 10% (b) | 40% (c) | 1% (a) | 10% (b) | 40% (c) |
| True positives (patients with COVID-19) | 10 (9 to 10) | 95 (92 to 96) | 380 (368 to 384) | 9 (8 to 9) | 88 (83 to 92) | 352 (332 to 368) | 9 (9 to 10) | 94 (88 to 97) | 376 (352 to 388) |
| False negatives (patients incorrectly classified as not having COVID-19) | 0 (0 to 1) | 5 (4 to 8) | 20 (16 to 32) | 1 (1 to 2) | 12 (8 to 17) | 48 (32 to 68) | 1 (0 to 1) | 6 (3 to 12) | 24 (12 to 48) |
| Quality of the evidence | 16 studies, 2298 patients; ⨁⨁⨁◯ Moderate (d) | | | 8 studies, 840 patients; ⨁⨁◯◯ Low (d, e) | | | 1 study, 139 patients; ⨁◯◯◯ Very low (d, e, f) | | |

| Outcome (effect per 1000 patients tested) | Pretest probability of 1% (a) | Pretest probability of 10% (b) | Pretest probability of 40% (c) |
|---|---|---|---|
| True negatives (patients without COVID-19) | 980 (980 to 980) | 891 (891 to 891) | 594 (594 to 594) |
| False positives (patients incorrectly classified as having COVID-19) | 10 (10 to 10) | 9 (9 to 9) | 6 (6 to 6) |
| Quality of the evidence | 25 studies, 11 887 patients; ⨁⨁⨁◯ Moderate (d) | | |
Abbreviations: CI, confidence interval; COVID-19, coronavirus disease 2019; IgG, immunoglobulin G.
a. Typically seen in the general population in areas that are not hotspots.
b. Typically seen in the general population in high-risk populations.
c. Typically seen in the general population in SARS-CoV-2 exposed and nursing homes.
d. The case-control design leads to a serious risk of bias.
e. Unexplained inconsistency observed with considerably variable sensitivity. Sensitivity ranges: W3, 0.81–1.0; W4, 0.72–1.0; W5, 0.94.
f. Considering the upper vs lower limits of the sensitivity’s CI would lead to different clinical decisions.
Antibody Performance, Weeks 3 to 5: Immunoglobulin A
| IgA | Week 3 | | | Week 4 | | | Week 5 | | |
|---|---|---|---|---|---|---|---|---|---|
| Sensitivity | .94 (95% CI: .87 to .98) | | | .95 (95% CI: .83 to .99) | | | NR | | |
| Specificity | .96 (95% CI: .91 to .99) | | | | | | | | |
| Outcome (effect per 1000 patients tested), by pretest probability | 1% (a) | 10% (b) | 40% (c) | 1% (a) | 10% (b) | 40% (c) | 1% (a) | 10% (b) | 40% (c) |
| True positives (patients with COVID-19) | 9 (9 to 10) | 94 (87 to 98) | 376 (348 to 392) | 10 (8 to 10) | 95 (83 to 99) | 380 (332 to 396) | NR | NR | NR |
| False negatives (patients incorrectly classified as not having COVID-19) | 1 (0 to 1) | 6 (2 to 13) | 24 (8 to 52) | 0 (0 to 2) | 5 (1 to 17) | 20 (4 to 68) | NR | NR | NR |
| Quality of the evidence | 4 studies, 163 patients; ⨁⨁◯◯ Low (d, e) | | | 3 studies, 191 patients; ⨁⨁◯◯ Low (d, e) | | | NR | | |

| Outcome (effect per 1000 patients tested) | Pretest probability of 1% (a) | Pretest probability of 10% (b) | Pretest probability of 40% (c) |
|---|---|---|---|
| True negatives (patients without COVID-19) | 950 (901 to 980) | 864 (819 to 891) | 576 (546 to 594) |
| False positives (patients incorrectly classified as having COVID-19) | 40 (10 to 89) | 36 (9 to 81) | 24 (6 to 54) |
| Quality of the evidence | 4 studies, 760 patients; ⨁⨁◯◯ Low (d, e) | | |
Abbreviations: CI, confidence interval; COVID-19, coronavirus disease 2019; IgA, immunoglobulin A; NR, not reported.
a. Typically seen in the general population in areas that are not hotspots.
b. Typically seen in the general population in high-risk populations.
c. Typically seen in the general population in SARS-CoV-2 exposed and nursing homes.
d. The case-control design leads to a serious risk of bias.
e. Considering the upper vs lower limits of the sensitivity’s CI would lead to different clinical decisions.
Antibody Performance, Weeks 3 to 5: Total Antibodies
| Total Antibodies | Week 3 | | | Week 4 | | | Week 5 | | |
|---|---|---|---|---|---|---|---|---|---|
| Sensitivity | .98 (95% CI: .89 to 1.00) | | | .95 (95% CI: .84 to .99) | | | .95 (95% CI: .90 to .98) | | |
| Specificity | 1.00 (95% CI: .99 to 1.00) | | | | | | | | |
| Outcome (effect per 1000 patients tested), by pretest probability | 1% (a) | 10% (b) | 40% (c) | 1% (a) | 10% (b) | 40% (c) | 1% (a) | 10% (b) | 40% (c) |
| True positives (patients with COVID-19) | 10 (9 to 10) | 98 (89 to 100) | 392 (356 to 400) | 5 (4 to 5) | 95 (84 to 99) | 380 (336 to 396) | 5 (5 to 5) | 95 (90 to 98) | 380 (360 to 392) |
| False negatives (patients incorrectly classified as not having COVID-19) | 0 (0 to 1) | 2 (0 to 11) | 8 (0 to 44) | 0 (0 to 1) | 5 (1 to 16) | 20 (4 to 64) | 0 (0 to 0) | 5 (2 to 10) | 20 (8 to 40) |
| Quality of the evidence | 6 studies, 472 patients; ⨁⨁⨁◯ Moderate (d) | | | 2 studies, 289 patients; ⨁⨁◯◯ Low (d, e) | | | 1 study, 121 patients; ⨁⨁◯◯ Low (d, e) | | |

| Outcome (effect per 1000 patients tested) | Pretest probability of 1% (a) | Pretest probability of 10% (b) | Pretest probability of 40% (c) |
|---|---|---|---|
| True negatives (patients without COVID-19) | 990 (980 to 990) | 900 (891 to 900) | 600 (594 to 600) |
| False positives (patients incorrectly classified as having COVID-19) | 0 (0 to 10) | 0 (0 to 9) | 0 (0 to 6) |
| Quality of the evidence | 8 studies, 4521 patients; ⨁⨁⨁◯ Moderate (d) | | |
Abbreviations: CI, confidence interval; COVID-19, coronavirus disease 2019.
a. Typically seen in the general population in areas that are not hotspots.
b. Typically seen in the general population in high-risk populations.
c. Typically seen in the general population in SARS-CoV-2 exposed and nursing homes.
d. The case-control design leads to a serious risk of bias.
e. Unexplained inconsistency observed with considerably variable sensitivity. Sensitivity ranges: W3, 0.78–1.00; W4, 0.88–0.94; W5, 0.95.
Antibody Performance, Weeks 3 to 5: Immunoglobulin M or Immunoglobulin G
| IgM or IgG | Week 3 | | | Week 4 | | | Week 5 | | |
|---|---|---|---|---|---|---|---|---|---|
| Sensitivity | .94 (95% CI: .92 to .96) | | | .88 (95% CI: .78 to .94) | | | NR | | |
| Specificity | .97 (95% CI: .95 to .98) | | | | | | | | |
| Outcome (effect per 1000 patients tested), by pretest probability | 1% (a) | 10% (b) | 40% (c) | 1% (a) | 10% (b) | 40% (c) | 1% (a) | 10% (b) | 40% (c) |
| True positives (patients with COVID-19) | 9 (9 to 10) | 94 (92 to 96) | 376 (368 to 384) | 9 (8 to 9) | 88 (78 to 94) | 352 (312 to 376) | NR | NR | NR |
| False negatives (patients incorrectly classified as not having COVID-19) | 1 (0 to 1) | 6 (4 to 8) | 24 (16 to 32) | 1 (1 to 2) | 12 (6 to 22) | 48 (24 to 88) | NR | NR | NR |
| Quality of the evidence | 6 studies, 1038 patients; ⨁⨁◯◯ Low (d, e) | | | 2 studies, 216 patients; ⨁◯◯◯ Very low (d, e, f) | | | NR | | |

| Outcome (effect per 1000 patients tested) | Pretest probability of 1% (a) | Pretest probability of 10% (b) | Pretest probability of 40% (c) |
|---|---|---|---|
| True negatives (patients without COVID-19) | 960 (941 to 970) | 873 (855 to 882) | 582 (570 to 588) |
| False positives (patients incorrectly classified as having COVID-19) | 30 (20 to 50) | 27 (18 to 45) | 18 (12 to 30) |
| Quality of the evidence | 11 studies, 5660 patients; ⨁⨁◯◯ Low (d, e) | | |
Abbreviations: CI, confidence interval; COVID-19, coronavirus disease 2019; IgG, immunoglobulin G; IgM, immunoglobulin M; NR, not reported.
aTypically seen in the general population in areas that are not hotspots.
bTypically seen in the general population in high-risk populations.
cTypically seen in the general population in SARS-CoV-2 exposed and nursing homes.
dThe case-control design leads to a serious risk of bias.
eUnexplained inconsistency observed with considerably variable sensitivity. Sensitivity ranges: W3, 0.80–1.00; W4, 0.75–0.94.
fConsidering the upper vs lower limits of the sensitivity’s CI would lead to different clinical decisions.
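The "effect per 1000 patients tested" rows in the table above follow directly from the pooled sensitivity, the pooled specificity, and the assumed pretest probability. The following is a minimal sketch of that arithmetic, not the panel's own analysis code; the function name `outcomes_per_1000` and the rounding to whole patients are assumptions made for illustration, and only the point estimates (not the confidence limits) are reproduced.

```python
def outcomes_per_1000(sensitivity, specificity, pretest_probability, n=1000):
    """Expected test outcomes among n patients, rounded to whole patients.

    Hypothetical helper for illustration; the inputs are the pooled point
    estimates reported in the table.
    """
    infected = n * pretest_probability
    uninfected = n - infected
    return {
        "true_positives": round(infected * sensitivity),
        "false_negatives": round(infected * (1 - sensitivity)),
        "true_negatives": round(uninfected * specificity),
        "false_positives": round(uninfected * (1 - specificity)),
    }


# Week 3 pooled estimates for IgM or IgG: sensitivity .94, specificity .97.
for pretest_probability in (0.01, 0.10, 0.40):
    print(pretest_probability, outcomes_per_1000(0.94, 0.97, pretest_probability))
```

At a pretest probability of 10%, this gives 94 true positives, 6 false negatives, 873 true negatives, and 27 false positives per 1000 patients tested.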
Benefits and Harms
The panel placed a high value on reducing false-positive test results and determining the optimal timing of testing to confidently assess previous infection. Detection of IgG or total antibody at 3 to 4 weeks after the onset of symptoms provides the highest sensitivity, and thus the lowest rate of false-negative results, compared with other immunoglobulin classes or earlier time points. IgG or total antibody tests also provide high specificity, and thus, reduce the rate of false-positive results relative to other antibody types.
The specificity of IgM tests in the 3 to 4 weeks post–symptom onset time frame was equivalent to IgG tests. However, in contrast to other viral infections where IgM tests show high sensitivity shortly after onset of symptoms compared with IgG, IgM sensitivity against SARS-CoV-2 is relatively low early on and there is no significant increase over time as is seen with IgG or total antibody (Tables 6, 7, and 9). The use of the IgM tests alone would result in increased false-negative rates compared with IgG or total antibody tests. Given these tradeoffs, the panel suggested neither for nor against the use of IgM tests to assess previous infection.
IgA tests or LF devices with both IgM and IgG targets, where detection of either antibody is considered a positive result, all suffer from lower specificity compared with tests involving IgG or total antibody targets, leading to increased false-positive rates. These tests would falsely increase seroprevalence and potentially mislead public health officials, policymakers, and the general public. In addition, false-positive results might detract from pursuing alternative diagnoses in symptomatic patients. Therefore, the panel suggested against the use of any of these tests at this time.
Other Considerations
The predictive value of diagnostic test results depends on the performance characteristics of the test (ie, sensitivity and specificity) and on the prevalence of the disease in the population tested. In general, the higher the prevalence, the larger the proportion of negative results that are false negatives, and the lower the prevalence, the larger the proportion of positive results that are false positives. To illustrate the importance of this concept in interpreting antibody test results, the panel calculated the number of false-positive and false-negative results in populations with 1% and 10% SARS-CoV-2 seroprevalence, using pooled sensitivity and specificity from the data review. The 1% prevalence was chosen to represent a population with few prior cases of COVID-19, and the 10% prevalence was chosen to represent a COVID-19 “hot spot” population. As can be seen in Figure 3, even tests with seemingly high sensitivity and specificity can yield a large proportion of erroneous results at the extremes of prevalence. In this example, in a population with a 1% prevalence of SARS-CoV-2, a test with 96% sensitivity and 99% specificity would generate a nearly equal number of true-positive and false-positive results.
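The dependence of predictive values on prevalence can be checked with Bayes' rule. The sketch below is illustrative only: the function name `predictive_values` is an assumption, and the inputs are the theoretical 96% sensitivity and 99% specificity used in the example above.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# The two prevalence scenarios considered by the panel.
for prevalence in (0.01, 0.10):
    ppv, npv = predictive_values(0.96, 0.99, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.2%}")
```

At 1% prevalence the PPV is about 49% while the NPV stays above 99.9%; at 10% prevalence the PPV rises to about 91%.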
Figure 3. Impact of prevalence on test predictive values. This graph models the predictive value of a theoretical serologic test with 96% sensitivity and 99% specificity across a spectrum of prevalence. When the true prevalence of SARS-CoV-2 infection in a population is 1%, the PPV of the test is only 49%. In other words, there is a 49% chance (close to a flip of the coin) that individuals with a positive screening test truly have the disease. As the prevalence increases, so does the PPV. In contrast, when the prevalence of infection is 1%, the NPV is high (ie, there is close to 100% probability that the disease is truly absent when the result of the test is negative). Abbreviations: NPV, negative-predictive value; PPV, positive-predictive value; SARS-CoV-2, severe acute respiratory syndrome coronavirus 2.
Limited data are available on the duration of the IgG response, but in the studies we reviewed it appeared to be sustained for at least 3 to 5 weeks. The panel recommends testing for IgG or total antibody at 3 to 4 weeks because there are limited data on IgG or total antibody responses 5 weeks or longer after onset of symptoms. The IgM response begins to decrease by week 4 after symptom onset, such that IgM appears to be less sensitive over time than IgG for detection of recent past infection with SARS-CoV-2.
Overall, the panel identified substantial variability in sensitivity and specificity among tests (Supplementary Figure 3 [forest plots]). While inconsistent performance characteristics were observed across the 3 test methodologies included in the data summary, more inconsistency was seen for LF methods than for CIAs or ELISAs. Similar to the acute time frame, IgM assays showed more variable performance than did IgG assays at later time points after infection. Interestingly, the specific antigen targeted in the antibody test (eg, S protein or N protein) did not appear to impact test performance, although some studies did not report the antigen that was targeted.
Conclusions and Research Needs for This Recommendation
Sensitivity and specificity differences between available assays require that clinicians understand the performance characteristics of the particular test that is used. A general sense of prevalence (or pretest probability) in the population being tested is also helpful to ensure accurate interpretation of test results. While NAAT remains the recommended approach for diagnosis of COVID-19, detection of IgG or total antibody directed against SARS-CoV-2 at 3 to 4 weeks after symptom onset may be useful for determining past infection in selected clinical situations (see recommendations 6 and 7). IgG alone is typically used for determining seroprevalence, but information on the duration of a detectable antibody response beyond 4 weeks is limited.
It is important to realize that the presence of SARS-CoV-2 IgG or total antibody cannot be inferred to represent protective immunity against SARS-CoV-2 reinfection. Future research should include investigation as to whether SARS-CoV-2 reinfection occurs, and whether the presence of antibodies—including nAbs—or other measurable immunological responses (eg, T-cell responses) are markers of protection. Additional knowledge gaps include the duration of detectable IgG and total antibody, relationship of seropositivity to shedding of infectious virus in the convalescent phase, effect of seropositivity on COVID-19 vaccine response, and factors that affect antibody responses (eg, age, comorbid medical conditions, immunocompromised state). Larger studies of well-characterized and varied populations will be needed to begin to answer these questions. The availability of a reference standard for seropositivity (eg, presence of nAb) would also be valuable.
Recommendation 6: The IDSA panel suggests using IgG antibody to provide evidence of COVID-19 infection in symptomatic patients with a high clinical suspicion and repeatedly negative NAAT testing (weak recommendation, very low certainty of evidence).
- Remark: When serology is being considered as an adjunct to NAAT for diagnosis, testing 3 to 4 weeks post–symptom onset maximizes the sensitivity and specificity to detect past infection.
Summary of the Evidence
In addition to the evidence that informed recommendations 1 through 5, we identified 7 studies that compared the diagnostic test accuracy of serologic tests among patients with high clinical suspicion for COVID-19 and either negative NAAT throughout the course of their illness [57–61] or negative NAAT on repeat testing after initial positive tests [62, 63]. The 7 case studies showed an increase in diagnostic yield when antibody testing was added to NAAT more than 14 days after symptom onset. In addition, IgG identified a higher number of positive cases compared with IgM (Supplementary Table 3). It is again important to note that individual test sensitivity varied across studies. Variable test performance may have been due to suboptimal timing of serology, poorly performing assays, or misclassification of the reference standard (ie, labeling noninfected patients as true positive cases based on serology alone). The overall quality of evidence was low due to risk of bias (case-control study design) and imprecision as a result of small sample size.
Benefits and Harms
The potential benefit of true-positive serologic results in patients with clinical signs and symptoms of COVID-19, but repeatedly negative NAAT, is confirmation of diagnosis for epidemiological and prognostic purposes. Additionally, positive results may help reduce unnecessary ancillary testing or empiric therapies directed towards alternative diagnoses. False-negative serologic results, especially those obtained early in the course of symptomatic illness, could lead to inadequate isolation or quarantine as well as an underestimation of community prevalence. For example, assuming a prevalence of 40% in 1000 symptomatic patients with negative NAAT, based on the studies and package inserts that informed the diagnostic accuracy of IgG for recommendations 1 and 2 (above) [26–56], there will be 128 to 308 false negatives within the first 2 weeks of symptom onset compared with 20 to 48 false negatives during weeks 3 to 5 after symptom onset.
Other Considerations
Studies in NAAT-negative symptomatic patients mainly reported on the use of IgM and IgG antibodies. Given the overall evidence supporting the use of total antibody in the later time periods after symptom onset, it is likely total antibody could also be useful for the evaluation of symptomatic patients presenting late (ie, 3 to 4 weeks after symptom onset) and with repeatedly negative NAAT. Although the timing of testing relative to symptom onset was reported in these case series, exact timing may be difficult to ascertain in routine clinical practice. In addition, most of the studies included in our review involved hospitalized patients with radiographic abnormalities suggestive of viral pneumonia. Severity of illness may correlate with the likelihood of seropositivity [64]. In addition, it is unclear whether RNA detection would have occurred in patients with pneumonia if repeat NAAT testing had been performed using a lower respiratory tract specimen (ie, sputum or bronchoalveolar lavage fluid) as follow-up of an initial negative upper respiratory swab.
Conclusions and Research Needs for This Recommendation
In conclusion, assessing anti–SARS-CoV-2 IgG antibodies provides evidence of COVID-19 in patients with high clinical suspicion and negative NAAT results. The timing of antibody testing is important, with optimal sensitivity observed at least 2 weeks from the time of symptom onset. Ideally, serologic studies of NAAT-negative patients should evaluate quantitative antibody responses at defined time points, such that seroconversion and/or nAbs can be measured as evidence of true infection. Furthermore, outpatients with mild illness, children, and immunocompromised patients should also be included in future studies.
Recommendation 7: In pediatric patients with multisystem inflammatory syndrome, the IDSA panel suggests using both IgG antibody and NAAT to provide evidence of current or past COVID-19 infection (strong recommendation, very low certainty of evidence).
Summary of the Evidence
We identified 9 case series [65–73] that evaluated the use of SARS-CoV-2 serologic tests in patients presenting with signs of multisystem inflammatory syndrome in children (MIS-C) (Supplementary Table 4). The case definitions of this inflammatory syndrome, typically manifesting with fever and shock similar to a Kawasaki-like syndrome, varied across studies and most did not require laboratory evidence of SARS-CoV-2 infection to define the illness. Instead, included studies showed a higher detection rate of MIS-C when IgG antibodies were added to NAAT compared with NAAT alone for the diagnosis. NAAT was positive in approximately 30–50% of cases in these studies, while IgG antibodies were positive in a majority (>80%) of cases. The major limitations of the evidence include the fact that not all case series specified the timing of testing (NAAT or serology) with respect to symptom onset. In addition, the type of serological test utilized and epidemiological links to COVID-19 cases were also not uniformly specified. The overall quality of evidence was very low due to the limitations of the case-series study design and serious risk of bias.
Benefits and Harms
Children infected with SARS-CoV-2 may have no history of symptomatic disease or known exposure to COVID-19. Therefore, viral and serological testing may be especially important in this population. In fact, positive anti–SARS-CoV-2 antibody testing is now part of the MIS-C case definition. However, false-positive results may detract from identifying the true cause of illness. Given the overlap in clinical features and difference in treatment of multisystem inflammatory syndrome diseases, it is important to differentiate MIS-C from Kawasaki disease and other inflammatory processes such as bacterial infections and rheumatic fever.
Other Considerations
In clinical practice, young patients with presumed MIS-C are often critically ill and therefore NAAT and IgG serology should be obtained simultaneously. Although we found no studies reporting on the use of total antibody in the setting of MIS-C, it seems likely that these assays could also be useful for defining the syndrome. Evaluation of serology prior to the administration of intravenous immunoglobulin or blood products is also important because these therapeutic modalities have the potential to alter the serologic response.
Conclusions and Research Needs for This Recommendation
Multisystem inflammatory syndrome in children is an emerging syndrome. Although causality due to COVID-19 is strongly suspected, it is not proven. Serologic testing should be performed to help establish a diagnosis when patients have signs or symptoms consistent with the late complications of COVID-19. However, relatively little is known about the performance of SARS-CoV-2 antibody testing in children. Therefore, prospective pediatric studies are needed to define diagnostic sensitivity and specificity relative to a reference standard as well as measure the timing and durability of antibody responses in children. Studies to better define the pathophysiology and risks for developing MIS-C following SARS-CoV-2 infection are also needed.
Recommendation 8: The IDSA panel makes no recommendation for or against using capillary versus venous blood for serologic testing to detect SARS-CoV-2 antibodies (knowledge gap).
Summary of Evidence
We identified 3 studies [48, 74, 75] and 1 package insert [76] that compared the diagnostic test accuracy of serologic tests using fingerstick (capillary) blood with venous blood samples obtained via venipuncture (Supplementary Table 5). Identified publications included both a cohort study [48] and case-control studies [74–76]. Only 1 study [76] compared fingerstick with venous blood directly, showing 100% positive and negative percent agreement between the 2 sampling methods. However, this study was excluded from the analyses because it did not provide diagnostic test accuracy results by week after symptom onset. The remaining studies [48, 74, 75] reported diagnostic test accuracy using fingerstick sampling relative to SARS-CoV-2 NAAT.
Due to the lack of evidence comparing the 2 sampling methods directly, we relied on indirect evidence by using conventional subgroup analysis methods to compare the pooled sensitivities and specificities from LF studies that used fingerstick sampling with the pooled sensitivities and specificities from LF studies that used venous blood [28, 29, 36, 38, 43, 44, 46, 51, 59, 60, 77–84]. As the timing of sampling seems crucial to evaluate the antibody responses, comparisons for this recommendation were made using data obtained at least 2 to 3 weeks after symptom initiation.
The included studies provided diagnostic test accuracy results for LF assays using IgM and/or IgG on fingerstick blood collected during weeks 2 and 3 post–symptom onset. The pooled sensitivities of IgG collected during week 3 and the presence of either IgM or IgG collected during weeks 2 and 3 using fingerstick blood were comparable to those with venous blood, except for wider confidence intervals with fingerstick blood. IgM collected during week 3 showed a lower pooled sensitivity when collected using fingerstick compared with venous blood. Of note, the main test modality available for capillary blood testing was LF assays, which may have variations in performance by manufacturer (Supplementary Figure 3 [forest plots]). In sum, there was very low certainty of evidence overall due to a serious risk of bias (ie, evidence from case-control studies), indirectness (ie, indirect comparisons as outlined above), and imprecision (due to the small number of studies and/or samples).
Benefits and Harms
Capillary blood is usually obtained from a dermal puncture of the fingertip capillary beds and represents a mixture of venous and arterial blood. In general, capillary blood is an ideal specimen for “point-of-care testing” and is often the specimen of choice for those with difficult venous access, such as very young infants, elderly patients, or severely burned patients. In addition, capillary blood can be used for dried blood spot testing, which can be mailed to the laboratory or preserved for future testing. Since this approach avoids large needles, it may be preferred by individuals who fear needles. Fingerstick collection also facilitates self-collection, which can save on personal protective equipment (PPE) and healthcare worker–facing time. In contrast, a potential harm of fingerstick blood is that the test accuracy (sensitivity and specificity) may be lower compared with venous sampling. The harms of venous blood draw are that it is considered by some to be an unpleasant experience, can cause local side effects such as bleeding or bruising, and requires specialized personnel, PPE, and healthcare worker–facing time.
Other Considerations
There is precedent for using capillary blood sampling for the detection of antibodies against other respiratory viruses, such as influenza [85]. Accurate fingerstick testing for SARS-CoV-2 antibodies would be ideal when evaluating postinflammatory complications of COVID-19, such as MIS-C. In addition, point-of-care testing would be useful for deployment in large surveillance projects and contact-tracing efforts. Capillary blood testing could also be utilized in clinical trials for monitoring serological response to anti–SARS-CoV-2 vaccines.
Conclusions and Research Needs for This Recommendation
In summary, there was inadequate information to compare the performance of capillary versus venous blood for the detection of antibodies against SARS-CoV-2. Due to the lack of high-quality evidence, the panel reached the conclusion not to recommend for or against the use of capillary blood to measure SARS-CoV-2 antibodies. If equivalent performance of capillary blood and venous blood is ultimately confirmed, then there is potential to deploy a simple point-of-care strategy for large-scale assessments of previous SARS-CoV-2 infection. This may also be the serologic test method of choice when larger volumes of blood cannot be obtained (eg, very young infants).
Further studies are needed to demonstrate equivalence of SARS-CoV-2 serological testing using capillary versus venous blood as well as to compare direct sample analysis with dried blood spots. It is likely that the potential impact of sample hemolysis, tissue clotting factors, and specimen volume will be extremely method dependent. Therefore, understanding each individual assay’s test performance will be essential. Accuracy studies should be designed such that venous and capillary blood samples are taken from the same patient, at the same time, with a robust sample size, and excellent clinical characterization of study subjects. Since diagnostic test accuracy may also be affected by clinical parameters (ie, presence or absence of symptoms, signs and/or radiologic findings, duration of symptoms, and disease severity), monitoring over time will be needed to assess waning antibody responses. Understanding the value of capillary blood testing in the context of assessing immunity after vaccination will also be warranted. Finally, beyond fingerstick blood collection, other non–venipuncture-based blood collection devices (especially self-collection) are being developed. As these become available, validation of individual serologic tests with blood collected from individual collection devices will need to be performed.
Narrative Summaries of Serodiagnostics Undergoing Evaluation
In addition to the clinical questions addressed above, the panel identified several diagnostic approaches currently undergoing evaluation for which additional data are needed to formulate recommendations. Narrative summaries for these approaches are provided below.
Neutralizing Antibody and Cellular Immune Responses
Neutralizing antibody titers vary among recovered patients [86], indicating that other parts of the immune system such as T cells and cytokines are likely to contribute to viral clearance. However, in some individuals who have recovered from COVID-19, detectable antibodies do correlate with the number of virus-specific T cells [87]. Standardized assays designed to accurately quantitate nAbs or measure T-cell–mediated immunity directed against SARS-CoV-2 do not currently exist but will be needed for comprehensive studies of immunity going forward. Whether nAb titers correlate with protection and/or infectiousness still needs to be determined. Assessing protection also has important implications for vaccine trials, selecting convalescent plasma donors, as well as determining which components of the plasma confer antiviral activity.
Classical nAb assays rely on culturing live virus. For SARS-CoV-2, this procedure requires appropriate biocontainment facilities certified for work with BSL-3 pathogens. Pseudovirus assays have been developed to circumvent the safety concerns associated with SARS-CoV-2 culture and, previously, other highly pathogenic viruses [88]. These methods typically use retroviruses capable of integrating the envelope glycoproteins of another virus to form a “pseudovirus.” Compared with natural viruses, pseudoviruses can only infect cells in a single round, produce high-titer infections, and are not easily inactivated by serum complement. Whether nAb titers determined by different pseudovirus assays produce similar results is not yet known and the FDA has yet to authorize use of neutralization tests for SARS-CoV-2. Of note, 1 small COVID-19 study observed excellent correlation between nAb titers measured by a pseudovirus compared with a live virus assay [86].
Detecting Antibodies Directed Against Different SARS-CoV-2 Antigens and Multiple-test Algorithms
In our systematic review, there was no difference in the performance of assays designed to detect the N protein or various portions of the S protein. However, patient sera that react with N protein–based tests may be different than the sera that react with S protein–based tests [89]. Combining the 2 antigens in a single test, or sequential testing with 2 tests targeting different proteins when the first assay is negative, may increase the number of true-positive results. Similarly, the CDC has recommended using an orthogonal testing algorithm (ie, employing 2 different tests when the first yields a positive result) to increase the positive-predictive value of testing when the likelihood of previous exposure to SARS-CoV-2 in the community is low [90]. The logistics, cost, and impact of reflexive testing strategies need to be determined prospectively.
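The effect of an orthogonal algorithm on positive-predictive value can be sketched by treating the post-test probability from the first assay as the pretest probability for the second. This is a simplified illustration under stated assumptions: the two assays' errors are taken to be conditionally independent given infection status, both assays are given the same hypothetical 96% sensitivity and 99% specificity, and the function names `ppv` and `sequential_ppv` are invented here rather than drawn from the CDC guidance.

```python
def ppv(sensitivity, specificity, pretest_probability):
    """Probability of true infection given a positive result."""
    true_pos = sensitivity * pretest_probability
    false_pos = (1 - specificity) * (1 - pretest_probability)
    return true_pos / (true_pos + false_pos)


def sequential_ppv(tests, prevalence):
    """PPV when every test in `tests` must be positive.

    Assumes test errors are conditionally independent, so each positive
    result updates the probability left by the previous one.
    """
    probability = prevalence
    for sensitivity, specificity in tests:
        probability = ppv(sensitivity, specificity, probability)
    return probability


# Hypothetical assays, 1% community prevalence.
one_test = sequential_ppv([(0.96, 0.99)], 0.01)
two_tests = sequential_ppv([(0.96, 0.99), (0.96, 0.99)], 0.01)
print(f"one test: {one_test:.1%}, two tests: {two_tests:.1%}")
```

Under these assumptions the PPV rises from about 49% with one assay to about 99% when a second assay confirms the result. The trade-off is that requiring two positive results lowers the sensitivity of the algorithm as a whole, and correlated errors between assays would reduce the gain.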
DISCUSSION
The immune response to SARS-CoV-2 infection has not yet been fully elucidated. Early evidence suggests that, unlike in other infectious diseases, anti–SARS-CoV-2 IgM antibodies become detectable later after symptom onset and increase nearly simultaneously with IgG after 2 weeks of infection (Figure 1). Therefore, detection of IgM without IgG is uncommon. In line with other systematic reviews, our antibody comparisons stratified by time post–symptom onset indicate that serologic testing has limited utility for ruling out SARS-CoV-2 infection in the acute phases of illness [91, 92]. The pooled sensitivity estimate for IgM in week 2 post–symptom onset was 73% (Table 1). Thus, relying on IgM for the diagnosis of COVID-19 at 2 weeks would miss 108 true cases out of 1000 patients when the clinical suspicion for infection is high. The sensitivity of other immunoglobulin classes is equally poor early after infection.
Although these tests have limited value for diagnosing acute infection, serology complements molecular testing for individuals presenting later in the course of illness. The optimal performance of serologic testing occurs approximately 3 to 4 weeks post–symptom onset and is achieved using IgG or total antibody assays. Overall, the sensitivity of IgG or total antibody at 4 weeks was 88% and 95%, respectively. Using IgG or total antibody reduces, but does not eliminate, false-negative results at 4 weeks (20 to 48 false negatives out of 1000 individuals with high clinical suspicion for COVID-19) (Tables 6 and 9). Data on time points beyond 4 weeks are limited.
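The false-negative counts quoted in this discussion can be reproduced from the reported sensitivities. The sketch below assumes the 40% pretest probability that the guideline's tables use for high-suspicion settings; applying that value to the 108 missed cases for IgM at week 2 is an inference from the arithmetic rather than something stated alongside that figure. The function name `missed_cases` is hypothetical.

```python
def missed_cases(sensitivity, pretest_probability=0.40, n=1000):
    """Expected false negatives among n patients tested."""
    return round(n * pretest_probability * (1 - sensitivity))


# Sensitivities as reported in the text.
reported = [
    ("IgM, week 2", 0.73),
    ("IgG, week 4", 0.88),
    ("total antibody, week 4", 0.95),
]
for label, sensitivity in reported:
    print(f"{label}: {missed_cases(sensitivity)} missed per 1000 tested")
```

With these inputs the counts are 108 for IgM at week 2, and 48 and 20 for IgG and total antibody at week 4, matching the range of 20 to 48 given above.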
Interpretation of test specificity also requires caution because little information is available from individuals who were tested due to clinical suspicion of SARS-CoV-2. Moreover, using NAAT as the reference standard may spuriously increase “false positive” serology results, when it is the molecular diagnostic test that is falsely negative. Identifying a better reference standard for future serology studies will help to define the performance of antibody tests. Seroconversion with a documented 4-fold increase in IgG titers is suggestive of recent infection. However, most commercial antibody assays do not provide quantitative results and false-positive IgG detection is still possible. Correlation of the circulating antibodies detectable with available tests with nAb titers can also provide confidence in positive immunoglobulin results. Whether all individuals, including special populations, such as neonates, the elderly, or immunocompromised, generate a detectable nAb response to SARS-CoV-2 also needs to be established.
It is important to emphasize that serologic test performance is highly variable and sometimes the same manufacturer’s assay performed quite differently across studies. In subgroup analysis, we found no substantive differences in test characteristics (ie, sensitivity and specificity) based on viral antigen or whether the test has an EUA from the FDA (data not shown). In contrast, the clinical performance of LF assays was more variable than that of ELISA or CIA tests. In addition to poorly performing LF tests, factors that may have contributed to heterogeneity across studies include differences in the patient population tested, the exact timing of testing relative to symptom onset, and the use of different NAAT assays as the reference standard for comparison. Patients with mild symptoms, or those who are asymptomatic [64], may display weaker immune responses to SARS-CoV-2 compared with those with severe illness. Standardized reporting of disease severity was not included in most studies. Going forward, serologic assay studies should include a description of the severity of illness using well-defined grading criteria. The exact timing of testing was also not specified in all studies. For example, patients with symptoms for “less than 21 days” likely bridged various time points but were grouped in this guideline in the 3-week time period. Uncertainty around the timing of testing could have affected the time-stratified analyses.
Perhaps the greatest interest in serologic testing has been for tracking SARS-CoV-2 exposure in the community. Accurate estimates of seroprevalence depend on the true prevalence of past infection in a given population along with the sensitivity and specificity of the test used to detect antibodies. When the true prevalence of infection is low, small decreases from a test sensitivity of 100% have a minimal impact on the negative-predictive value of a test, while reductions in specificity artificially inflate measures of seroprevalence. As the true prevalence increases, similar deviations in sensitivity are more impactful, while reductions in specificity are less noticeable. Across the United States, including regions significantly impacted by COVID-19, the prevalence of past infection is still expected to be relatively low (ie, <7%) [93, 94]. At the current time, choosing a test with high specificity is especially important to reduce false-positive results. Long-term-care facilities, congregate settings, or factories that have experienced an outbreak are likely to have much higher seroprevalence, but these are situations that do not mirror the broader pandemic in most communities.
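The interplay of prevalence, sensitivity, and specificity described above can be made concrete with a short calculation. This is a minimal sketch under stated assumptions: the 5% true prevalence and the 95% sensitivity are hypothetical inputs chosen for illustration (the prevalence is consistent with the <7% range mentioned above), and the function name is not taken from the guideline.

```python
def predictive_values(prevalence, sensitivity, specificity):
    """Apparent seroprevalence, PPV, and NPV from standard 2x2 proportions."""
    tp = prevalence * sensitivity              # true positives
    fn = prevalence * (1 - sensitivity)        # false negatives
    fp = (1 - prevalence) * (1 - specificity)  # false positives
    tn = (1 - prevalence) * specificity        # true negatives
    return {
        "apparent_prevalence": tp + fp,  # fraction of all tests that are positive
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# ASSUMPTION: hypothetical inputs, 5% true prevalence and 95% sensitivity.
low_spec = predictive_values(prevalence=0.05, sensitivity=0.95, specificity=0.95)
high_spec = predictive_values(prevalence=0.05, sensitivity=0.95, specificity=0.995)

for label, result in (("specificity 95.0%", low_spec), ("specificity 99.5%", high_spec)):
    print(label, {name: round(value, 3) for name, value in result.items()})
```

With these hypothetical inputs, a specificity of 95% makes the measured seroprevalence nearly double the true value and leaves only half of the positive results true positives, whereas a specificity of 99.5% keeps the measured value close to 5%. The negative predictive value stays above 99% in both cases, which illustrates why specificity matters more than sensitivity when prevalence is low.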
The magnitude and duration of humoral immune responses to SARS-CoV-2 infection have not been defined. Early evidence suggests that IgG antibodies and nAbs begin to decline in the first 3 months following onset of symptoms in patients with mild COVID-19, similar to seasonal coronaviruses [64, 95]. These observations could have important implications for point-prevalence studies. Future studies are needed to define the antibody dynamics and to determine whether detection of antibodies (and if so at what titers) confers protection against reinfection.
CONCLUSIONS
Clinicians and public health officials need to understand the performance of serologic assays used in their settings in order to accurately interpret anti–SARS-CoV-2 antibody test results. Whenever possible, serologic assays with established high sensitivity and specificity (ie, ≥99.5%) should be employed. IgG and total antibody tests appear to have better sensitivity and specificity than other immunoglobulin classes and perform best when used 3 to 4 weeks after symptom onset. The clinical indications for antibody testing to support a diagnosis of COVID-19 are limited at this time. Testing can be considered for the evaluation of patients with a high clinical suspicion for SARS-CoV-2 infection despite repeatedly negative NAATs (especially those presenting late after symptom onset) and should be included in the assessment of MIS-C. Serologic tests also have potential utility for tracking the course of the SARS-CoV-2 pandemic in the community. The effectiveness and durability of anti–SARS-CoV-2 antibody responses, however, have not yet been defined. Thus, serologic testing cannot be used to determine immune status.
Supplementary Data
Supplementary materials are available at Clinical Infectious Diseases online. Consisting of data provided by the authors to benefit the reader, the posted materials are not copyedited and are the sole responsibility of the authors, so questions or comments should be addressed to the corresponding author.
Notes
Author contributions. Panel members: K. E. H. (lead), A. M. C., C. A. A., J. A. E. (Pediatric Infectious Diseases Society representative), M. K. H., M. J. L., M. L. (Society for Healthcare Epidemiology of America representative), R. P. (American Society for Microbiology representative), and A. B. Methodologists: R. A. M. (lead), O. A., A. E. A., S. S., Y. F.-Y., V. L., R. L. M., and M. H. M.
Disclaimer. The contents of this guideline do not necessarily represent the policy of Centers for Disease Control and Prevention or the Department of Health and Human Services and should not be considered an endorsement by the federal government.
Financial support. This project was supported, in part, by a cooperative agreement with the Centers for Disease Control and Prevention (CDC; grant number 6 NU50CK000477-04-01). The CDC is an agency within the Department of Health and Human Services.
Potential conflicts of interest. The following list displays what has been reported to the Infectious Diseases Society of America (IDSA). To provide thorough transparency, the IDSA requires full disclosure of all relationships, regardless of relevancy to the guideline topic. Evaluation of such relationships as potential conflicts of interest is determined by a review process, which includes assessment by the Board of Directors liaison to the Standards and Practice Guideline Committee and, if necessary, the Conflicts of Interest (COI) and Ethics Committee. The assessment of disclosed relationships for possible COI is based on the relative weight of the financial relationship (ie, monetary amount) and the relevance of the relationship (ie, the degree to which an association might reasonably be interpreted by an independent observer as related to the topic or recommendation of consideration). The reader of these guidelines should be mindful of this when the list of disclosures is reviewed. K. E. H. serves as an advisor for BioFire and Quidel and receives research funding from the National Institutes of Health (NIH). A. M. C. serves as an advisor for Roche Diagnostics, Danaher, Quidel, First Light, Day Zero, Visby, and Chroma Code; receives research funding from ArcBio and Hologic; and has served as an advisor for Luminex. C. A. A. receives royalties from UpToDate and receives research funding from Merck, MeMed Diagnostics, Entasis Pharmaceuticals, and the National Institute of Allergy and Infectious Diseases/NIH. J. A. E. serves as a consultant for Sanofi Pasteur, an advisor/consultant for Meissa Vaccines, and receives research funding from the Centers for Disease Control and Prevention, Brotman Baty Research Institute, Merck, Novavax, GlaxoSmithKline, and AstraZeneca. M. L. serves as an advisor for Sanofi, Seqirus, Medicago, and Roche; has served as an advisor for Pfizer, Sunovion, and MD Brief; and receives research funding from the Canadian Institutes of Health Research and the Medical Research Council (United Kingdom). R. P. receives grants from Shionogi, CD Diagnostics, Merck, Hutchison Biofilm Medical Solutions, Accelerate Diagnostics, ContraFect, and TenNor; serves as a consultant for Curetis, Specific Technologies, Next Gen Diagnostics, Pathoquest, Selux Diagnostics, 1928 Diagnostics, PhAst, and Qvella; holds patents for B. pertussis/parapertussis polymerase chain reaction, device/method for sonification, and an anti-biofilm substance; receives research funding from the NIH, the National Science Foundation, and the US Department of Defense; and receives monies/reimbursement from the American Society for Microbiology, the IDSA, the National Board of Medical Examiners, UpToDate, and the Infectious Disease Board Review Course. Y. F.-Y. receives honoraria for evidence reviews and teaching from the Evidence Foundation, honoraria for evidence reviews for the American Gastroenterological Association, and serves as a Director for the Evidence Foundation and for the US GRADE Network. M. H. M. receives research funding from the Agency for Healthcare Research and Quality, the Endocrine Society, the Society for Vascular Surgery, and The American Society of Hematology and is a Board member for the Evidence Foundation. All other authors report no potential conflicts. All authors have submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest. Conflicts that the editors consider relevant to the content of the manuscript have been disclosed.


