Sensitivity and Specificity of Treponemal-specific Tests for the Diagnosis of Syphilis

Abstract We conducted a systematic review of relevant syphilis diagnostic literature to address the question, “What is the sensitivity and specificity of the treponemal tests currently approved by the Food and Drug Administration (FDA) for the diagnosis of syphilis (by stage)?” There were 16 treponemal assays evaluated: 13 immunoassays and 3 manual assays (fluorescent treponemal antibody absorbed test [FTA-ABS], microhemagglutination assay for Treponema pallidum antibodies [MHA-TP], Treponema pallidum particle agglutination assay [TP-PA]). MHA-TP and FTA-ABS were less sensitive in primary and secondary syphilis than TP-PA; TP-PA is the most specific manual treponemal assay. There is insufficient evidence to recommend one particular treponemal immunoassay (eg, enzyme immunoassays, chemiluminescence immunoassays, microbead immunoassays) over another based on published performance data. For diagnosis of neurosyphilis, cerebrospinal fluid (CSF) TP-PA has similar performance to CSF FTA-ABS in studies with patients with definitive or presumptive neurosyphilis. However, CSF treponemal testing has limitations in its sensitivity and specificity and should be interpreted within the context of the clinical scenario, additional CSF test results and syphilis prevalence.

Laboratory diagnosis of syphilis has traditionally involved an algorithm beginning with a nontreponemal test (eg, rapid plasma reagin [RPR]) followed by a manual Treponema pallidum-specific assay (eg, T. pallidum particle agglutination assay [TP-PA]) for confirmation of reactive nontreponemal serology. Treponemal-specific immunoassays are increasingly being used for syphilis screening and diagnosis, including enzyme immunoassays (EIAs), chemiluminescence immunoassays (CIAs), and microbead immunoassays (MBIAs), among others. These assays can be automated, reducing labor and turnaround time. Because some of these assays are relatively nonspecific, a reverse-sequence algorithm has been employed beginning with a treponemal immunoassay, followed by reflex nontreponemal testing (eg, RPR) on initially reactive specimens [1]. Currently, the Centers for Disease Control and Prevention (CDC) recommends conducting a TP-PA if there are discordant results between the immunoassay and RPR (eg, EIA-reactive, RPR-nonreactive) [1]. Regardless of which algorithm is used, for laboratories to select the most appropriate treponemal test(s) it is important to consider the sensitivity and specificity of these assays in clinically characterized sera, stratified by stage of syphilis.
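The order of testing in the reverse-sequence algorithm described above can be sketched in code. This is a minimal illustration of the sequence only (immunoassay, reflex RPR, TP-PA for discordant results); the function name and the returned labels are hypothetical, and the sketch does not encode any clinical interpretation of the final result.

```python
from typing import Optional


def reverse_sequence_next_step(
    immunoassay_reactive: bool,
    rpr_reactive: Optional[bool] = None,
    tppa_reactive: Optional[bool] = None,
) -> str:
    """Return a label for the next laboratory step (labels are hypothetical).

    None means the test has not been performed yet.
    """
    # Step 1: the treponemal immunoassay is the initial test.
    if not immunoassay_reactive:
        return "stop: immunoassay nonreactive, no reflex testing"
    # Step 2: initially reactive specimens reflex to a nontreponemal test.
    if rpr_reactive is None:
        return "reflex: perform nontreponemal test (eg, RPR)"
    if rpr_reactive:
        return "concordant: immunoassay and RPR both reactive"
    # Step 3: discordant results (immunoassay-reactive, RPR-nonreactive)
    # go on to a second treponemal test, the TP-PA.
    if tppa_reactive is None:
        return "discordant: perform TP-PA"
    return "TP-PA reactive" if tppa_reactive else "TP-PA nonreactive"


if __name__ == "__main__":
    print(reverse_sequence_next_step(True))
    print(reverse_sequence_next_step(True, rpr_reactive=False))
    print(reverse_sequence_next_step(True, rpr_reactive=False, tppa_reactive=True))
```

The traditional algorithm would reverse the first two steps, starting with the nontreponemal test and confirming reactive results with a treponemal assay.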
We conducted a systematic review of the literature on the test performance of treponemal-specific tests, and results of this review were presented to a national consultation of experts in November 2017. Our review was based on a single key question: What is the sensitivity and specificity of the treponemal tests currently approved by the Food and Drug Administration (FDA) for the diagnosis of syphilis (by stage)? The objective of this review was to inform the selection of the appropriate confirmatory treponemal test for laboratories using the traditional algorithm. These data will also assist laboratories in their selection of an initial treponemal test when the reverse-sequence algorithm is used for diagnosis of syphilis. Additionally, the data will facilitate selection of the appropriate second treponemal test for patients with initially discordant treponemal and nontreponemal serology (eg, CIA-reactive, RPR-nonreactive).

METHODS
We searched Medline, Embase, Scopus, Cochrane Library, and CINAHL from 1960 to 30 June 2017. Following the consultation in November 2017, we subsequently updated the literature search from July 2017 to September 2018 using the following search terms: (Treponema pallidum OR Neurosyphilis OR Syphilis) AND (serodiagnos* OR serodiagnos* OR (serolog* AND (test* OR exam* OR assay* OR screen* OR lab* OR diagnos* OR nontreponemal OR treponemal OR algorithm* OR antibody titer) OR serofast)). The search was limited to human studies published in English.
The initial search yielded n = 4851 nonduplicated abstracts. We excluded n = 4504 abstracts that were not relevant to the key question: studies of nontreponemal testing only, animal studies, direct detection studies, review articles, guidelines, letters to the editor, and other publications that were not primary research studies. We reviewed the remaining 347 abstracts and further excluded n = 230 studies that described obsolete tests only, tests not approved by the FDA, those that used a gold standard based exclusively on non-FDA-approved tests, studies of prevalence or laboratory technique only (no test performance), any duplicate publications, and abstracts without a full manuscript. After exclusions, 117 full papers were reviewed for potential inclusion, and 81 studies with either descriptive data on use of treponemal tests or actual test performance data were abstracted into Tables of Evidence (Supplementary Table). Studies with test performance data were prioritized according to their relevance to the key question (Supplementary Table). Studies of high relevance were those with clinically characterized specimens, stratified by stage of syphilis (with/without use of dark-field microscopy for diagnosis of primary syphilis), and included studies that utilized syphilis specimens from commercial or CDC serum banks. Studies of moderate relevance were those with clinically characterized specimens but no stratification by stage (all patients with syphilis analyzed together). Lower relevance studies were those that used a laboratory reference standard only (single or multiple tests) without clinical characterization, and also included studies where clinical characterization could not be assessed or was not performed uniformly across specimens. Studies of high and moderate relevance were abstracted into tables of test performance, and the range of sensitivity and specificity estimates from all studies was abstracted.
If only a single study was available for a particular assay, the proportions (n/N) and 95% confidence intervals were abstracted.
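For readers who want to reproduce a proportion (n/N) with a 95% confidence interval from a single study's counts, the following is a minimal sketch. It assumes the Wilson score interval; the review does not state which interval method the source studies used, so results may differ slightly from published values. The function names and the example counts are hypothetical.

```python
import math


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of specimens from patients with syphilis that test reactive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Proportion of specimens from patients without syphilis that test nonreactive."""
    return true_neg / (true_neg + false_pos)


def wilson_ci(successes: int, total: int, z: float = 1.959964) -> tuple:
    """Wilson score interval for a proportion (z = 1.959964 gives about 95%)."""
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need 0 <= successes <= total and total > 0")
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot at the extremes.
    return max(0.0, center - half), min(1.0, center + half)


if __name__ == "__main__":
    # Hypothetical counts: 8 reactive results among 10 primary syphilis cases.
    low, high = wilson_ci(8, 10)
    print(f"sensitivity = {sensitivity(8, 2):.1%}, 95% CI {low:.1%}-{high:.1%}")
```

With sample sizes as small as those reported for some assays (6-13 cases), intervals of this kind are wide, which is why single-study estimates are presented with their confidence intervals rather than as point estimates alone.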
Following presentation of the published test performance data at the national consultation, it was noted that many of the treponemal immunoassays had little or no data on test performance published in the peer-reviewed literature. Therefore, for the treponemal immunoassays, we obtained 510(k) Premarket Notification data submitted to the FDA and also abstracted these data into the Tables of Evidence.

RESULTS
A summary of characteristics of FDA-approved treponemal tests, including manufacturer, assay type, antigens, antibodies detected, and specimen type, is detailed in Table 1. Among the 16 treponemal assays reviewed, there were 13 immunoassays and 3 manual assays: fluorescent treponemal antibody absorbed test (FTA-ABS), microhemagglutination assay for Treponema pallidum antibodies (MHA-TP), and TP-PA. Ten treponemal tests had published data on sensitivity and specificity. Two immunoassays had performance data that were not stratified by stage of syphilis (Abbott Architect, Roche Elecsys).
Among the other 8 that had data stratified by stage of syphilis, 3 were manual treponemal assays and 5 were immunoassays (ADVIA Centaur, Bioplex 2200, Captia Syphilis-G, LIAISON, Trep-Sure). The performance characteristics for these 10 treponemal assays were summarized from 21 highly relevant studies and 11 moderately relevant studies in Tables 2 and 3.

Primary and Secondary Syphilis
Among the manual assays, MHA-TP was less sensitive for primary syphilis (45.9-88.6%) than FTA-ABS (78.  Table 2). Based on data from 2 studies that compared head-to-head test performance, FTA-ABS was less sensitive than TP-PA in both primary and secondary syphilis [11,13].
A study by Park et al [13] found similar sensitivity for the ADVIA Centaur, Bioplex 2200, LIAISON, and Trep-Sure in primary syphilis compared with TP-PA and FTA-ABS; however, Gratzer et al [35] found poorer sensitivity of Trep-Sure in primary syphilis (54.8%, 39.5-67.8%). The Captia Syphilis G was 82.3-100% sensitive for primary syphilis in 3 studies, but sample sizes were small (6-13 cases) [17,30,32]. Overall, based on limited studies with small sample sizes, the sensitivity of the immunoassays in primary syphilis was comparable to the manual treponemal assays.
For the 5 treponemal immunoassays that had data stratified by stage, all were 100% sensitive for secondary syphilis [13,25,34] (Table 3). Therefore, the sensitivity of the immunoassays was comparable to TP-PA and slightly higher than MHA-TP or FTA-ABS.
The immunoassays demonstrated specificity ranging from 94.5% to 100% (Table 3), with the exception of Trep-Sure, which was 82.6% specific in a single study [13].

FTA-ABS
Thirteen studies described CSF FTA-ABS test performance (not all studies included both sensitivity and specificity) and were summarized in a prior systematic review [47]. In 3 studies of patients with definitive neurosyphilis (reactive CSF Venereal Disease Research Laboratory [VDRL] test), the sensitivity of CSF FTA-ABS was 90.9-100% [48-50]. Among those with presumptive neurosyphilis where diagnosis was made based on reactive serology, other abnormal CSF indices, and clinical signs/symptoms, the sensitivity ranged widely (22.2-100%) [47]. A study by Luger et al [51] of 60 symptomatic patients defined neurosyphilis by comparing ratios of serum protein and CSF protein with a ratio of serum treponemal antibody and CSF treponemal antibody; in this study, the sensitivity of the CSF FTA-ABS was 100%. Another large study by Hooshmand et al [52] (n = 156) also found 100% sensitivity of CSF FTA-ABS, but a reactive CSF FTA-ABS was part of the case definition; thus, the sensitivity results cannot be interpreted.
The specificity of CSF FTA-ABS varied greatly depending on whether true negatives were patients without syphilis or patients with syphilis but not neurosyphilis. Six studies included patients without syphilis as true negatives, and the specificity of FTA-ABS was 100%; however, a study by Jaffe et al [53] found that CSF FTA-ABS was reactive in 5 of 15 patients with syphilis who had no other evidence of neurosyphilis. Eleven studies included patients with syphilis, but not neurosyphilis, and the specificity ranged from 55% to 100% [47].
For CSF TP-PA, 4 studies described test performance. A study by Castro et al [54] reported a sensitivity of 100% for the CSF TP-PA, but the clinical characterization of true positives could not be interpreted given the data provided. The other 3 studies reported a sensitivity of 75.6-95.0%, with the highest sensitivities when using reactive CSF VDRL as the criterion for true positivity [54-57].
Specificities ranged from 85.5% to 100% and were highest if a titer of 1:640 or greater was used to define neurosyphilis [57]. Based on these limited data, CSF TP-PA appears to have similar performance to CSF FTA-ABS in studies with a mixed population of patients with definitive/presumptive neurosyphilis.

DISCUSSION
Among the numerous treponemal assays currently approved by the FDA, comparison of performance characteristics was more robust for the manual assays because there were few studies of the immunoassays that included clinically characterized specimens, stratified by stage. Among the manual treponemal assays included in this review (ie, MHA-TP, FTA-ABS, TP-PA), MHA-TP demonstrated poorer sensitivity for all stages of syphilis. Between the FTA-ABS and TP-PA, the 2 studies that compared their performance found lower sensitivity for the FTA-ABS for primary and secondary syphilis [11,13]. Given the subjective nature of FTA-ABS interpretation, the lack of quality control for FTA-ABS reagents, and the need for microbiologist experience, FTA-ABS is not recommended for use. TP-PA is the recommended assay among the manual treponemal tests. Among the treponemal immunoassays, there were few published data on test performance stratified by stage, and sample sizes in the FDA 510(k) Premarket Notification data were small. There are insufficient data to distinguish differences in performance between treponemal immunoassays (eg, EIAs, CIAs, MBIAs) for laboratory diagnosis of syphilis. Of note, 2 studies found that Trep-Sure had poor sensitivity for primary syphilis [24] and significantly lower specificity than other immunoassays [13].
Several factors should be considered when interpreting these test performance data. Most studies were retrospective and used reactive serology as part of the inclusion criteria, which would bias sensitivity estimates towards 100%, particularly for primary syphilis. Studies utilizing previously banked specimens (both CDC and commercial serum banks) were included in this analysis, but the quality of staging/characterization of these specimens could not be assessed. For primary syphilis, it was unclear whether studies using banked specimens included dark-field-positive, seronegative cases or just cases with reactive nontreponemal and/or treponemal serology. For latent syphilis, most studies combined early and late latent into a single category defined as "combined latent syphilis" or used a 2-year cutoff for defining early versus late latent disease. Some studies included both previously treated cases and untreated (current) syphilis cases. When possible, our evaluation focused on untreated or current syphilis because the time between treatment and specimen collection was not described.
With regard to neurosyphilis, diagnostic criteria of the included studies were diverse and included various combinations of signs/symptoms with abnormal white blood cell count/protein and/or reactive CSF VDRL. As T. pallidum IgG can cross the intact blood-CSF barrier, reactive treponemal tests in the CSF are not specific for the diagnosis of neurosyphilis. Although the CSF TP-PA and CSF FTA-ABS demonstrated similar sensitivity and specificity, Harding et al found that a negative CSF treponemal test may not rule out neurosyphilis among patients with a high pretest probability (patients with syphilis and neurologic symptoms) [47]. Therefore, CSF treponemal tests have limitations with both sensitivity and specificity, and results need to be evaluated within the context of the clinical scenario, additional CSF testing (eg, VDRL, cell count, protein), and syphilis prevalence.
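The dependence of a negative result on pretest probability can be illustrated with Bayes' theorem. The sketch below is not part of the reviewed studies: the function name is hypothetical, the 50% pretest probability is an assumed value chosen only for illustration, and the sensitivity (75.6%) and specificity (85.5%) are the lower bounds of the ranges reported above for CSF TP-PA, which come from different studies and are combined here purely as an example.

```python
def post_test_probability_negative(
    pretest: float, sensitivity: float, specificity: float
) -> float:
    """Probability of disease given a negative (nonreactive) test result.

    P(D | neg) = (1 - sens) * p / ((1 - sens) * p + spec * (1 - p))
    """
    for value in (pretest, sensitivity, specificity):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all inputs must be probabilities in [0, 1]")
    false_negatives = (1.0 - sensitivity) * pretest
    true_negatives = specificity * (1.0 - pretest)
    total_negatives = false_negatives + true_negatives
    if total_negatives == 0.0:
        raise ValueError("a negative result is impossible with these inputs")
    return false_negatives / total_negatives


if __name__ == "__main__":
    # Assumed pretest probability of 50% (hypothetical high-risk patient).
    residual = post_test_probability_negative(0.50, 0.756, 0.855)
    print(f"probability of neurosyphilis after a negative test: {residual:.1%}")
```

Under these assumed inputs, a substantial residual probability of disease remains after a nonreactive result, which is consistent with the caution that a negative CSF treponemal test may not rule out neurosyphilis when pretest probability is high.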

Future Needs and Recommendations
1. Performance data are needed for the immunoassays using clinically characterized specimens, stratified by stage of syphilis. Studies should include sufficient numbers to stratify by HIV status so that performance among persons living with HIV can be assessed in the era of combined antiretroviral therapy. Many assays currently in use had no published data of this kind. This is particularly an issue with early primary and late latent disease.
2. Additional data are needed on the performance of treponemal tests in latent syphilis based on the CDC case definitions for latent syphilis [1].
3. Additional data are needed on the comparative performance of assays for diagnosis of neurosyphilis: CSF FTA-ABS, CSF TP-PA, and treponemal CIA/EIA in CSF.
4. Performance data are needed for the immunoassays (in serum) among patients with neurosyphilis.
5. There is a need to define serologic windows using modern treponemal and nontreponemal tests.
Some of these research and programmatic goals could be facilitated by the creation or resurrection of the CDC Syphilis Serum Bank with validated specimens, characterized by stage using standardized criteria, including seronegative, dark-field-positive primary syphilis specimens. The bank should also include specimens from patients without syphilis for specificity evaluations.
Other facilitators would include harmonization of criteria for evaluating performance of treponemal and nontreponemal tests, in particular characterization of true/false positives. Several newer immunoassays (eg, Elecsys Syphilis, Architect Syphilis TP, Lumipulse G TP-N) achieved this through a consensus of testing with a predicate immunoassay, plus RPR, plus TP-PA, where a specimen reactive on any 2 of the 3 tests was considered a true positive. More data are needed to determine whether this approach should become the common reference standard or predicate against which new immunoassays should be measured.
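The 2-of-3 consensus rule described above can be expressed as a short sketch. The function names and the classification labels are hypothetical and are used only to show how a new assay's result would be scored against the consensus of the predicate immunoassay, RPR, and TP-PA.

```python
def consensus_true_positive(
    predicate_reactive: bool, rpr_reactive: bool, tppa_reactive: bool
) -> bool:
    """Specimen counts as a true positive if at least 2 of the 3 tests are reactive."""
    reactive_count = sum([predicate_reactive, rpr_reactive, tppa_reactive])
    return reactive_count >= 2


def classify_against_consensus(
    new_assay_reactive: bool,
    predicate_reactive: bool,
    rpr_reactive: bool,
    tppa_reactive: bool,
) -> str:
    """Score a new assay result against the 2-of-3 consensus (labels are hypothetical)."""
    positive = consensus_true_positive(predicate_reactive, rpr_reactive, tppa_reactive)
    if new_assay_reactive:
        return "true positive" if positive else "false positive"
    return "false negative" if positive else "true negative"


if __name__ == "__main__":
    # Predicate reactive, RPR nonreactive, TP-PA reactive: consensus positive.
    print(classify_against_consensus(True, True, False, True))
    # Only the predicate is reactive: consensus negative.
    print(classify_against_consensus(True, True, False, False))
```

Because two of the three reference tests are treponemal, a specimen that is reactive on both the predicate immunoassay and TP-PA is scored as a true positive regardless of the RPR result.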