The costs of scaling up HIV and syphilis testing in low- and middle-income countries: a systematic review

Abstract
Around two-thirds of all new HIV infections and 90% of syphilis cases occur in low- and middle-income countries (LMICs). Testing is a key strategy for the prevention and treatment of HIV and syphilis. Decision-makers in LMICs face considerable uncertainties about the costs of scaling up HIV and syphilis testing. This paper synthesizes economic evidence on the costs of scaling up HIV and syphilis testing interventions in LMICs and evidence on how costs change with the scale of delivery. We systematically searched multiple databases (Medline, Econlit, Embase, EMCARE, CINAHL, Global Health and the NHS Economic Evaluation Database) for peer-reviewed studies examining the costs of scaling up HIV and syphilis testing in LMICs. Thirty-five eligible studies were identified from 4869 unique citations. Most studies were conducted in Sub-Saharan Africa (N = 17) and most explored the costs of rapid HIV testing in facilities targeting the general population (N = 19). Only two studies focused on syphilis testing. Seventeen studies were cost analyses, 17 were cost-effectiveness analyses and 1 was a cost–benefit analysis of HIV or syphilis testing. Most studies took a modelling approach (N = 25) and assumed costs increased linearly with scale. Ten studies examined cost efficiencies associated with scale, most reporting short-run economies of scale. Important drivers of the costs of scaling up included testing uptake and the price of test kits. The ‘true’ cost of scaling up testing is likely to be masked by the use of short-term decision frameworks, linear unit-cost projections (i.e. multiplying an average cost by a factor reflecting activity at a larger scale) and assumptions about the availability of health system capacity and infrastructure to supervise and support scale up. Cost data need to be routinely collected alongside other monitoring indicators as HIV and syphilis testing continues to be scaled up in LMICs.


Introduction
HIV and syphilis infections are major public health problems worldwide (Hook, 2017). Untreated HIV and syphilis can lead to mother-to-child transmission (MTCT), which in turn leads to adverse maternal and neonatal outcomes (2018). The global health community, led by the WHO, has identified the dual elimination of MTCT of HIV and syphilis as a public health priority (World Health Organization, 2017a).
Epidemiologically, syphilis has been closely associated with HIV. Both infections have a common route of transmission and syphilitic genital ulcers provide a portal for HIV acquisition (Hook, 2017). Research has long shown that early access to treatment can reduce HIV- and syphilis-related deaths and prevent transmission (Sanders et al., 2005; Bert et al., 2018). However, these benefits are only likely to occur if individuals know their HIV/syphilis status. To facilitate early diagnosis, low-cost, easy-to-use and highly sensitive and specific HIV and syphilis rapid tests are available (Storey et al., 2019). By the end of 2017, only around 75% of people living with HIV knew their status, and of those diagnosed positive, only 79% received treatment globally (Joint United Nations Programme on HIV/AIDS, 2018). The coverage of syphilis testing lags behind HIV testing in many countries, especially among pregnant women and other high-risk groups such as female sex workers (FSWs) and men who have sex with men (MSM) (Kamb et al., 2010; UNICEF East Asia and Pacific Regional Office, 2016). It has been estimated that, in 2017, only 56% of pregnant women were tested for syphilis in Africa and 31% in South-East Asia (World Health Organization, 2018). In 2019, around 95% and 57% of pregnant women were tested and treated for HIV in Eastern and Southern Africa and East Asia and Pacific, respectively (UNICEF, 2020). Low testing uptake has led to renewed commitments by national governments and international agencies to scale up HIV and syphilis 'test and treat' strategies (Kamb et al., 2010; UNICEF East Asia and Pacific Regional Office, 2016).
Scaling up, broadly defined as the 'deliberate effort to increase the impact of successfully tested health innovations, so as to benefit more people' (Simmons and Shiffman, 2007), has a direct impact on costs. Scaling up decentralized tests outside laboratories involves multiple systems beyond the test itself, such as training, quality management, stock management and reporting, and these need to be considered when calculating costs (Johns and Torres, 2005; Mabey et al., 2012). Major divergences have been reported between the costs of HIV and syphilis testing observed in pilot or small-scale studies relative to national roll-out (Johns, 2015; Shelley et al., 2015; Bautista-Arredondo et al., 2018a). Empirical studies have found that this has resulted from the exclusion or inaccurate measurement of the unit costs of transport (especially in rural and remote areas), procurement and re-supply of test kits, recruiting and training new staff and strengthening the health system infrastructure required for quality control and maintenance (Johns and Torres, 2005; Mikkelsen et al., 2017). Economies of scale also lead to changes in the cost structure of programmes. Theories of economies of scale suggest that the cost per unit decreases as more units are produced (Getzen, 2014). This could be the result of fixed costs being spread over more units of output, which reduces per-unit costs, or of higher volumes that permit greater specialization of staff (e.g. health facilities in less densely populated areas may not have enough patient volume for a full-time testing counsellor). Variable inputs such as test kits and supplies vary directly with the level of output, whereas the amount of fixed inputs cannot be changed in what economists refer to as the 'short run'. In the short run, the limits of these fixed inputs, whether they are buildings, testing equipment or staff, imply that further scale up is not possible or will create inefficiencies. 
In the long run, the fixed inputs are flexible and can be changed to adapt to the different levels of activity, resulting in a non-linear relationship between scale and cost and requiring analysis to distinguish between the short and long runs (Vita, 1990). These relationships and the critical inputs to production will also be different when scaling up services at a clinic to reach more individuals and when scaling up from a single clinic to a national programme (Gomez et al., 2020).
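The short-run logic described above can be written compactly. The following is only a sketch under simplifying assumptions: a single output (people tested), a constant variable cost per test, and symbols ($F$, $v$, $q$, $\bar{q}$) that are introduced here for illustration and are not taken from the cited studies. With $F$ the fixed cost, $v$ the variable cost per test, $q$ the number of people tested and $\bar{q}$ the capacity of the fixed inputs, the short-run average cost is:

```latex
\[
AC(q) \;=\; \frac{F}{q} + v, \qquad 0 < q \le \bar{q},
\]
\[
\frac{\mathrm{d}AC}{\mathrm{d}q} \;=\; -\frac{F}{q^{2}} \;<\; 0 .
\]
```

Under these assumptions average cost falls as volume rises, but only up to $\bar{q}$. Beyond that point the fixed inputs themselves have to change, which is the long-run case in which the relationship between scale and cost need not follow this simple form.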
Interest in measuring economies of scale in HIV and syphilis testing has risen with the development of rapid point-of-care tests (Mabey et al., 2012; Storey et al., 2019). Unit costs of existing automated laboratory testing at a central laboratory may benefit from economies of scale through the distribution of fixed costs, but these costs do not take into account the need for multiple clinic visits, patient costs of accessing testing services and the failure of results to reach patients due to the slow turnaround associated with these tests. Rapid testing at or near the point of care has the potential for savings to the patient through the quick delivery of results and to the provider through a reduction in the number of visits and loss to follow-up (Mabey et al., 2012; Fleming et al., 2017). Levels of testing may also be lower in less densely populated rural areas while the costs of delivery (e.g. cost of the test or transport) can be higher (Shelley et al., 2015).
Currently, decision-makers in LMICs lack a consolidated evidence base from which to understand the cost implications of scaling up HIV and syphilis testing. Three reviews of economic studies evaluating the costs of scaling up health interventions in LMICs (Kumaranayake et al., 2001; Johns and Torres, 2005) and HIV interventions specifically (Kumaranayake, 2008) have been published, each over a decade old. Each review found evidence of economies of scale, recommending that future work take scale and other cost drivers into account when estimating the costs of scaling up (Kumaranayake et al., 2001; Johns and Torres, 2005; Kumaranayake, 2008). These reviews also identified best practice recommendations for measuring economies of scale and the costs of scaling up public health interventions in LMICs, including: ensuring sufficient and representative sample sizes to capture differences in cost characteristics across sites; distinguishing between and measuring both fixed and variable costs; and using appropriate analytical methods, e.g. econometric estimation. None of the reviews captures the recent proliferation of studies assessing the impact of rapid tests for HIV and syphilis. In this paper, we systematically review the current evidence on the costs of scaling up HIV and syphilis testing in LMICs, with a focus on key findings, quality and methodological issues.

Key Messages
• Scale is an important driver in determining the costs of HIV and syphilis testing programmes in resource-constrained health systems.
• Common methodological assumptions, including a short-run framework, linear unit cost projections and the availability of health system capacity and infrastructure to supervise expanded delivery, currently mask the 'true' costs of scaling up HIV and syphilis testing in low- and middle-income countries.
• Financing and budgeting for the scale up of HIV and syphilis testing would benefit from the monitoring of costs across delivery sites over multiple timepoints.

Search strategy
The review team defined the search terms according to four domains based on the aims of this review: infectious disease; screening and testing; economics; and LMICs. Despite the frequent use of the term 'scaling up' in international health, in practice it has been interpreted in many different ways (Mangham and Hanson, 2010). Search terms relating to 'scaling up' were therefore not included in the initial search; instead, an extensive manual screening was conducted to filter articles according to the eligibility criteria listed below. The search terms were applied in six databases: Medline via Ovid; Econlit via Proquest; Embase via Ovid; EMCARE via EBSCO; CINAHL via EBSCO; and Global Health via Ovid. The search strategy used medical subject headings for Medline and comparable terms for the other databases (complete Medline search terms can be found in Supplementary Appendix S1). The NHS Economic Evaluation Database, a bibliographic record of published health technology assessments, was also searched using the same search strategy. A librarian from the authors' institute was consulted on the search strategy, including the selection of domains and search terms. The systematic review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (Moher et al., 2009) and is registered on the International Prospective Register of Systematic Reviews (PROSPERO), identification number CRD42018103890.

Eligibility criteria
Studies were eligible for this review if they: (i) measured the costs of scaling up HIV and/or syphilis testing using empirical data, modelling or a hybrid of these approaches; (ii) focused on HIV and/or syphilis testing to identify new infections; (iii) were conducted in an LMIC, defined according to the World Bank Country and Lending Groups classification in 2019 (World Bank, 2019); and (iv) were full research papers (reviews, editorials, letters and conference papers were excluded). No date or language restrictions were applied.

Study selection
The search results were imported into an Endnote library and were independently screened by two reviewers based on their titles and abstracts. Both reviewers screened the first 10% of articles (N = 487) against the eligibility criteria to determine the inter-rater reliability of the reviewers. Agreement was assessed using a simple kappa analysis (McHugh, 2012). Near-perfect agreement was achieved (kappa score of 0.96), and the screening subsequently continued with a single reviewer (RA) (McHugh, 2012; Wiseman et al., 2016). Two reviewers read the full text of the selected studies and any disagreements regarding eligibility were resolved by consulting a third reviewer. Reasons for exclusion were recorded. Scopus-Elsevier was used to track the reference lists in the final papers to identify any additional relevant studies.
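For readers unfamiliar with the kappa statistic used in the screening check, the sketch below shows how Cohen's kappa can be computed from two reviewers' include/exclude decisions. It is an illustration only: the function name `cohens_kappa` and the ten screening decisions are hypothetical and are not the review's data or software.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical decisions on the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items on which the two raters agree.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per category.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / n**2
    if p_expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical screening decisions for 10 citations (illustrative only).
reviewer_1 = ["include", "exclude", "exclude", "exclude", "include",
              "exclude", "exclude", "exclude", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude", "exclude",
              "exclude", "exclude", "exclude", "exclude", "exclude"]

kappa = cohens_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.3f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which matters in screening because most citations are excluded by both reviewers.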

Data extraction and appraisal
The variables used to describe the different studies in this review are shown in Table 1. All data were extracted by two reviewers and all differences were again resolved by a third reviewer. A narrative synthesis describing the included studies and their conclusions was considered to be the most appropriate approach to synthesizing the findings of the studies (Ryan and Cochrane Consumer and Communication Review Group, 2013). The synthesis from data extraction is presented according to the characteristics of the included studies, the type of cost studies undertaken, the methods used to assess cost and the impact of scale on costs.
The reporting and methodological quality of studies estimating cost at scale were assessed using a checklist designed for this review (Table 2). The checklist was developed in consultation with global experts in the development of existing checklists for costing exercises and draws on: a checklist for good practice in the economic evaluation of health interventions (Husereau et al., 2013); checklists for appraising priority setting studies in the health sector (Peacock et al., 2007; Wiseman et al., 2016); the Global Health Cost Consortium guiding principles (Vassall et al., 2017); and best practice guidelines for calculating the costs of scaling up health interventions (Johns and Baltussen, 2004; Johns and Torres, 2005; Kumaranayake, 2008). Three reviewers independently appraised the studies against this checklist, and any disagreements were resolved by consulting two additional reviewers.

Results
We identified 35 eligible studies from 4869 unique citations as shown in the PRISMA flow diagram (see Figure 1). Table 3 summarizes the characteristics of the 35 included studies. The majority of studies were conducted in Sub-Saharan Africa (N = 17), published after 2010 (N = 24) and focused on HIV testing (N = 33). Only two studies examined the costs of scaling up syphilis testing (Schackman et al., 2007; Shelley et al., 2015). Most of the HIV studies involved facility-based testing using rapid tests in the general adult population (N = 19) or among pregnant women (N = 6), while both syphilis studies focused on pregnant women (N = 2). Only three studies explored the use of HIV self-testing in the community and general population (Cambiano et al., 2015; McCreesh et al., 2017; Mangenah et al., 2019). Four studies evaluated community-based testing (Tromp et al., 2013; McCreesh et al., 2017; Cherutich et al., 2018; Mangenah et al., 2019), one evaluated home-based testing (Sharma et al., 2016), one mobile testing (Verstraaten et al., 2017) and one testing in a prison (Nelwan et al., 2016). 'Scale up' was most commonly defined in terms of an increase in population coverage (N = 32), with the remaining studies defining it as an increase in the number of test kits distributed (N = 2) or an expansion of a geographical catchment area (N = 1).

Type of cost studies undertaken
Around half of the studies undertook a cost analysis (N = 17) and another half conducted a cost-effectiveness analysis (CEA) (N = 17), typically comparing facility-based HIV testing using rapid tests with laboratory-based testing, with two of those complementing their CEAs with budget impact analysis (BIA) (Cherutich et al., 2018; Luong Nguyen et al., 2018). One cost-benefit analysis of HIV testing in pregnant women was reported (Kumar et al., 2006). Around half of the studies relied on primary data (N = 18) and the remaining half used a combination of published evidence and expert opinion (N = 17). Most studies took a modelling approach to the estimation of the costs of scaling up testing for HIV and/or syphilis (N = 25).

Table 1 (partial). Variables used to describe the included studies
Variable | Description
 | Service through which the intervention is delivered (e.g. health centre, hospital) and sector (e.g. public/private)
Time horizon | The duration over which costs and/or consequences are calculated
Study design | Randomized controlled trial, cross-sectional, cohort, case-control, modelling
Type of economic analysis (and ratio if applicable) | Cost analysis, cost-effectiveness, cost-utility or cost-benefit analysis. Includes ratio used (e.g. cost per DALY averted)
Data source(s) | Primary data collection, expert/stakeholder opinion, published data or literature or combination of those
Analytical approach to measure costs at scale | Econometric, empirical, modelling or a hybrid of these approaches (Kumaranayake, 2008)
Costs of scaling up | 

Methods to assess cost according to scale
Of the 35 eligible studies, 10 measured how costs varied with scale (see Table 4). All other studies used a constant average cost assumption (i.e. no adjustment for scale) to estimate the resource requirements of scaling up, implicitly assuming that costs were invariant to scale and other potential cost drivers. Of the 10 studies measuring the impact of scale on costs, six conducted an econometric analysis, estimating the relationship between cost and cost determinants using regression analysis, and the remaining four were empirical studies (Forsythe, 2002; McConnel et al., 2005;

Impact of scale on costs
The six econometric studies found that the average cost per person tested (most commonly for HIV using a rapid test) decreased as scale increased, demonstrating that economies of scale were possible (i.e. confirmed through a negative coefficient on the scale variable), with the coefficient of scale ranging from 0.18 to 0.83 (Dandona et al., 2005; McConnel et al., 2005; Galárraga et al., 2017; Mwenge et al., 2017; Bautista-Arredondo et al., 2018b; Mangenah et al., 2019). The main driver of these economies of scale is the distribution of fixed costs over a larger number of patients or outputs. The remaining four studies took an empirical approach whereby the relationship between scale and cost was based on observations of the actual cost at different levels of scale (Forsythe, 2002; McConnel et al., 2005; Dandona et al., 2008a,b; Shelley et al., 2015). Three of the four studies showed that the costs of screening per person decreased as scale increased (Forsythe, 2002; McConnel et al., 2005; Dandona et al., 2008a,b). One empirical study reported an increase in the average cost per client tested as the intervention was rolled out to new sites, which was attributed to several factors including higher prices for rapid point-of-care syphilis tests and lower rapid syphilis testing (RST) uptake in the targeted population (Shelley et al., 2015).
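To illustrate how an econometric analysis can relate average cost to scale, the sketch below fits a log-log regression of cost per person tested on the number of people tested per site. This is one common specification and is an assumption here; the reviewed studies used varying models and covariates, and the site-level figures below are hypothetical. A negative slope indicates that average cost falls as volume rises.

```python
import math


def log_log_slope(volumes, average_costs):
    """Ordinary least squares slope of ln(average cost) on ln(volume)."""
    if len(volumes) != len(average_costs) or len(volumes) < 2:
        raise ValueError("need at least two paired observations")
    x = [math.log(v) for v in volumes]
    y = [math.log(c) for c in average_costs]
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("volumes must vary across sites")
    return sxy / sxx


# Hypothetical site-level data (illustrative only, not from the reviewed studies).
tests_per_site = [100, 400, 1600, 6400]
cost_per_test_usd = [5.0, 2.5, 1.25, 0.625]

elasticity = log_log_slope(tests_per_site, cost_per_test_usd)
print(f"elasticity of average cost with respect to scale: {elasticity:.2f}")
```

In this specification the slope is read as an elasticity: the approximate percentage change in average cost associated with a 1% increase in the number of people tested.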

Fixed vs variable costs
All studies in this review were based on a short-run decision framework, during which the amount of fixed inputs could not be easily varied. The costs of scaling up HIV and syphilis testing tended to be narrowly defined as the cost of the test, personnel and associated consumables such as gloves and cotton swabs (N = 18). Only 10 of the 35 included studies separated the fixed and variable components of costs. A small number of studies (N = 8) attempted to include more cost items such as the costs of educational materials, monitoring and supervision or waste management. Few studies (N = 4) considered the costs of managing the scaling up process, including the costs of quality management and investment in procuring new equipment. In addition to scale, other key drivers of cost included availability of transport infrastructure, variation in the price of local goods (e.g. test kits, medicines, fuel), costs and frequency of supervisory trips and the recruitment and training of health personnel, especially in remote areas.
Table 5 summarizes the results of the appraisal. Most studies clearly reported the research question(s), perspective taken and the time horizon for the analysis (N = 29). No study justified its sample size for the costing and around half undertook a sensitivity analysis for major cost inputs (N = 19). There were widespread gaps in the methodological quality of estimating costs at scale. Specifically, fewer than half of the studies in this review estimated average costs at different levels of scale (N = 6), measured the relationship between average costs and scale (N = 9) or separated fixed and variable costs (N = 10), all of which are necessary to accurately measure economies or diseconomies of scale (Kumaranayake, 2008).

Discussion
To expand access to HIV and syphilis testing and reach elimination targets, successful small-scale programmes need to cover broader populations in LMICs. The availability of reliable and detailed information on the resources required to do this is a key determinant of success (Johns and Torres, 2005; Kumaranayake, 2008; World Health Organization and ExpandNet, 2009; Mangham and Hanson, 2010). This review confirms that scale is an important driver in determining the costs of HIV and syphilis testing programmes in resource-constrained health systems. It also reveals the potential for economies of scale (i.e. a reduction in average costs as the number of people tested increases), at least in the short run when structural changes to health systems (e.g. training, quality management and stock management) necessary for the large-scale delivery of testing have not yet been undertaken. Despite syphilis testing being widely recommended for use in LMICs, particularly in pregnancy (Newman et al., 2013; Storey et al., 2019), only two studies explored the costs of scaling up syphilis testing. Recent pilot studies have demonstrated the cost-effectiveness of HIV and syphilis screening using a dual rapid test over single HIV and syphilis tests or HIV testing alone (Bristow et al., 2016). It has been argued that dual tests for HIV and syphilis may potentially contribute to economies of scale and scope (associated with the sharing of fixed costs across activities) in terms of start-up, training, quality management, supervision and monitoring, while also serving to promote syphilis testing, which is lagging well behind HIV testing in many LMICs (Bristow et al., 2016; Taylor et al., 2017). While many studies have found community-based strategies to be effective in increasing uptake of testing (Asiimwe et al., 2017; Ahmed et al., 2018), this review revealed a lack of attention paid to the economic impact of scaling up these strategies. 
Many key populations have expressed a strong preference for community-based testing, which is seen as less stigmatizing and more accessible (Suthar et al., 2013). Despite the growing importance of the private sector in the delivery of HIV and syphilis care in LMICs (Rao et al., 2011; Wang et al., 2017), only one study exploring the costs of scaling up the delivery of testing through private or non-government providers (Verstraaten et al., 2017) was identified. Further research on the costs of scaling up syphilis testing and on modelling community-based HIV and syphilis testing in public and private sectors is needed. Our review highlighted that while the overall standard of reporting costs was reasonable, with most studies partially or fully addressing seven or more of the ten questions, shortcomings manifested in two key areas. First, none of the studies in this review justified their sample size for the costing. Economic evaluations often require larger sample sizes for adequate power compared to a typical clinical study evaluating the health impact of a testing intervention (Taylor et al., 2017). However, clinical studies and economic evaluations generally investigate differences in outcomes at the individual level. Cost drivers, such as scale, are better identified through multisite cost studies with the production unit or clinic as the unit of analysis (Tagar et al., 2014). A representative sample should capture differences in clinic characteristics such as geographical setting, ownership and management systems. In practice, however, the sample size for a full or partial economic evaluation is commonly determined by the trial sample size or by convenience sampling in a limited number of sites. 
Second, given the considerable uncertainties around the costs of going to scale, including the varying cost of procuring and delivering test kits (Johns and Torres, 2005; Shelley et al., 2015), it was surprising that almost half of the studies failed to explore the influence of different key inputs on unit costs or total programme costs by conducting sensitivity analysis. For example, volume purchasing, which is being explored in many LMICs, is likely to impact the cost and uptake of rapid diagnostic tests (Taylor et al., 2017) and is worth closer consideration.
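A one-way sensitivity analysis of the kind referred to above can be sketched as follows. The cost structure, input names and values are hypothetical and chosen only for illustration; a real analysis would use the inputs and plausible ranges from the costing study itself. Each input is varied by plus or minus 20% while the others are held at their base values, and the resulting unit cost is recorded.

```python
def unit_cost(inputs):
    """Cost per person tested: kit + staff time + share of fixed cost (assumed structure)."""
    return (inputs["kit_price"]
            + inputs["staff_cost_per_test"]
            + inputs["fixed_cost"] / inputs["tests"])


def one_way_sensitivity(base, change=0.20):
    """Vary each input by +/- `change`, one at a time, and return the unit cost at each end."""
    results = {}
    for name, value in base.items():
        low = dict(base, **{name: value * (1 - change)})
        high = dict(base, **{name: value * (1 + change)})
        results[name] = (unit_cost(low), unit_cost(high))
    return results


# Hypothetical base-case inputs in USD (illustrative only).
base_case = {
    "kit_price": 1.5,
    "staff_cost_per_test": 2.0,
    "fixed_cost": 3000.0,
    "tests": 1000,
}

base_unit_cost = unit_cost(base_case)
sensitivity = one_way_sensitivity(base_case)
for name, (at_low, at_high) in sensitivity.items():
    print(f"{name}: {at_low:.2f} to {at_high:.2f} (base {base_unit_cost:.2f})")
```

Ranking inputs by the width of the resulting range shows which assumptions, such as test kit price or testing volume, most influence the unit cost.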
From a methodological viewpoint, the appraisal revealed that only eight studies adjusted for changes in unit costs as HIV and syphilis testing was scaled up (McConnel et al., 2005; Dandona et al., 2008a,b; Shelley et al., 2015; Galárraga et al., 2017; Mwenge et al., 2017; Bautista-Arredondo et al., 2018a; Mangenah et al., 2019). Scaling up HIV and syphilis testing programmes typically involves transporting supplies over longer distances and to more remote areas than in pilot programmes, which can lead to variations in the prices of consumables such as testing kits. The impact on programme outcomes must also be considered alongside any changes in costs. Sweeney et al. (2014) have argued that, while devolving supervision and monitoring of RST in Tanzania to authorities at the subnational level may lead to reductions in the frequency and cost of external quality assurance, this may pose challenges for quality maintenance. Our appraisal also revealed that, although the majority of studies (22 out of 35) clearly stated their perspective, only three of these adopted a societal perspective. This means that a significant proportion of the direct and indirect costs incurred by patients in accessing tests is not considered. Moreover, while most studies included a range of recurrent costs (e.g. staff time, training, testing commodities and other medical supplies) and capital costs (e.g. vehicles and computers) and were focused at the level of a clinic or health centre, few acknowledged the required investments in infrastructure (e.g. quality management, reporting system) and broader health systems strengthening needed as HIV and syphilis testing programmes are scaled up to the national level across all clinics and facilities (Kumaranayake et al., 2001). 
These 'higher level' investments refer 'to the policy, political, legal, regulatory, budgetary or other health systems changes needed to institutionalize new innovations at the national or sub-national level' (World Health Organization and ExpandNet, 2010).

Table notes:
a Categories for key drivers are summarized as geography and infrastructure, fixed costs, personnel, managing the process of scaling up and others, as discussed by Johns and Torres (2005).
b Coefficient of scale is a measure of association between average cost and level of scale.

Another area for methodological improvement relates to the quantification of the relationship between average costs and the scale of delivering HIV or syphilis testing. Most studies undertook a simple form of modelling whereby costs were scaled up linearly. For these studies, an empirical average cost associated with a testing programme was multiplied by a factor representing activity at a larger scale. For example, if the unit cost per person tested for HIV and syphilis is United States Dollar (USD) 10 for 100 people, then expanding coverage by another 50 people would be projected to cost USD 500 (Kumaranayake, 2008). Only 10 of the 35 studies developed nonlinear cost functions, allowing some costs to be fixed regardless of the size of the population reached (e.g. medical equipment, vehicles or buildings), leading to economies of scale. In one study on the roll out of syphilis rapid testing in Zambia, lower clinic catchment populations combined with higher unit costs for transport, supervision and test kits reduced the economies of scale that had been achieved in the high-coverage pilot sites (Shelley et al., 2015). This resonates with economic theory: as output increases, average costs will first fall and then rise, following a 'u'-shaped curve (Guinness, Kumaranayake and Hanson, 2007; Kumaranayake, 2008). This small subset of studies represents an important step forward in providing a more accurate estimation of the costs of scaling up HIV and syphilis testing interventions in LMICs, painting a more complex picture of the relationship between scale and costs.
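The difference between a linear projection and a cost function with a fixed component can be made concrete with the USD 10 example above. The sketch below is illustrative only: the split of the USD 1000 total into USD 600 of fixed cost and USD 4 of variable cost per person is a hypothetical assumption, as is the premise that the existing fixed inputs have spare capacity for the additional 50 people.

```python
def linear_projection(average_cost, extra_people):
    """Linear scale-up: multiply the observed average cost by the extra activity."""
    return average_cost * extra_people


def total_cost(fixed_cost, variable_cost_per_person, people):
    """Total cost with a fixed component that does not change with volume."""
    return fixed_cost + variable_cost_per_person * people


# Observed programme from the example in the text: USD 10 per person for 100 people.
average_cost_usd = 10.0
current_people = 100
extra_people = 50

# Hypothetical split of the USD 1000 total (assumption for illustration):
# USD 600 fixed + USD 4 variable per person, i.e. 600 + 4 * 100 = 1000.
fixed_cost_usd = 600.0
variable_cost_usd = 4.0

linear_estimate = linear_projection(average_cost_usd, extra_people)
incremental_estimate = (
    total_cost(fixed_cost_usd, variable_cost_usd, current_people + extra_people)
    - total_cost(fixed_cost_usd, variable_cost_usd, current_people)
)
new_average_cost = (
    total_cost(fixed_cost_usd, variable_cost_usd, current_people + extra_people)
    / (current_people + extra_people)
)

print(f"linear projection of extra cost: USD {linear_estimate:.0f}")
print(f"fixed-plus-variable projection:  USD {incremental_estimate:.0f}")
print(f"average cost after expansion:    USD {new_average_cost:.2f}")
```

Under these assumptions the linear projection overstates the incremental cost because it implicitly scales the fixed cost with volume. The opposite error arises if expansion requires new fixed inputs or reaches higher-cost sites, as in the Zambian roll out described above.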
There are some limitations of this review that need to be acknowledged. Studies were restricted to those found in the published literature, a potential source of reporting bias. In addition, the studies identified varied by analytical approach (empirical, econometric or model), testing intervention and types of costs measured. This diversity prevented the pooling of results for a meta-analysis. Instead, studies were qualitatively reviewed and their results and characteristics tabulated, which helped to highlight evidence gaps and methodological weaknesses.
In summary, this review highlights evidence of the relationship between the costs and scale of delivering HIV and syphilis testing in LMICs. What is less clear is how costs change with scale and, in turn, the potential for economies of scale. Scaling up costs linearly, an assumption that underpins most studies in this review, runs the risk of misleading policymakers as to the true costs of providing universal access to HIV and syphilis testing. Collecting empirical cost data alongside the roll-out of HIV and syphilis testing is a priority. Financing and budgeting for the scale up of HIV and syphilis testing will benefit from the monitoring of costs across different sites, contexts and settings as well as over time, from greater consistency in the categorization of fixed and variable costs, and from the inclusion of costs associated with strengthening health systems to support quality assurance and stock management systems.

Supplementary data
Supplementary data are available at Health Policy and Planning online.